Found 473 repositories(showing 30)
Aryia-Behroziuan
An ANN is a model based on a collection of connected units or nodes called "artificial neurons", which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit information, a "signal", from one artificial neuron to another. An artificial neuron that receives a signal can process it and then signal additional artificial neurons connected to it. In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs. The connections between artificial neurons are called "edges". Artificial neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Artificial neurons may have a threshold such that the signal is only sent if the aggregate signal crosses that threshold. Typically, artificial neurons are aggregated into layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times. The original goal of the ANN approach was to solve problems in the same way that a human brain would. However, over time, attention moved to performing specific tasks, leading to deviations from biology. Artificial neural networks have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games and medical diagnosis. Deep learning consists of multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Some successful applications of deep learning are computer vision and speech recognition.[68] Decision trees Main article: Decision tree learning Decision tree learning uses a decision tree as a predictive model to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modeling approaches used in statistics, data mining, and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data, but the resulting classification tree can be an input for decision making. Support vector machines Main article: Support vector machines Support vector machines (SVMs), also known as support vector networks, are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.[69] An SVM training algorithm is a non-probabilistic, binary, linear classifier, although methods such as Platt scaling exist to use SVM in a probabilistic classification setting. In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. Illustration of linear regression on a data set. Regression analysis Main article: Regression analysis Regression analysis encompasses a large variety of statistical methods to estimate the relationship between input variables and their associated features. Its most common form is linear regression, where a single line is drawn to best fit the given data according to a mathematical criterion such as ordinary least squares. The latter is often extended by regularization (mathematics) methods to mitigate overfitting and bias, as in ridge regression. When dealing with non-linear problems, go-to models include polynomial regression (for example, used for trendline fitting in Microsoft Excel[70]), logistic regression (often used in statistical classification) or even kernel regression, which introduces non-linearity by taking advantage of the kernel trick to implicitly map input variables to higher-dimensional space. Bayesian networks Main article: Bayesian network A simple Bayesian network. Rain influences whether the sprinkler is activated, and both rain and the sprinkler influence whether the grass is wet. A Bayesian network, belief network, or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independence with a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning. Bayesian networks that model sequences of variables, like speech signals or protein sequences, are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams. Genetic algorithms Main article: Genetic algorithm A genetic algorithm (GA) is a search algorithm and heuristic technique that mimics the process of natural selection, using methods such as mutation and crossover to generate new genotypes in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s.[71][72] Conversely, machine learning techniques have been used to improve the performance of genetic and evolutionary algorithms.[73] Training models Usually, machine learning models require a lot of data in order for them to perform well. Usually, when training a machine learning model, one needs to collect a large, representative sample of data from a training set. Data from the training set can be as varied as a corpus of text, a collection of images, and data collected from individual users of a service. Overfitting is something to watch out for when training a machine learning model. Federated learning Main article: Federated learning Federated learning is an adapted form of distributed artificial intelligence to training machine learning models that decentralizes the training process, allowing for users' privacy to be maintained by not needing to send their data to a centralized server. This also increases efficiency by decentralizing the training process to many devices. For example, Gboard uses federated machine learning to train search query prediction models on users' mobile phones without having to send individual searches back to Google.[74] Applications There are many applications for machine learning, including: Agriculture Anatomy Adaptive websites Affective computing Banking Bioinformatics Brain–machine interfaces Cheminformatics Citizen science Computer networks Computer vision Credit-card fraud detection Data quality DNA sequence classification Economics Financial market analysis[75] General game playing Handwriting recognition Information retrieval Insurance Internet fraud detection Linguistics Machine learning control Machine perception Machine translation Marketing Medical diagnosis Natural language processing Natural language understanding Online advertising Optimization Recommender systems Robot locomotion Search engines Sentiment analysis Sequence mining Software engineering Speech recognition Structural health monitoring Syntactic pattern recognition Telecommunication Theorem proving Time series forecasting User behavior analytics In 2006, the media-services provider Netflix held the first "Netflix Prize" competition to find a program to better predict user preferences and improve the accuracy of its existing Cinematch movie recommendation algorithm by at least 10%. A joint team made up of researchers from AT&T Labs-Research in collaboration with the teams Big Chaos and Pragmatic Theory built an ensemble model to win the Grand Prize in 2009 for $1 million.[76] Shortly after the prize was awarded, Netflix realized that viewers' ratings were not the best indicators of their viewing patterns ("everything is a recommendation") and they changed their recommendation engine accordingly.[77] In 2010 The Wall Street Journal wrote about the firm Rebellion Research and their use of machine learning to predict the financial crisis.[78] In 2012, co-founder of Sun Microsystems, Vinod Khosla, predicted that 80% of medical doctors' jobs would be lost in the next two decades to automated machine learning medical diagnostic software.[79] In 2014, it was reported that a machine learning algorithm had been applied in the field of art history to study fine art paintings and that it may have revealed previously unrecognized influences among artists.[80] In 2019 Springer Nature published the first research book created using machine learning.[81] Limitations Although machine learning has been transformative in some fields, machine-learning programs often fail to deliver expected results.[82][83][84] Reasons for this are numerous: lack of (suitable) data, lack of access to the data, data bias, privacy problems, badly chosen tasks and algorithms, wrong tools and people, lack of resources, and evaluation problems.[85] In 2018, a self-driving car from Uber failed to detect a pedestrian, who was killed after a collision.[86] Attempts to use machine learning in healthcare with the IBM Watson system failed to deliver even after years of time and billions of dollars invested.[87][88] Bias Main article: Algorithmic bias Machine learning approaches in particular can suffer from different data biases. A machine learning system trained on current customers only may not be able to predict the needs of new customer groups that are not represented in the training data. When trained on man-made data, machine learning is likely to pick up the same constitutional and unconscious biases already present in society.[89] Language models learned from data have been shown to contain human-like biases.[90][91] Machine learning systems used for criminal risk assessment have been found to be biased against black people.[92][93] In 2015, Google photos would often tag black people as gorillas,[94] and in 2018 this still was not well resolved, but Google reportedly was still using the workaround to remove all gorillas from the training data, and thus was not able to recognize real gorillas at all.[95] Similar issues with recognizing non-white people have been found in many other systems.[96] In 2016, Microsoft tested a chatbot that learned from Twitter, and it quickly picked up racist and sexist language.[97] Because of such challenges, the effective use of machine learning may take longer to be adopted in other domains.[98] Concern for fairness in machine learning, that is, reducing bias in machine learning and propelling its use for human good is increasingly expressed by artificial intelligence scientists, including Fei-Fei Li, who reminds engineers that "There’s nothing artificial about AI...It’s inspired by people, it’s created by people, and—most importantly—it impacts people. It is a powerful tool we are only just beginning to understand, and that is a profound responsibility.”[99] Model assessments Classification of machine learning models can be validated by accuracy estimation techniques like the holdout method, which splits the data in a training and test set (conventionally 2/3 training set and 1/3 test set designation) and evaluates the performance of the training model on the test set. In comparison, the K-fold-cross-validation method randomly partitions the data into K subsets and then K experiments are performed each respectively considering 1 subset for evaluation and the remaining K-1 subsets for training the model. In addition to the holdout and cross-validation methods, bootstrap, which samples n instances with replacement from the dataset, can be used to assess model accuracy.[100] In addition to overall accuracy, investigators frequently report sensitivity and specificity meaning True Positive Rate (TPR) and True Negative Rate (TNR) respectively. Similarly, investigators sometimes report the false positive rate (FPR) as well as the false negative rate (FNR). However, these rates are ratios that fail to reveal their numerators and denominators. The total operating characteristic (TOC) is an effective method to express a model's diagnostic ability. TOC shows the numerators and denominators of the previously mentioned rates, thus TOC provides more information than the commonly used receiver operating characteristic (ROC) and ROC's associated area under the curve (AUC).[101] Ethics Machine learning poses a host of ethical questions. Systems which are trained on datasets collected with biases may exhibit these biases upon use (algorithmic bias), thus digitizing cultural prejudices.[102] For example, using job hiring data from a firm with racist hiring policies may lead to a machine learning system duplicating the bias by scoring job applicants against similarity to previous successful applicants.[103][104] Responsible collection of data and documentation of algorithmic rules used by a system thus is a critical part of machine learning. Because human languages contain biases, machines trained on language corpora will necessarily also learn these biases.[105][106] Other forms of ethical challenges, not related to personal biases, are more seen in health care. There are concerns among health care professionals that these systems might not be designed in the public's interest but as income-generating machines. This is especially true in the United States where there is a long-standing ethical dilemma of improving health care, but also increasing profits. For example, the algorithms could be designed to provide patients with unnecessary tests or medication in which the algorithm's proprietary owners hold stakes. There is huge potential for machine learning in health care to provide professionals a great tool to diagnose, medicate, and even plan recovery paths for patients, but this will not happen until the personal biases mentioned previously, and these "greed" biases are addressed.[107] Hardware Since the 2010s, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks (a particular narrow subdomain of machine learning) that contain many layers of non-linear hidden units.[108] By 2019, graphic processing units (GPUs), often with AI-specific enhancements, had displaced CPUs as the dominant method of training large-scale commercial cloud AI.[109] OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017), and found a 300,000-fold increase in the amount of compute required, with a doubling-time trendline of 3.4 months.[110][111] Software Software suites containing a variety of machine learning algorithms include the following: Free and open-source so
BangaloreSharks
Automate swing trading using deep reinforcement learning. The deep deterministic policy gradient-based neural network model trains to choose an action to sell, buy, or hold the stocks to maximize the gain in asset value. The paper also acknowledges the need for a system that predicts the trend in stock value to work along with the reinforcement learning algorithm. We implement a sentiment analysis model using a recurrent convolutional neural network to predict the stock trend from the financial news. The objective of this paper is not to build a better trading bot, but to prove that reinforcement learning is capable of learning the tricks of stock trading.
📈 A neural network regression model trained to predict the mean stock price percentage change everyday using financial factors like previous close price, actual previous close price, open price, market capitalization, total market value, price-to-earning ratio and price-to-book ratio, along with corresponding Sina Weibo and Baidu News sentiment scores.
Built a sentiment analysis model to predict the sentiment of a Financial News article. A comparative study of different optimizers used for training was done.
asupraja3
An End-to-End Machine Learning Pipeline that combines financial news sentiment analysis with historical stock price features to predict short-term market movements.
Poulinakis-Konstantinos
Predicting the price movement of stocks using past prices and sentiment analysis scores from financial News.
ananya2001gupta
Identify the software project, create business case, arrive at a problem statement. REQUIREMENT: Window XP, Internet, MS Office, etc. Problem Description: - 1. Introduction of AI and Machine Learning: - Artificial Intelligence applies machine learning, deep learning and other techniques to solve actual problems. Artificial intelligence (AI) brings the genuine human-to-machine interaction. Simply, Machine Learning is the algorithm that give computers the ability to learn from data and then make decisions and predictions, AI refers to idea where machines can execute tasks smartly. It is a faster process in learning the risk factors, and profitable opportunities. They have a feature of learning from their mistakes and experiences. When Machine learning is combined with Artificial Intelligence, it can be a large field to gather an immense amount of information and then rectify the errors and learn from further experiences, developing in a smarter, faster and accuracy handling technique. The main difference between Machine Learning and Artificial Intelligence is , If it is written in python then it is probably machine learning, If it is written in power point then it is artificial intelligence. As there are many existing projects that are implemented using AI and Machine Learning , And one of the project i.e., Bitcoin Price Prediction :- Bitcoin (₿ ) (founder - Satoshi Nakamoto , Ledger start: 3 January 2009 ) is a digital currency, a type of electronic money. It is decentralized advanced cash without a national bank or single chairman that can be sent from client to client on the shared Bitcoin arrange without middle people's requirement. Machine learning models can likely give us the insight we need to learn about the future of Cryptocurrency. It will not tell us the future but it might tell us the general trend and direction to expect the prices to move. These machine learning models predict the future of Bitcoin by coding them out in Python. Machine learning and AI-assisted trading have attracted growing interest for the past few years. this approach is to test the hypothesis that the inefficiency of the cryptocurrency market can be exploited to generate abnormal profits. the application of machine learning algorithms to the cryptocurrency market has been limited so far to the analysis of Bitcoin prices, using random forests , Bayesian neural network , long short-term memory neural network , and other algorithms. 2. Applications/Scope of AI and Machine Learning :- a) Sentiment Analysis :- It is the classification of subjective opinions or emotions (positive, negative, and neutral) within text data using natural language processing. b) It is Characterized as a use of computerized reasoning where accessible data is utilized through calculations to process or help the handling of factual information. BITCOIN PRICE PREDICTION USING AI AND MACHINE LEARNING: - The main aim of this is to find the actual Bitcoin price in US dollars can be predicted. The chance to make a model equipped for anticipating digital currencies fundamentally Bitcoin. # It works the prediction by taking the coinMarkup cap. # CoinMarketCap provides with historical data for Bitcoin price changes, keep a record of all the transactions by recording the amount of coins in circulation and the volume of coins traded in the last 24-hours. # Quandl is used to filter the dataset by using the MAT Lab properties. 3. Problem statement: - Some AI and Machine Learning problem statements are: - a) Data Privacy and Security: Once a company has dug up the data, privacy and security is eye-catching aspect that needs to be taken care of. b) Data Scarcity: The data is a very important aspect of AI, and labeled data is used to train machines to learn and make predictions. c) Data acquisition: In the process of machine learning, a large amount of data is used in the process of training and learning. d) High error susceptibility: In the process of artificial intelligence and machine learning, the high amount of data is used. Some problem statements of Bitcoin Price Prediction using AI and Machine Learning: - a) Experimental Phase Risk: It is less experimental than other counterparts. In addition, relative to traditional assets, its level can be assessed as high because this asset is not intended for conservative investors. b) Technology Risks: There is a technological risk to other cryptocurrencies in the form of the potential appearance of a more advanced cryptocurrency. Investors may simply not notice the moment when their virtual assets lose their real value. c) Price Variability: The variability of the value of cryptocurrency are the large volumes of exchange trading, the integration of Bitcoin with various companies, legislative initiatives of regulatory bodies and many other, sometimes disregarded phenomena. d) Consumer Protection: The property of the irreversibility of transactions in itself has little effect on the risks of investing in Bitcoin as an asset. e) Price Fluctuation Prediction: Since many investors care more about whether the sudden rise or fall is worth following. Bitcoin price often fluctuates by more than 10% (or even more than 30%) at some times. f) Lacks Government Regulation: Regulators in traditional financial markets are basically missing in the field of cryptocurrencies. For instance, fake news frequently affects the decisions of individual investors. g) It is difficult to use large interval data (e.g., day-level, and month-level data) . h) The change time of mining difficulties is much longer. Moreover, do not consider the news information since it is hard to determine the authenticity of a news or predict the occurrence of emergencies.
safteinzz
A machine learning-driven platform that integrates sentiment analysis and advanced predictive modeling to forecast stock market trends. It leverages TensorFlow, financial news analysis, and technical indicators within a Django web interface to offer insightful stock price predictions.
amratansh
We compiled the analyst reports from Morningstar for 15 largest companies in retail and technology sector and extracted the specific text. Then extracteed sentiments using VADER general sentiment lexicon and through Loughran and MCdonald financial sentiment lexicon. S&P Capital IQ and Yahoo Finance was also our data source. We applied statistical modeling, both linear and logisitc regressions to predict the percentage change in the stock price from day of publication of report to 3 time periods and our model showed some sigificant results with over 95% accuracy and validated our hypothesis.
shubhamkotal
FinBERT is a pre-trained NLP model to analyze the sentiment of the financial text. It is built by further training the BERT language model in the finance domain, using a large financial corpus and thereby fine-tuning it for financial sentiment classification. For the details, please see FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. The project was deployed on the flask. Providing an indicator to buy, hold or sell stock based on the sentiment predicted for the following news headline/article.
This project predicts corporate default probability using sentiment analysis of annual reports and financial data from U.S. companies. Key steps included data preprocessing, textual analysis, feature selection, and testing machine learning models. A Random Forest classifier was identified as the best model, trained, optimized, and exported for use.
FelixCharotte
NLP Project based on sentiment analysis of financial news to predict investment decision on stock market.
This repository contains code for analyzing the correlation between financial news sentiment and stock market movements. Using NLP for sentiment analysis and statistical techniques for correlation, the project aims to enhance predictive analytics in financial forecasting. Python, TA-Lib, PyNance, and GitHub Actions are utilized.
Sankethhhh
Stock-market LLM: A Language Model for Financial Analysis and Prediction in Stock Markets. Leveraging state-of-the-art NLP techniques to analyze market sentiment, predict trends, and provide insights for informed decision-making. Powered by cutting-edge deep learning algorithms.
DanielHaggstrom
Archived academic project exploring whether weekly sentiment extracted from Financial Times coverage can help explain or predict S&P 500 stock movements. It combines market data collection, news scraping, sentiment scoring, dataset generation, and a Dash interface for browsing the resulting prediction outputs and headlines.
An end-to-end application that predicts stock price movements using sentiment analysis of financial news headlines. Powered by machine learning, NLP, and real-time data integration, this project offers investors a reliable tool for data-driven decision-making.
AlexRomanenkoNorthwestern
This stock screener analyzes correlations between a company’s financial performance and its website, search, and foot traffic. It takes into account several alternative data sources to predict a company's future earnings by performing linear regressions on data from previous quarters. In addition, it performs sentiment analysis of news headlines, tracks insider buys, analyst price target changes, tweets, and reddit discussion. Stock analysis in displayed in one comprehensive dashboard.
robcamp-code
Using NLP for the text classification task of predicting financial news sentiment based on article headlines
kktallam
The central objective is to explore how sentiment extracted from financial news articles can be quantified and used as a predictive signal in systematic trading models, particularly when combined with well-established risk factors from academic finance.
SyedAliHaider02
NewsPulse is an AI-powered application designed to predict the sentiment of financial news headlines. By utilizing machine learning models, the application categorizes news articles into positive, neutral, or negative sentiments, providing valuable insights for financial decision-making.
Alpha-Mintamir
This project explores how the sentiment (positive or negative) of financial news affects stock market movements. It involves processing large datasets, analyzing news sentiment, and using machine learning to predict stock trends.
bhavyadubey
Retrieved financial institutions data using Twitter’s API, created a database, and performed pre-processing of this data • Created an AI module which classified each tweet into three different sentiment groups: Positive, Negative and Neutral Developed a POC which predicted the stock price of these financial institutions based on the different sentiment groups
Aistox– A hybrid AI-driven system that combines financial fundamentals, technical indicators, macroeconomic data, and real-time sentiment analysis from news and social media to predict stock price movements with greater contextual accuracy.
ThomasMathewz
Ohari is a software solution that integrates Long Short-Term Memory (LSTM) models with sentiment analysis of financial news and historical data to predict the next day's stock prices for selected stocks.
jmassengille
Machine learning project that aims to analyze stock market data and develop predictive models for trading insights. Using Python, Pandas, Scikit-learn, and TensorFlow, we process historical stock price data, financial indicators, and news sentiment to create feature-rich datasets.
rohanchutke
It used to take days for financial news to spread via radio, newspapers, and word of mouth. Now, in the age of the internet, it takes seconds. Did you know news articles are automatically being generated from figures and earnings call streams? In this project, you will generate investing insight by applying sentiment analysis on financial news headlines from Finviz. Using this natural language processing technique, you will understand the emotion behind the headlines and predict whether the market feels good or bad about a stock. This project lets you apply the skills from Intermediate Python for Data Science, Manipulating DataFrames with pandas, and Natural Language Processing Fundamentals in Python. We recommend that you take those courses before starting this project. Familiarity with the Beautiful Soup package may also be helpful. The datasets used in this project are raw HTML files for the Facebook (FB) and Tesla (TSLA) stocks from FINVIZ.com, a popular website dedicated to stock information and news.
Introduction This project looks at the mergers and acquisitions of 30 publicly traded companies and attempts to determine the stock price at closing. M&As are incredibly difficult to assess, and while the company's instrinsic value and fundamentals play a significant role in predicting whether a merger will be "successful", public sentiment from Wall Street investors is another commonly referenced topic. Brainstorming for this project prompted two notable observations; data on M&As are often incomplete and highly inconsistent given the confidentiality behind these deals, and determining an appropriate dependent variable y for analysis presents a significant challenge (would most likely require an additional project on its own). The success of a merger could be measured various ways, but often times the unpredictability of management makes all the more challenging. Culture, reorganization, and leadership shake-up are all attributes that play an important role in the success of an M&A but are difficult to quantify. Although I do build and run a model in this proejct, the complexity around this subject urged me to focus primarily on data gathering and manipulation. Since one would most likely need to compose a dataframe with the attributes necessary to run an a useful analysis on Mergers and Acquisition, I believe this is a valuable first step. For a more balanced notebook between EDA, data manipulation, and models, I have a project that focuses on COVID19's impact on Post-Secondary Education below titled COVID19 Effects on Post-Secondary Education https://github.com/clozgil The Process My objective was to build a dataframe with useful attribtues from scratch. I found that three reports per company would have sufficient information to get started. Acquistion data (any and all information on the company's M&A) Financial ratios (data to determine the company's fundamentals) Stock information (data to gain insight into Wall Street sentiment) Since downloading, importanting, and cleaning each one of those files for each of the 30 companies would be cumbersome, I looped on all the data files using the OS module, simulteanously cleaning and merging each one of the files. However, for the purpose of this presentation, I will feature each one of my data cleaning techniques for one company - Apple. Data Sources For reference only. All necessary data for this project can be found in the data dictionary Acquisition data: https://www.capitaliq.com/CIQDotNet/my/dashboard.aspx * Financial ratios: https://www-mergentonline-com.pitt.idm.oclc.org/companyfinancials.php?pagetype=ratios&compnumber=46247&period=Quarters&range=50&Submit=Refresh&csrf_token_mol=3680683535 * Stock info: https://www-mergentonline-com.pitt.idm.oclc.org/equitypricing.php?pagetype=report&compnumber=46247 * (*) = Account required. University of Pittsburgh account used for access
A full-fledged machine learning pipeline to predict Buy / Hold / Sell signals for S&P 500 stocks using sentiment analysis on Twitter & News articles, historical stock performance, and engineered financial indicators. Stock Sentiment-Based Action Predictor
Matts52
An end-to-end analysis on the explanatory and predictive power of sentiment-weighted topic attention of financial news on key market volatility indicators
morghayn
Predicting Price Movements of Financial Securities using Sentiment Analysis of Reddit Data.