While trading may be the most exciting and lucrative application domain for machine learning, it is also one of the most challenging. Trading is not only about buying or selling, nor is it just about analysing the financial state of a target company. One of the reasons why it is so difficult to be a top trader is that it requires taking into account a large amount of data of very different kinds. This also explains the machine learning hype in trading: text, speech, numbers, images... machine learning algorithms can deal with almost any type of data. In this series of articles, we will introduce an implementation of a less common deep learning approach to stock price trend prediction based on financial news. Our inspiration comes from the recent research paper “Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction”, which we will refer to as LCW.
Recent trends in research papers and blog articles
Many approaches introduced in recent research papers are incomplete. One of them consists in designing an algorithm based only on the last few days' stock prices; recurrent neural networks are widely used for that purpose. Another is to use sentiment analysis in the trading policy. If you are familiar with machine learning for trading, you have certainly come across “Stock trading with Twitter using sentiment analysis”. Reinforcement learning is also trendy now, as shown in this paper released in July 2018. While these solutions give impressive results, we think that they underuse the potential power of machine learning algorithms.
What makes this paper so special?
Nowadays, A.I. is trying to become more human. And algorithmic trading is no exception. Some of the recently published research papers try to design frameworks that imitate real investors. LCW is one of them and that’s why we chose it.
Where is the innovation?
To us, LCW does a better job of replicating human behavior than many other papers on this topic. As you will see, this paper is mainly about text mining. Now, imagine you are an investor trying to predict the variation of one stock tomorrow. You may try to gather as much information as possible about the company from the last few days, and from that you get an idea of how the stock price might evolve over the next days. This is the use case that LCW tries to solve, using a deep learning framework that takes time sequences of articles as input.
The authors have taken into account three characteristics of the learning process followed by an investor struggling with the “chaotic news”:
- First, Sequential Context Dependency. This simply refers to the fact that a single news item is more informative within a broader context than in isolation.
- Second, Diverse Influence. One critical news item can affect the stock price for weeks, whereas a trivial one may have zero effect.
- Third, Effective and Efficient Learning. This means learning from the more common situations before turning to the exceptional cases.
This is not a theoretical paper but rather a math-engineering paper. The design process might have gone like this:
– We have to deal with sequences of press articles. What neural network could we use for that?
– Recurrent neural networks are used for sequence modeling. We can use them here.
– OK. Which one is easier to train and does not kill the gradient?
– A GRU neural network.
– Good!
– Now, how do we deal with diverse influence? I want my algorithm to focus on the most important articles.
– An attention mechanism!
– Alright, now give me a simple neural network to perform a three-class classification.
– A multilayer perceptron!
There you have it.
As you can see, the innovation is in the design of the whole framework, and in the way the authors connect different neural networks to solve the whole problem. We will let you read more about the architecture in section 4.2 of the paper.
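To make the "focus on the most important articles" idea concrete, here is a minimal sketch of attention pooling in plain Python. It is only an illustration under stated assumptions: the article vectors are toy values, the scoring vector `w` is a hypothetical stand-in for learned parameters, and the exact scoring function used in the paper (section 4.2) may differ.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, w):
    """Return the attention-weighted average of `vectors` and the weights.

    Each vector is scored by a dot product with `w`; in a real model `w`
    would be learned, here it is a hypothetical fixed value.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, v)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(a * v[i] for a, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

# Three toy "article vectors" published on the same day (made-up numbers).
articles = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
w = [1.0, 1.0]  # hypothetical scoring vector
day_vector, weights = attention_pool(articles, w)
```

The second article gets the largest weight because it has the highest score, so it dominates the pooled vector for that day. The same pooling can be applied a second time over the sequence of days.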
They have also implemented a Self-Paced Learning algorithm, which aims at more effective and efficient learning. You may read section 4.3 for more information about this algorithm. We did not have enough time to implement this part of the paper.
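For readers who have never met self-paced learning, the general idea (not the specific formulation of section 4.3, which we did not implement) can be sketched as follows: train first on the samples the model currently finds easy, meaning those with a low loss, and progressively raise the threshold so that harder samples are admitted. The function names, the loss values and the schedule below are all hypothetical.

```python
def self_paced_select(losses, threshold):
    """Indices of the samples whose current loss is below the threshold."""
    return [i for i, loss in enumerate(losses) if loss < threshold]

def self_paced_schedule(losses, start, growth, rounds):
    """Selected indices for each round as the threshold grows.

    Simplification: the losses are kept fixed here. In a real training
    loop they would be recomputed after each round of training.
    """
    threshold = start
    schedule = []
    for _ in range(rounds):
        schedule.append(self_paced_select(losses, threshold))
        threshold *= growth  # admit harder samples in the next round
    return schedule

losses = [0.2, 1.5, 0.4, 3.0]  # made-up per-sample losses
schedule = self_paced_schedule(losses, start=0.5, growth=2.0, rounds=4)
```

With these toy numbers the first round only uses samples 0 and 2, and the last round uses all four.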
Pierrick and I are two French engineering students, currently in the second year of our Master's program. We are by no means experts in trading, and we are beginners in machine learning. This is our first paper implementation, and our first technical blog post as well, so we are open to any constructive criticism of both our code and our articles.
Our workflow is divided into 4 steps:
- Article Scraping
For the purpose of this research project, we scraped all the articles published on reuters.com from 2015 to 2017. We mainly used the BeautifulSoup and urllib libraries, along with the multiprocessing library. And yes, the whole project is in Python 3.
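As a rough idea of what the parsing side of a scraper does, here is a small sketch that extracts paragraph text from an HTML page. To keep it dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, and it works on a made-up HTML string rather than on a real Reuters page. Downloading pages (for instance with `urllib.request.urlopen`) and parallelizing the work (for instance with `multiprocessing.Pool`) are deliberately left out.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> tags, ignoring everything else."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self._buffer = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            text = "".join(self._buffer).strip()
            if text:
                self.paragraphs.append(text)
            self._buffer = []
            self._in_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still part of the paragraph.
        if self._in_paragraph:
            self._buffer.append(data)

def extract_article_text(html):
    """Return the paragraphs of an HTML page joined by newlines."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)

# Made-up page, not actual Reuters markup.
sample_html = (
    "<html><body><h1>Title</h1>"
    "<p>Shares rose <b>2%</b> on Monday.</p>"
    "<p>Analysts were surprised.</p>"
    "</body></html>"
)
article_text = extract_article_text(sample_html)
```

A real page needs more care (which tags hold the article body, how to skip navigation and ads), which is where a library like BeautifulSoup becomes convenient.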
- Article Vectorization
We chose not to follow the paper on this part. After collecting more than 1 million articles (see section 5.1.1 of the paper), the authors trained a Word2Vec model on the whole vocabulary of their articles. They then computed the mean of the vectors of all the words in an article to obtain a vector representation of it. We preferred to use Doc2Vec for a better representation of each article. Our choice was inspired by this comparison. We used the Gensim library for that.
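The word-averaging baseline described in the paper can be sketched in a few lines of plain Python. This is only an illustration: the two-dimensional `toy_embeddings` are made up, whereas a real setup would use vectors from a trained Word2Vec model, and skipping unknown words is our own assumption.

```python
def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding.

    Tokens missing from `embeddings` are skipped (an assumption made
    for this sketch).
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no known tokens in this article")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Made-up word vectors standing in for a trained Word2Vec vocabulary.
toy_embeddings = {
    "shares": [1.0, 0.0],
    "rose": [0.0, 1.0],
    "fell": [0.0, -1.0],
}
article_vector = mean_vector(["shares", "rose", "sharply"], toy_embeddings)
```

Averaging ignores word order, so any two articles containing the same words get the same vector. Doc2Vec instead learns one vector per document during training; in Gensim this is provided by the `Doc2Vec` model.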
- Dataset Creation
This part consisted in creating a dataset that the HAN (Hybrid Attention Network, the model proposed in the paper) can train on. Our input X is a time sequence of vectorized articles from day t-10 to day t, and the target value Y is the variation of the corresponding stock on day t+1. The Scikit-Learn and Pickle libraries were very helpful for this task.
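The windowing logic can be sketched as follows. This is a simplified illustration, not our actual pipeline: each day is reduced to a single vector (in reality a day holds several articles), and the class names and the 0.5% threshold used to turn a price variation into one of three classes are assumptions made for the example.

```python
def label_variation(change, threshold=0.005):
    """Map a relative price change to one of three classes.

    The threshold and the class names are hypothetical.
    """
    if change > threshold:
        return "UP"
    if change < -threshold:
        return "DOWN"
    return "PRESERVE"

def build_samples(daily_vectors, daily_changes, window=11):
    """Pair each window of days ending at day t with the label of day t+1.

    window=11 covers days t-10 to t inclusive.
    """
    samples = []
    for t in range(window - 1, len(daily_vectors) - 1):
        x = daily_vectors[t - window + 1 : t + 1]
        y = label_variation(daily_changes[t + 1])
        samples.append((x, y))
    return samples

# 13 toy days: one 1-d vector per day, and made-up daily price changes.
daily_vectors = [[float(day)] for day in range(13)]
daily_changes = [0.0] * 13
daily_changes[11] = 0.02
daily_changes[12] = -0.03
samples = build_samples(daily_vectors, daily_changes)
```

With 13 days and an 11-day window, only two samples can be built: days 0 to 10 predicting day 11, and days 1 to 11 predicting day 12.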
- Model Training
We used Keras to build the model. The TimeDistributed wrapper was of great use for applying the attention mechanism to the input and output of the GRU network.
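In Keras, `TimeDistributed` applies the same layer, with the same weights, to every time step of its input. The plain-Python sketch below mimics that behavior to show why it is handy for attention: one shared scoring layer produces one score per time step. It is an illustration of the idea, not Keras code and not our actual model; the `dense` helper and its weights are hypothetical.

```python
def dense(weights, bias):
    """Return a function computing a single dense unit: w . x + b."""
    def layer(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return layer

def time_distributed(layer, sequence):
    """Apply the same layer (shared weights) to every time step."""
    return [layer(step) for step in sequence]

scorer = dense(weights=[0.5, -1.0], bias=0.1)    # hypothetical weights
sequence = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # three toy time steps
scores = time_distributed(scorer, sequence)
```

Passing these per-step scores through a softmax gives attention weights over the sequence.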
Implementing this paper was thrilling, and we look forward to writing about each step of the implementation. Many thanks to the authors for this inspiring paper.
Source: Deep Learning on Medium