Source: Deep Learning on Medium
Time Series Analysis with Deep Learning : Simplified
Take the crash course in the ‘whys’ and ‘whens’ of using Deep Learning in Time Series Analysis.
What is Time Series Analysis?
A time series is a sequence of data points ordered by their time stamps. And time series analysis is... you guessed it... analysis of time series data 😛
From the daily price of your favorite fruit to the readings of the voltage output provided by a circuit, the scope of time series is huge and so is the field of time series analysis. Analyzing time series data usually focuses on forecasting, but can also include classification, clustering, anomaly detection, etc. For example, by studying the pattern of price variation in the past, you can try forecasting the price of that watch you have been eyeing for so long, to judge the best time to buy it!!
Why Deep Learning?
Time series data can be highly erratic and complex. Deep Learning methods make few assumptions about the underlying pattern in the data and can be fairly robust to noise (which is quite common in time series data), making them a strong choice for time series analysis.
Before we move on to predicting, it is important to first process our data into a form that a mathematical model can understand. Time series data can be transformed into a supervised learning problem by using a sliding window to cut out datapoints. The expected output for each sliding window is then the value at the timestep right after the end of the window.
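To make the sliding window idea concrete, here is a minimal sketch in plain Python. The function name `make_windows`, the `window_size` parameter and the toy `prices` list are all made up for illustration; real pipelines usually do this with array libraries.

```python
def make_windows(series, window_size):
    """Turn a 1-D sequence into (inputs, targets) pairs for supervised learning."""
    inputs, targets = [], []
    # Each window covers `window_size` consecutive points;
    # its target is the single point right after the window.
    for start in range(len(series) - window_size):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


# Hypothetical toy data: six daily prices.
prices = [10, 11, 13, 12, 15, 16]
X, y = make_windows(prices, window_size=3)
# X == [[10, 11, 13], [11, 13, 12], [13, 12, 15]]
# y == [12, 15, 16]
```

Each row of `X` is one training input and the matching entry of `y` is what the model should predict for it.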
Recurrent Networks are Deep Learning networks with a twist… they can remember the past, and are thus preferred for sequence processing. RNN cells are the backbone of Recurrent Networks. RNN cells have 2 incoming connections, the input and the previous state. Similarly, they also have 2 outgoing connections, the output and the current state. This state helps them combine information from the past and the current input.
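The two-in, two-out picture of an RNN cell can be sketched as below. This is a toy version that assumes scalar inputs and a scalar state, with hand-picked weights (`w_x`, `w_h`, `b` are illustrative values, not learned ones); real cells use weight matrices and vectors.

```python
import math


def rnn_cell(x, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN cell: returns (output, new_state)."""
    # Combine the current input with the previous state.
    h = math.tanh(w_x * x + w_h * h_prev + b)
    # In a vanilla RNN the output is simply the new state.
    return h, h


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the cell, carrying the state forward."""
    h = 0.0  # initial state: nothing remembered yet
    outputs = []
    for x in sequence:
        out, h = rnn_cell(x, h, w_x, w_h, b)
        outputs.append(out)
    return outputs, h


# Only the first input is non-zero, yet later outputs are still non-zero:
# the state carries information from the past forward.
outputs, final_state = run_rnn([1.0, 0.0, 0.0])
```

Notice that the later outputs get smaller at each step, which already hints at why a plain RNN mostly remembers the recent past.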
A simple RNN cell is too simplistic to be used uniformly for time series analysis across various domains. So many variations have been proposed over the years to adapt Recurrent Networks to those domains, but the core idea remains the same!!
LSTMs over RNNs
LSTM cells are special RNN cells with ‘gates’ in them. Each gate is essentially a set of values between 0 and 1 that scales how much information gets through: a value near 0 blocks it and a value near 1 lets it pass. The intuition behind these gates is to forget or retain information from the past, which allows LSTMs to remember more than just the immediate past. No one can explain LSTMs better than Colah’s blog, so just visit it if you haven’t already.
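Here is a rough sketch of how the gates act inside one LSTM step, following the standard LSTM equations but simplified to scalars. The parameter dictionary `p` and its hand-picked values are purely illustrative assumptions; real LSTMs learn weight matrices for each gate.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell. `p` maps a gate name to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # the new information itself
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state / output
    return h, c


# Illustrative parameters: forget gate pushed towards 1, input gate towards 0,
# so the cell should hold on to its old memory and ignore the new input.
remember = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_cell(x=5.0, h_prev=0.0, c_prev=2.0, p=remember)
# c stays very close to the old cell state 2.0
```

Flipping the forget gate bias to a large negative value would wipe the memory instead, which is exactly the "forget or retain" behaviour described above.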
Since the state information has to travel through every timestep, RNNs are capable of remembering only the recent past. Gated networks like LSTMs and GRUs, on the other hand, can handle comparatively longer sequences, but even these networks have their limits!! For a better understanding of this issue, one can also look into vanishing and exploding gradients.
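For a quick feel of vanishing and exploding gradients, consider this deliberately oversimplified toy: it assumes the gradient gets multiplied by the same scalar factor at every timestep. Real networks multiply by matrices that change from step to step, but the compounding effect is similar. The function name `gradient_scale` is made up for this example.

```python
def gradient_scale(per_step_factor, num_steps):
    """Size of a gradient after being scaled by the same factor at every timestep."""
    return per_step_factor ** num_steps


# A factor slightly below 1 shrinks the signal towards zero over 100 steps...
vanishing = gradient_scale(0.9, 100)
# ...while a factor slightly above 1 blows it up.
exploding = gradient_scale(1.1, 100)
```

Over 100 timesteps the first value ends up tiny and the second ends up huge, so distant timesteps either stop contributing to learning or destabilize it.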
So what to do with really long sequences?? Well, the obvious solution is to shorten them!!! But how? One way is to discard the fine-grained time information present in the signal. This can be done by accumulating small groups of datapoints together and creating features from them, which are then passed to the LSTM as if they were a single datapoint.
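The grouping step could look something like the sketch below. It assumes hand-crafted summary features (mean, min, max) as a stand-in for features that a network would learn from each group; `summarize_chunks` and `chunk_size` are hypothetical names.

```python
def summarize_chunks(series, chunk_size):
    """Replace each group of `chunk_size` points with (mean, min, max) features."""
    features = []
    # Step through the series one full chunk at a time;
    # a trailing partial chunk is dropped for simplicity.
    for start in range(0, len(series) - chunk_size + 1, chunk_size):
        chunk = series[start:start + chunk_size]
        features.append((sum(chunk) / chunk_size, min(chunk), max(chunk)))
    return features


# Hypothetical signal with 8 timesteps, shortened to 2 "datapoints".
signal = [1, 3, 2, 6, 4, 8, 7, 9]
short = summarize_chunks(signal, chunk_size=4)
# short == [(3.0, 1, 6), (7.0, 4, 9)]
```

The recurrent network now only has to step through 2 items instead of 8, at the cost of no longer seeing the exact order of points inside each group.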
Multi-Scale Hierarchical LSTMs
A CNN-LSTM uses a CNN to combine those small groups of datapoints into features for the LSTM. Looking at that architecture, one thing comes to mind… Why use CNNs for combining those small groups?? Why not just use a different LSTM there too!!! Multi-Scale Hierarchical LSTMs are built on the same idea.
Inputs are processed at multiple scales, and each scale is devoted to doing something unique. The lower scale, which works on the more fine-grained inputs, focuses on delivering fine-grained (but only recent) time information. The upper scale, on the other hand, focuses on providing the complete picture (but with no fine-grained details). Together, multiple scales can deliver a better understanding of the time series.
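A two-scale hierarchy can be sketched roughly as follows. To keep it short, this toy uses simple scalar tanh cells standing in for LSTM cells, shares the same illustrative weights across both scales, and restarts the lower scale at every chunk. All of these are simplifying assumptions, not how any particular published model does it.

```python
import math


def step(x, h, w_x, w_h):
    """One step of a simple scalar recurrent cell (stand-in for an LSTM cell)."""
    return math.tanh(w_x * x + w_h * h)


def two_scale_forward(series, chunk_size, w_x=0.5, w_h=0.5):
    """Run a lower-scale cell inside each chunk and an upper-scale cell across chunks."""
    upper_h = 0.0
    upper_states = []
    lower_h = 0.0
    for start in range(0, len(series) - chunk_size + 1, chunk_size):
        lower_h = 0.0  # lower scale starts fresh: it only sees recent, fine-grained data
        for x in series[start:start + chunk_size]:
            lower_h = step(x, lower_h, w_x, w_h)
        # The upper scale sees one summary per chunk, so it takes far fewer steps.
        upper_h = step(lower_h, upper_h, w_x, w_h)
        upper_states.append(upper_h)
    return lower_h, upper_states


# 12 timesteps, but the upper scale only takes 3 steps.
series = [0.1 * t for t in range(12)]
recent_detail, big_picture = two_scale_forward(series, chunk_size=4)
```

Here `recent_detail` plays the role of the fine-grained (recent) information and `big_picture` the coarse view of the whole series.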
Time Series Analysis is a very old field and contains various inter-disciplinary problem statements, each with its own set of challenges. However, despite the fact that each domain tunes the model to its own requirements, there are still certain general research directions in time series analysis that need to be improved upon. For example, every development from very basic RNN cells to Multi-Scale Hierarchical LSTMs has in some way focused on processing longer sequences, but even the latest LSTM modifications have their own sequence length limitations, and we still don’t have an architecture that can truly handle extremely long sequences.