Deep Learning Models for Time Series Analysis over Traditional Machine Learning Models

Source: Deep Learning on Medium


Time series forecasting is an important area of machine learning. It is important because there are so many prediction problems that involve a time component. However, while the time component adds additional information, it also makes time series problems more difficult to handle compared to many other prediction tasks.

There are several types of models that can be used for time-series forecasting. But traditional machine learning models often fall short at time series analysis, due to certain challenges:

Challenge 1: The model has to be retrained every time a new forecast is needed:

For most ML models, you train a model, test it, retrain it as necessary until you've gotten satisfactory results, and then evaluate it on a holdout/validation data set. Once you are satisfied with the model's performance, it is deployed into production. Eventually, after a few months, you might want to update your model if a significant amount of new training data comes in. Model training is a one-time activity, or is done at most at periodic intervals to maintain the model's performance and take new information into account.

For time series models, this is not the case. Instead, we have to retrain our model from scratch every time we want to generate a new forecast.

We will use ARIMA models to forecast Australian quarterly beer sales (the data set is taken from Hyndman's forecast package in R). First we train a model on data from 1956 to 1970, then test it on data from 1970 to 1973, using a seasonal ARIMA(1,1,1)(0,1,1) model. We get a reasonably good forecast (MAPE = 1.94%).

Next, if we use the same model to forecast sales all the way to 1993, the forecast is no longer as good. The same ARIMA(1,1,1)(0,1,1) model forecasts the sales for 1990 to 1993 with a much worse accuracy of MAPE = 44.92%.

Instead, we would have to refit a second model that takes into account the new data and the changes in the pattern of the sales, say an ARIMA(1,1,1)(1,1,2) model. With it, accuracy is again reasonably good: MAPE = 5.22%.

A common pitfall in ML projects occurs when the distribution of the development data set and the distribution of the production data set are not the same. For time series, this is almost always the case. The only way around it is to retrain your model every time you get new data.

Note that this is not the same as continuous learning, where an already trained model is updated as new data comes in (although it would be an interesting research topic to see if continuous learning can be applied to time series forecasting).

Challenge 2: Train/test splits not possible:

Let’s circle back to the basic approach to finding an ML model. Usually you build a model using a train set and then evaluate it on a test set. This requires that you have enough data to set aside a test set and still have data to build a model with. But time series data is often very small compared to the data sets used in image processing or NLP. Two years of weekly sales data for a product at a given location is only 104 data points (barely enough to capture any seasonality). With data sets as small as this, we don’t have the luxury of setting aside 20% or 30% of the data for testing purposes.

Cross-validation, performed in a particular fashion, can help us split time series data to train and test the model.

Splitting a time-series dataset randomly does not work, because the temporal order of your data would be destroyed. For a time series forecasting problem, we perform cross-validation in the following manner:

1. Folds for time series cross-validation are created in a forward-chaining fashion.

2. Suppose we have a time series of yearly consumer demand for a product over a period of n years. The folds would be created like this:

fold 1: train [1], test [2]
fold 2: train [1, 2], test [3]
fold 3: train [1, 2, 3], test [4]
…
fold n-1: train [1, 2, …, n-1], test [n]

We progressively select new train and test sets. We start with a train set that has the minimum number of observations needed to fit the model; with each subsequent fold, the train set grows and the test set shifts forward in time.
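The forward-chaining scheme above is exactly what scikit-learn's TimeSeriesSplit implements:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 10 yearly observations of, say, consumer demand for a product
X = np.arange(10).reshape(-1, 1)

# Each fold trains only on observations that come before the test
# window, so the temporal order is never violated.
tscv = TimeSeriesSplit(n_splits=4)
for train_idx, test_idx in tscv.split(X):
    print(f"train: {train_idx} test: {test_idx}")
```

The first fold trains on the smallest usable prefix of the series, and each later fold extends the training window while the test window moves forward, mirroring the fold listing above.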

Challenge 3: The uncertainty of the forecast is just as important as, or even more important than, the forecast itself:

One more thing that distinguishes forecasting from other supervised learning tasks is that your forecasts are almost always going to be wrong. Somebody working on an image classification problem or an NLP problem can reasonably expect to eventually classify all new incoming examples accurately, given enough training data; all they have to do is make sure the training data and the real-world data are sampled from the same distribution. Forecasting offers no such guarantee: what are the chances that you will predict exactly how many size M red Adidas shirts you will sell next week? So you always need not just a point forecast, but also a measure of the uncertainty of that forecast.

In demand forecasting and inventory applications, the uncertainty of your forecast is crucial for the applications that consume it. The uncertainty of your forecast (represented by forecast intervals or by forecast quantiles) is what you use to calculate your safety stock, that is, the additional inventory you carry to make sure you don't lose customers to stock-outs.
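A minimal sketch of how forecast uncertainty feeds into safety stock, using the common textbook formula under the assumption of normally distributed forecast errors (the numbers here are made up for illustration):

```python
from scipy.stats import norm

# Safety stock under normally distributed forecast errors: carry
# enough extra inventory to meet demand with the desired service
# level over the replenishment lead time.
service_level = 0.95        # target probability of not stocking out
sigma_error = 12.0          # std. dev. of forecast error per period
lead_time = 4               # periods between ordering and delivery

z = norm.ppf(service_level)  # ~1.645 for a 95% service level
safety_stock = z * sigma_error * lead_time ** 0.5
print(f"safety stock: {safety_stock:.1f} units")
```

The key input is `sigma_error`, which comes directly from the forecast's uncertainty estimate; a point forecast alone gives you nothing to plug in here.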

Why Deep Learning for forecasting?

As we have seen above, certain challenges restrict us from using traditional machine learning models and cause their predictions to fall short when we do use them. To overcome some of these bottlenecks, we turn to neural networks, which are capable of meeting these challenges.

Here are 5 reasons to add Deep Learning to your Time Series analysis:

1. Easy-to-extract features

Deep neural networks reduce the need for the feature engineering, data scaling, and stationarity transformations that traditional time series forecasting requires. These networks learn on their own: during training they extract features from the raw input data by themselves, which is exactly what time series forecasting demands. In a neural network, a sequence of observations can be treated as a one-dimensional image that the network distills into its most relevant elements. Neural networks are therefore widely used in time series forecasting.
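The "sequence as a one-dimensional image" idea can be illustrated with a plain 1-D convolution in NumPy: a filter slid along the series responds most strongly where its pattern occurs, which is what a convolutional layer learns to do automatically (the filter here is hand-crafted for illustration; a trained network would learn its weights from data):

```python
import numpy as np

# A series that is flat, then jumps: the kind of level shift a
# convolutional layer might learn to detect on its own.
series = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])

# A hand-crafted edge-detector filter; a CNN learns such weights
# from data instead of having them engineered by hand.
kernel = np.array([-1.0, 1.0])

# Slide the filter along the series: the response is largest in
# magnitude exactly at the level shift, and zero on flat stretches.
response = np.convolve(series, kernel, mode="valid")
print(response)
```

This is the essence of automatic feature extraction: instead of engineering a "level shift" feature by hand, the network discovers the relevant filter during training.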

2. Good at extracting patterns:

Time series forecasting is basically about finding patterns and extending them over long sequences. Recurrent Neural Networks (RNNs) are highly applicable to time series forecasting. Each neuron in an RNN is capable of maintaining information about previous inputs using its internal memory, which makes RNNs good with sequential data and hence with time series. An RNN has loops that allow information to be carried across neurons while reading in the input. These loops help it grasp the temporal dependence in the data: the network can identify which previous observations are important and how they are relevant to the current forecast, and it can dynamically change this context as needed.
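The loop described above can be sketched as a single recurrent cell in NumPy. The hidden state `h` is the cell's internal memory, carried forward as each observation is read in (the weights here are random for illustration; a real RNN learns them during training):

```python
import numpy as np

rng = np.random.default_rng(42)
hidden_size = 4

# Randomly initialised weights; in a trained RNN these are learned.
W_x = rng.normal(0, 0.5, (hidden_size, 1))            # input -> hidden
W_h = rng.normal(0, 0.5, (hidden_size, hidden_size))  # hidden -> hidden
b = np.zeros(hidden_size)

def rnn_forward(series):
    """Read a 1-D series one step at a time, carrying a hidden state."""
    h = np.zeros(hidden_size)  # the cell's internal memory
    for x_t in series:
        # The recurrence: the new state depends on the current input
        # AND on the previous state, so past observations influence
        # every later step.
        h = np.tanh(W_x[:, 0] * x_t + W_h @ h + b)
    return h

state = rnn_forward([0.1, 0.5, -0.2, 0.9])
print(state.shape)
```

The final `state` summarises the whole sequence; a forecasting network would feed it into an output layer to produce the next value.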

3. Easy to predict from training data:

The Long Short-Term Memory (LSTM) network is a neural network that can make predictions based on previously encountered data. LSTM is very popular in time series forecasting, apart from its applications in other domains. Deep learning models such as LSTMs and time delay neural networks can represent data at different points in time natively, something traditional models like random forests and gradient boosting regressors can only approximate through explicit feature engineering.

4. Support for multiple Inputs and Outputs:

Time series forecasting often requires dealing with multiple inputs and forecasting multiple time steps. Here again a neural network can be applied: it allows a fixed or variable number of inputs to the mapping function, and it supports multivariate inputs and thereby multivariate forecasting. Complex time series evaluation requires both multivariate and multi-step forecasting. Neural networks also support an arbitrary number of output values, which helps with producing multiple outputs in time series forecasting.
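A minimal sketch of multivariate, multi-step forecasting, using scikit-learn's small feed-forward MLPRegressor on synthetic data (not any particular forecasting library): the input is a window of three related variables, and the output is two future steps at once.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Synthetic multivariate series: 3 related variables over 200 steps.
n_steps, n_vars = 200, 3
series = np.cumsum(rng.normal(size=(n_steps, n_vars)), axis=0)

window, horizon = 5, 2  # 5 past steps in, 2 future steps out
X, y = [], []
for i in range(n_steps - window - horizon):
    X.append(series[i:i + window].ravel())                # multivariate input
    y.append(series[i + window:i + window + horizon, 0])  # multi-step output
X, y = np.array(X), np.array(y)

# A small feed-forward net: many inputs in, several outputs out.
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(X, y)
pred = model.predict(X[:1])
print(pred.shape)  # one sample, 2 forecast steps
```

The same windowing pattern carries over to RNN/LSTM models; only the network consuming the windows changes.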

5. No assumptions made on the data:

Neural networks make no assumptions about the data; the data can be of any type. Neural nets are designed to extract the most useful features for forecasting on their own, without waiting for the data to be prepared in any particular form. They are also universal approximators, which means deep learning can represent a wide variety of functions to produce predictions. Thanks to this universality, they can be used in time series analysis to capture trends, patterns, and seasonality with ease.

A few of the neural networks used for time series are:

1. Artificial Neural Networks (ANNs)

2. Time Lagged Neural Networks (TLNN)

3. Seasonal Artificial Neural Networks (SANN)

4. Recurrent Neural Networks (RNN (LSTM))

5. Convolutional neural networks (CNN)

For these topics I have attached links below for reference and understanding. I'll be creating a separate blog post for each of these, in detail and in a simplified way.

Thank you for reading this blog; you have made it to the end!


1. 3 facts about time series forecasting that surprise experienced machine learning practitioners

2. Improve Your Model Performance using Cross Validation

3. Universal Approximation Theorem

4. An Introductory Study on Time Series Modelling and Forecasting

5. A Guide For Time Series Prediction Using Recurrent Neural Networks

6. How to Use Convolutional Neural Networks for Time Series Classification

7. Neural Network Models for Time Series Forecasts