Adaptive Normalization and Fuzzy Targets — Time Series Forecasting Tricks

Time series prediction sure isn’t easy, especially when you have nonlinear, potentially chaotic dynamics going on — and in the case of economic or financial time series, it gets even harder, because you have humans behind the whole thing.

There have been two big problems bugging me as I’ve worked on modeling time series this past year. The first is dealing with a nonstationary time series like the S&P500: there are oscillations and patterns on many time scales, but with a lot of amplitude and frequency modulation in them. In other words, there are patterns, but they aren’t super regular, and even if they were for a few years, there is no guarantee that whatever dynamics hold now will still be applicable a few years from now.

I decided to tackle the problem by assuming that the S&P500 is locally stationary and that the immediate past has more bearing on the future than the distant past. Of course, “locally stationary” raises the question of how local is local. I settled on normalizing the series with a 44-day window (2 months of trading): to normalize a data point, take the previous 44 days, find their mean and standard deviation, subtract that mean from the data point, and divide by that standard deviation.

The second problem is deciding what kind of forecasting to do. I decided on an autoregressive experiment — no exogenous variables, just one variable and its lags.

Here’s the setup: take 44 data points and predict something about the future. But what should we predict? Picking something specific, like 1-step or 4-step-ahead prediction, seemed somewhat pointless because the dynamics of the market can change, and what happened in four days’ time in the past might take more or less time in the future.

That’s why I decided to try fuzzy targets — not in any super technical or Bayesian sense, just in the sense of randomly picking a target value from a range. Take the 5 days after your 44 days of input and randomly pick one value to represent the target. Alternatively, you could take those 5 days and use their median or mean as the target. If you go with random selection, you could even make more than one pick per window — a basic form of data augmentation, not unlike the random flipping or cropping used in computer vision, or the Synthetic Minority Over-sampling Technique (SMOTE) used in supervised classification problems. The following is code for this crude adaptive normalization.
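A minimal sketch, assuming the raw closing prices sit in a pandas Series; the shift by one day, so that each point is scaled only by the 44 days before it, is my reading of the 88-day note further down:

```python
import pandas as pd

WINDOW = 44  # roughly 2 months of trading days

def adaptive_normalize(close: pd.Series, window: int = WINDOW) -> pd.Series:
    """Normalize each closing price by the mean and standard deviation
    of the `window` days immediately preceding it."""
    # shift(1) keeps the current day out of its own statistics (no lookahead);
    # the first `window` points lack a full history and drop out as NaN.
    past = close.shift(1).rolling(window)
    return ((close - past.mean()) / past.std()).dropna()
```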

Training Data: SPY (an ETF that tracks the S&P500) from 2009–2016, then adaptively normalized.

Then comes creating randomized, fuzzy targets: 44 days of closing prices followed by a random pick from the next 5 days. This process is repeated 3 times to create an augmented data set.
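Here is a sketch of that construction; the window length, 5-day range, and 3 repetitions come from the description above, while the function and variable names are mine:

```python
import numpy as np

def make_fuzzy_dataset(values: np.ndarray, window: int = 44,
                       horizon: int = 5, repeats: int = 3, seed: int = 0):
    """Pair each `window`-day input with a target drawn at random from the
    next `horizon` days, repeating the pass `repeats` times to augment."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for _ in range(repeats):
        for t in range(window, len(values) - horizon + 1):
            X.append(values[t - window:t])         # the 44 lagged values
            offset = rng.integers(1, horizon + 1)  # uniform pick from 1..5
            y.append(values[t + offset - 1])       # the fuzzy target
    return np.asarray(X), np.asarray(y)
```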

Randomly picking a target from between 1–5 days in the future.

The model: Ridge Regression from sklearn — with alpha (regularization) at 0.01.
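Fitting it is a couple of lines with scikit-learn; fitting on the full augmented training set is my simplification, since the post doesn’t spell out the train/validation split:

```python
from sklearn.linear_model import Ridge

# spy_norm: the adaptively normalized 2009–2016 SPY closes from above
X_train, y_train = make_fuzzy_dataset(spy_norm.to_numpy())
model = Ridge(alpha=0.01)  # light L2 regularization, per the post
model.fit(X_train, y_train)
```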

The Testing Data — 1/1/2017 through 1/1/2019. I purposefully put a one-year gap between the end of the training/validation data and the beginning of the test data. The testing data was normalized using the method outlined above. (Remember, if you want to create a test input of 44 days, you actually need 88 days’ worth of data, because you need the first 44 to normalize the first actual data point.) After doing this, we use the model to predict on the testing data.
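Put together, the test step might look like this; `spy_test` (raw closes for the test period plus the 44 warm-up days) and the use of sklearn’s r2_score are my assumptions:

```python
from sklearn.metrics import r2_score

# spy_test: raw SPY closes for 2017–2018, plus the 44 preceding trading days
test_norm = adaptive_normalize(spy_test)  # the warm-up days drop out as NaN
X_test, y_test = make_fuzzy_dataset(test_norm.to_numpy(), repeats=1)

preds = model.predict(X_test)
print(r2_score(y_test, preds))
```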

Plotting the results, you can see the predictions from the model in blue and the ground truth in light orange. Keep in mind that the ground truth is also fuzzy — one point randomly picked from 5 consecutive closing prices. Still, the R² score of ~0.723 seems good, maybe too good. Hopefully, those of you into time series forecasting and regression can replicate this or experiment with it on some other data sets to see if it’s viable.

Of course, there is also one last step — the inverse transform. Take your prediction, multiply it by the standard deviation of the last 44 days, and add the mean of the last 44 days.
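As a sketch (the helper name is mine, and the statistics should come from the raw closes immediately preceding the prediction point):

```python
import pandas as pd

def inverse_transform(pred: float, recent_closes: pd.Series,
                      window: int = 44) -> float:
    """Map a normalized prediction back to price space using the mean and
    standard deviation of the last `window` raw closing prices."""
    last = recent_closes.iloc[-window:]
    return pred * last.std() + last.mean()
```

Hopefully, this all works, and…if you get filthy rich with this scheme, let me know!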

PS — I’m looking for a machine learning or data science job. I live in NYC. Please message me if you have any leads. I would love to keep working on time series or sequences (like in NLP). Thanks for reading!