Portfolio Optimization with Deep Reinforcement Learning

Portfolio optimization, the process of assigning optimal weights to the assets in a financial portfolio, is a fundamental problem in financial engineering. There are many approaches one can follow: passive investors typically use market-capitalization weighting; an investor with no view on investment performance can use equal weighting; and if one works within the Capital Asset Pricing Model (CAPM), the most elegant solution is the Markowitz optimal portfolio.
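As a minimal sketch of the two simplest schemes mentioned above, equal weighting and market-capitalization weighting can each be written in a few lines of Python. The function names and the example capitalizations are illustrative assumptions, not taken from any library or from the paper.

```python
def equal_weights(n_assets):
    """Give every asset the same share of the portfolio."""
    if n_assets <= 0:
        raise ValueError("n_assets must be positive")
    return [1.0 / n_assets] * n_assets


def market_cap_weights(market_caps):
    """Weight each asset in proportion to its market capitalization."""
    total = sum(market_caps)
    if total <= 0:
        raise ValueError("total market capitalization must be positive")
    return [cap / total for cap in market_caps]


# Hypothetical capitalizations (arbitrary units) for three assets.
caps = [100.0, 100.0, 200.0]
print(equal_weights(len(caps)))
print(market_cap_weights(caps))
```

Both functions return weights that sum to one; neither uses any forecast of returns, which is what separates them from the learned approach discussed next.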

There is no single solution to this problem. The best results go to an agent that can learn from and adapt to the market environment, which is the essence of any Reinforcement Learning (RL) problem. Reinforcement Learning has delivered excellent results on problems with a similar premise, such as video games and board games, where trained agents have far outperformed humans.

Problem Framework

We used the Reinforcement Learning framework designed specifically for portfolio management, proposed by Z. Jiang, D. Xu, and J. Liang in "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem".

In the proposed framework, a neural network is trained to inspect the history of each asset, together with the previous portfolio weights, and to evaluate the asset's potential growth in the immediate future. The evaluation score of each asset is discounted by the size of the intended change in that asset's portfolio weight and is then passed to a softmax layer, whose output gives the new portfolio weights for the coming trading period. The reward function of the RL framework is the explicit average of the periodic logarithmic returns.
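The last two steps, turning scores into weights and computing the reward, can be sketched in plain Python. This is a simplified illustration and not the authors' implementation: it assumes the per-asset scores have already been produced by the network, it ignores transaction costs, and the names `softmax` and `average_log_return` are chosen here for illustration. A "price relative" is assumed to mean the ratio of an asset's price at the end of a period to its price at the start.

```python
import math


def softmax(scores):
    """Map arbitrary real scores to positive weights that sum to one."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_log_return(weights_per_period, price_relatives_per_period):
    """Average of the per-period log returns of the portfolio.

    Each period's portfolio growth factor is the weighted sum of the
    assets' price relatives (transaction costs are ignored here).
    """
    log_returns = []
    for weights, relatives in zip(weights_per_period, price_relatives_per_period):
        growth = sum(w * y for w, y in zip(weights, relatives))
        log_returns.append(math.log(growth))
    return sum(log_returns) / len(log_returns)


# Hypothetical scores for three assets and one period of price relatives.
weights = softmax([0.2, 1.5, -0.3])
reward = average_log_return([weights], [[1.01, 0.99, 1.00]])
print(weights, reward)
```

Because the softmax output is always positive and sums to one, the network can only express long-only, fully invested portfolios under this sketch.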

Three types of network are tested in this work: a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM) network.

The model can be trained on any set of assets; here we test it on the cryptocurrency exchange market. The rebalancing period is 30 minutes, and the 11 most liquid coins are selected for each period.
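The selection step can be sketched as a simple ranking. The text above does not say how liquidity is measured, so this example assumes recent trading volume as a proxy; the function name, tickers, and volume figures are all hypothetical.

```python
def select_most_liquid(volume_by_coin, n_coins=11):
    """Return the n_coins tickers with the highest trading volume.

    volume_by_coin maps a ticker to its recent trading volume, used here
    as an assumed stand-in for liquidity.
    """
    ranked = sorted(volume_by_coin, key=volume_by_coin.get, reverse=True)
    return ranked[:n_coins]


# Hypothetical volumes for four coins; keep the top two.
volumes = {"AAA": 120.0, "BBB": 950.0, "CCC": 430.0, "DDD": 15.0}
print(select_most_liquid(volumes, n_coins=2))
```

In a real pipeline the ranking would be recomputed for each training or test period, so that the asset universe reflects the liquidity known at that time.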


We compare the performance of the RL framework with the following frameworks (also detailed in the paper by Z. Jiang et al.):

For the period 2015/07/01 to 2017/07/01, the results in the test period (last 3 months) are:

For the period 2016/07/01 to 2018/07/01, the results in the test period (last 3 months) are:


The deep reinforcement learning framework performed far better than every other optimization framework in the 2017 test period, but it was inferior to a few frameworks in the 2018 test period.

It is clear that the returns of any optimization framework depend heavily on the market environment. Because the RL framework we used also tries to limit transaction costs (that is, turnover), it may have underperformed in 2018 relative to the few frameworks that carry no such constraint.
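To make the turnover constraint concrete, here is a minimal sketch of how rebalancing cost grows with the size of the weight change. It assumes a simple proportional cost model, which may differ from the one used in the paper, and the cost rate and function names are hypothetical.

```python
def turnover(old_weights, new_weights):
    """Sum of absolute weight changes across all assets."""
    return sum(abs(new - old) for old, new in zip(old_weights, new_weights))


def transaction_cost(old_weights, new_weights, cost_rate):
    """Fraction of portfolio value lost to fees under a proportional model."""
    return cost_rate * turnover(old_weights, new_weights)


# Hypothetical rebalance from an even split to a single asset.
old = [0.5, 0.5]
new = [1.0, 0.0]
print(turnover(old, new))
print(transaction_cost(old, new, cost_rate=0.0025))
```

With a 30-minute rebalancing period, even a small per-trade cost compounds quickly, so a framework that trades freely can look better in backtests that ignore this term.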

Source: Deep Learning on Medium