Deep Reinforcement Learning for Trading: Project Overview

In this project we're going to build a deep reinforcement learning trading agent and deploy it in a simulated trading account at Interactive Brokers.

2 years ago   •   5 min read

By Peter Foy

While there are several interesting tutorials on deep reinforcement learning for trading, we haven't seen many that include broker integration.

That said, rather than building the ultimate deep RL trading agent at this point, the project focuses on the entire workflow of building, testing, and then deploying a trading algorithm at a broker.

The reason to prioritize rapid implementation and leave model tuning for later is that, as you may know, backtesting has a myriad of issues of its own when it comes to evaluating trading performance.

As discussed in this article on Mistakes Quants Make That Cause Backtests to Lie, a few of the common issues include:

  • In-sample backtesting
  • Using survivor-biased data
  • Overfitting the model
  • Data mining fallacy

Keep in mind that there's very little chance that the first implementation will result in a profitable trading algorithm, but as mentioned, it will give us a framework that we can continue to improve on with things like:

  • Model hyperparameter tuning
  • Trading logic tuning
  • Additional feature engineering
  • Additional model development

At the end of the project, we'll expand on each of these potential improvements in case you want to continue development on your own.

The steps for this project are as follows:

  1. Import the necessary packages, define our model and trading inputs, and write helper functions for the algorithm
  2. Build a Long Short-Term Memory (LSTM) model for price prediction
  3. Build a Convolutional Neural Network (CNN) model that will return the probability of an up or down move in the next time period
  4. Build a reinforcement learning trading agent based on gradient ascent to maximize our Sharpe ratio
  5. Write our trading logic, deploy it at Interactive Brokers, and discuss potential model improvements

We'll cover each of these concepts in a bit more detail in subsequent guides, but for now here is an overview of each step and model that we'll be building.

1. Imports, Inputs, & Helper Functions

A few noteworthy packages that we'll be using include:

  • Scikit-learn for data preprocessing & model evaluation
  • Keras for our LSTM & CNN deep learning models
  • IB-insync package for working with the Interactive Brokers TWS API

We will then define the inputs of our models and trading logic, including:

  • Stock symbols
  • DataFrame timeframe
  • LSTM architecture inputs
  • CNN logic
  • IB timeframe loop
  • Daily portfolio stop loss threshold
  • Percent allocation
  • Stop loss and take profit thresholds

Next we will create the following helper functions for the trading algorithm:

  • Shuffle NumPy arrays of different shapes
  • Split training & testing data
  • Remove NaN values from the data
  • Pull data from the Yahoo Finance API

Finally, we will create functions that allow us to initialize and run the model.
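As a rough illustration of the first three helpers listed above, here is a minimal sketch. The function names are hypothetical, and it uses plain Python lists so that it runs without dependencies; with NumPy, the same shared permutation would simply index each array. It also assumes the train/test split is chronological, which is a common choice for time series but is not something this overview specifies.

```python
import math
import random


def shuffle_in_unison(*sequences, seed=None):
    """Shuffle several equal-length sequences with one shared permutation,
    so that row i of every sequence still refers to the same sample."""
    n = len(sequences[0])
    if any(len(s) != n for s in sequences):
        raise ValueError("all sequences must have the same length")
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [[s[i] for i in order] for s in sequences]


def split_train_test(rows, test_fraction=0.2):
    """Chronological split (assumption): the last test_fraction of rows
    becomes the test set, everything before it the training set."""
    cut = len(rows) - int(len(rows) * test_fraction)
    return rows[:cut], rows[cut:]


def drop_nan_rows(rows):
    """Keep only the rows that contain no NaN values."""
    return [
        row for row in rows
        if not any(isinstance(v, float) and math.isnan(v) for v in row)
    ]
```

Sharing one permutation across the feature and label sequences is what keeps each sample paired with its label after shuffling.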

2. Building an LSTM Model for Price Prediction

We'll discuss LSTMs in a bit more detail later in the project, but here's a quick overview:

A recurrent neural network (RNN) attempts to model time-based or sequence-based data. An LSTM network is a type of RNN that uses "special" units as well as standard units. These special units include a memory cell that can store information for longer periods of time.

In the context of trading, our LSTM model will be used for price prediction over the look-forward period; in our case, we'll be predicting the price 1 day in the future.

In particular, our LSTM takes into account the following 8 inputs to predict 1 output:

  • Open
  • High
  • Low
  • Close
  • Volume
  • 20 Day SMA
  • 50 Day SMA
  • 200 Day SMA

As mentioned, further model development will involve additional feature engineering to test how it affects performance.
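To make the 8 inputs concrete, here is a minimal sketch of how the feature rows and the supervised training windows could be assembled. The function names, the dictionary keys for each bar, and the `lookback` parameter are hypothetical, and the sketch assumes the moving averages are computed on the closing price, which this overview does not state. It uses plain Python so it runs as-is; in practice a pandas rolling mean would do the same job.

```python
import math


def simple_moving_average(values, window):
    """Rolling mean; positions without a full window are NaN."""
    if window <= 0:
        raise ValueError("window must be positive")
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else math.nan)
    return out


def build_feature_rows(bars, windows=(20, 50, 200)):
    """One row per bar: open, high, low, close, volume, then one SMA per window.
    SMAs are taken over the close (assumption)."""
    closes = [b["close"] for b in bars]
    smas = [simple_moving_average(closes, w) for w in windows]
    return [
        [b["open"], b["high"], b["low"], b["close"], b["volume"]]
        + [s[i] for s in smas]
        for i, b in enumerate(bars)
    ]


def make_windows(feature_rows, closes, lookback, horizon=1):
    """Pair each block of `lookback` consecutive rows with the close
    `horizon` bars after the block ends (horizon=1 is the next day)."""
    X, y = [], []
    for i in range(lookback, len(feature_rows) - horizon + 1):
        X.append(feature_rows[i - lookback:i])
        y.append(closes[i + horizon - 1])
    return X, y
```

Note that the 200-day SMA leaves the first 199 rows with NaN values, which is one reason a NaN-removal helper is needed before training.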

3. Building a CNN Model for the Probability of a Price Increase or Decrease

A convolutional neural network (CNN) is a class of deep neural network that's commonly applied to analyzing visual images.

In the context of trading, we'll use the CNN model to output a probability prediction, which will be another one of the trading conditions to enter a long or short trade.

In particular, the output of the CNN model will be a probability in percentage terms that the stock will go up or down in the next candlestick.

In the broker integration step, we will be able to set our own probability threshold for entering a trade and test the model on real-time prices.

Specifically, when we deploy this model in a simulated IB trading account, we will make use of the CNN increase and decrease thresholds that we set up in the inputs section, both of which are set at 50%.
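The threshold check itself is simple, and a minimal sketch may help. The function name `cnn_signal` is hypothetical, and the sketch assumes the CNN produces a single probability of an up move, with the probability of a down move taken as its complement; the actual model output may be shaped differently.

```python
def cnn_signal(prob_up, up_threshold=0.5, down_threshold=0.5):
    """Map the CNN's up-move probability to a candidate trade direction.

    Returns "long", "short", or None when neither threshold is exceeded.
    """
    if not 0.0 <= prob_up <= 1.0:
        raise ValueError("prob_up must be a probability in [0, 1]")
    prob_down = 1.0 - prob_up  # assumption: binary up/down output
    if prob_up > up_threshold:
        return "long"
    if prob_down > down_threshold:
        return "short"
    return None
```

With both thresholds at 50%, any probability other than exactly 0.5 produces a signal. Raising the thresholds makes the condition stricter and leaves a neutral band in the middle where no trade is taken.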

4. Build a Reinforcement Learning Trading Agent

After we've built our LSTM and CNN models, we'll build a reinforcement learning agent whose goal will be to maximize our Sharpe ratio.

In this section, we'll look at an open source RL model built by Teddy Koker: Trading with Reinforcement Learning in Python Part II: Application.

We will make use of gradient ascent to maximize the Sharpe ratio over a set of training data, and attempt to create a strategy with a high Sharpe ratio when tested on out-of-sample data.

This function will generate a value between -1 and 1, which tells us what percentage of the portfolio should buy (positive values) or sell (negative values) the asset.

In the next step of deploying the algorithm at Interactive Brokers, another one of our trading conditions will be that this value is above 0 for a long trade and below 0 for a short trade.
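To illustrate the idea, here is a minimal sketch of Sharpe-ratio maximization by gradient ascent. It is not the referenced implementation: it assumes a common formulation in which the position is the tanh of a linear function of recent price changes and the previous position, and it swaps in a numerical (finite-difference) gradient for an analytic one to keep the code short. All function and parameter names are hypothetical.

```python
import math


def sharpe_ratio(returns):
    """Mean return divided by its (population) standard deviation."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    return mean / math.sqrt(var) if var > 0 else 0.0


def positions(theta, changes, lookback):
    """Position in [-1, 1] at each step. theta holds a bias, one weight per
    recent price change, and a weight on the previous position."""
    pos = [0.0] * len(changes)
    for t in range(lookback - 1, len(changes)):
        window = changes[t - lookback + 1:t + 1]
        prev = pos[t - 1] if t > 0 else 0.0
        z = theta[0] + sum(w * x for w, x in zip(theta[1:-1], window))
        pos[t] = math.tanh(z + theta[-1] * prev)
    return pos


def objective(theta, changes, lookback, cost=0.0):
    """Sharpe ratio of holding yesterday's position through today's change,
    minus a proportional cost for changing the position."""
    pos = positions(theta, changes, lookback)
    returns = [
        pos[t - 1] * changes[t] - cost * abs(pos[t] - pos[t - 1])
        for t in range(lookback, len(changes))
    ]
    return sharpe_ratio(returns)


def train(changes, lookback=3, epochs=100, learning_rate=0.05, eps=1e-5, cost=0.0):
    """Gradient ascent with central finite differences; returns the best
    parameters seen and their in-sample Sharpe ratio."""
    theta = [1.0] * (lookback + 2)
    best_theta, best_score = theta[:], objective(theta, changes, lookback, cost)
    for _ in range(epochs):
        grad = []
        for i in range(len(theta)):
            up, down = theta[:], theta[:]
            up[i] += eps
            down[i] -= eps
            diff = objective(up, changes, lookback, cost) - objective(down, changes, lookback, cost)
            grad.append(diff / (2 * eps))
        theta = [w + learning_rate * g for w, g in zip(theta, grad)]  # ascent step
        score = objective(theta, changes, lookback, cost)
        if score > best_score:
            best_theta, best_score = theta[:], score
    return best_theta, best_score
```

The score returned by `train` is an in-sample figure. The out-of-sample check described above would call `objective` with the trained parameters on held-out price changes.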

5. Combine the Models & Deploy at Interactive Brokers

Among others, the main steps we will need to take in this section include:

  • Enable API connections
  • Call available balance at IB
  • Calculate portfolio stop loss
  • Determine position size
  • Write long trading logic
  • Write short trading logic
  • Initialize & run strategy

Using this trading logic, we'll deploy the model at a simulated trading account. After that, we'll discuss several ways we can improve each model and the algorithm as a whole.
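As a rough sketch of how the three model outputs and the risk inputs might fit together, consider the following. Everything here is hypothetical: the function names, the rule that the LSTM condition means "predicted price above the last price" for a long trade, and the example numbers are assumptions for illustration only. The broker calls themselves (connecting, fetching the balance, placing orders) are left out.

```python
import math


def decide_trade(last_price, predicted_price, prob_up, rl_position,
                 up_threshold=0.5, down_threshold=0.5):
    """Return "long", "short", or "flat" when all three models agree or not.

    Assumed conditions: LSTM forecast direction, CNN probability above its
    threshold, and the sign of the RL agent's position value.
    """
    if predicted_price > last_price and prob_up > up_threshold and rl_position > 0:
        return "long"
    if predicted_price < last_price and (1.0 - prob_up) > down_threshold and rl_position < 0:
        return "short"
    return "flat"


def position_size(available_balance, percent_allocation, price):
    """Whole shares affordable with the allocated slice of the balance."""
    if price <= 0:
        raise ValueError("price must be positive")
    return math.floor(available_balance * percent_allocation / price)


def daily_stop_hit(start_of_day_equity, current_equity, stop_loss_pct):
    """True once the account has lost at least stop_loss_pct since the open."""
    return current_equity <= start_of_day_equity * (1.0 - stop_loss_pct)


# Example with made-up numbers: all three models point up, so the result is a long trade.
side = decide_trade(last_price=100.0, predicted_price=101.0, prob_up=0.6, rl_position=0.3)
shares = position_size(available_balance=10_000, percent_allocation=0.25, price=40.0)
```

Requiring all three conditions at once is a conservative design: it will trade less often than any single model would, which is one of the trading-logic choices that can be tuned later.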

Summary: Deep RL Trading Project

To summarize, this project is organized as follows:

  1. Define our imports, inputs, and helper functions
  2. Build a CNN-LSTM hybrid deep learning model
  3. Build a reinforcement learning agent to maximize our Sharpe ratio
  4. Combine these three models into a deep reinforcement learning agent
  5. Implement the RL agent at Interactive Brokers

Please note that this project is part of our premium content, which supports our team of developers, and includes the full Python script as well as two Google Colab notebooks.


There is one thing we would like readers to know: we are not here to claim that we have a production-ready trading algorithm. The idea of the project is to focus on the workflow of deep reinforcement learning for trading, including model development, testing, and broker integration, but it is still very much in the early stages.

That said, this project is in beta and almost certainly contains issues that will cause you to lose money. It should go without saying that it comes with no warranty, none of this is investment advice, past performance is not indicative of future results, and you should not deploy this with a live account. Please see our full Terms of Service for more information.
