Introduction to Algorithmic Trading with Quantopian
Note: Quantopian has shut down its trading platform. You can learn more about Python for Finance here.
In this Python for Finance tutorial series, we have discussed key concepts including:
- Introduction to Python for Finance
- Python for Data Visualization
- Python for Time Series Analysis
- Portfolio Optimization
In this guide, we'll put these skills together and use the Quantopian trading platform to develop trading algorithms and backtest to see how they perform on historical data.
The following is based on notes from this course on Python for Financial Analysis & Algorithmic Trading and is organized as follows:
- Introduction to Quantopian
- Quantopian Research
- Basics of Quantopian Algorithms
- Pairs Trading Algorithm
- Quantopian Pipelines
1. Introduction to Quantopian
The basic idea of Quantopian is to let anyone who knows how to code in Python write their own trading algorithm:
Quantopian provides free education, data, and tools so anyone can pursue quantitative finance. Select members license their algorithms and share in the profits.
We can write these trading algorithms in Python in their interactive development environment (IDE), we can clone algorithms that others have shared in the community, and we can backtest them against historical data.
In short, with Quantopian you can:
- Research: Learn how to test ideas with advanced data science tools.
- Compete: Enter the contest to evaluate your strategy and earn prizes.
- Get Funded: License your algorithm and share in the profits.
For the research phase we can use their notebooks hosted on Quantopian.
We can also use the IDE, for example starting from the Cross-sectional Equity Template.
The Quantopian GitHub also has many open-source libraries for quantitative finance.
Let's start off by using the Research Notebook format, and then move on to using the Quantopian IDE.
2. Quantopian Research
First, we'll start by opening up a new notebook on Quantopian.
The Research Notebook allows us to gather information about securities within the Quantopian platform.
Instead of focusing on backtesting trading strategies (which is what the IDE is for), the Research Notebook is where we get information like prices, fundamentals, and so on.
Let's look at a few important functions that already come installed on Quantopian:
get_pricing()
- The get_pricing function provides access to up to 12 years of US equity pricing data
- The get_pricing function returns a pandas object
Let's get daily pricing data for TSLA:
tsla = get_pricing('TSLA')
symbols()
- By default, the symbols() function returns a security object for a ticker symbol
- You specify a ticker symbol (or list of tickers) and get a list of security objects back
If we use tsla_info = symbols('TSLA'), we see that we get back <type 'zipline.assets._assets.Equity'> as the data type.
This object behaves similarly to a dictionary, so we can turn it into one with tsla_info.to_dict().
We won't use ticker symbols that often because they can change over time. Instead we use the sid, or security ID, which Quantopian developed.
3. Basics of Quantopian Algorithms
Let's start discussing how to create trading algorithms on Quantopian.
There are a few key functions we need to learn:
initialize()
handle_data()
before_trading_start()
We'll then be able to use these to create our own algorithms.
To demonstrate these functions we will use the IDE instead of a Notebook, so let's create a new algorithm and delete all of the template code provided.
To demonstrate this we'll use a portfolio of tech stocks: FB, AMZN, AAPL, and TSLA.
The algorithm we'll develop will buy these stocks in the simplest way possible with equal distribution for each.
Let's get started:
- The initialize() function takes in context and is called only once when our algorithm starts.
- The context object is like an augmented Python dictionary that's used for maintaining the state of the algorithm, either during a backtest or live trading
- Properties can be assigned and accessed using dot notation
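To make the "augmented dictionary" idea concrete, here is a minimal stand-in written in plain Python. The Context class below is hypothetical and is not Quantopian's implementation; it only shows what storing state on a dictionary through dot notation looks like.

```python
class Context(dict):
    """Hypothetical stand-in: a dict whose keys can also be used as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


context = Context()
context.fb = 42950            # assign with dot notation (the FB sid number used in this article)
context.target_weight = 0.25  # any other state the algorithm wants to remember
```

Reading context.fb and context["fb"] returns the same value, which is why the object can be treated either as a dictionary or as a simple state holder.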
Inside initialize() we can look up a stock's sid and assign it to context, for example context.fb = sid(42950).
Let's now use the handle_data() function, which takes in context and data.
This function is called once at the end of each minute.
Let's use it to learn about a built-in function, order_target_percent(), which places an order to adjust a position to a target percent of the current portfolio value.
This function takes in (sid, target_percent, style). Since we hold four stocks in equal proportion, the target_percent for each at the end of each minute will be 25%.
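The arithmetic behind a target-percent order can be sketched in a few lines. The helper shares_to_order below is hypothetical; it ignores commissions and slippage, and says nothing about how Quantopian actually rounds or fills orders.

```python
def shares_to_order(portfolio_value, price, current_shares, target_percent):
    """Hypothetical helper: whole shares to buy (+) or sell (-) to reach the target."""
    target_value = portfolio_value * target_percent
    target_shares = int(target_value / price)  # truncate to whole shares
    return target_shares - current_shares


# $100,000 portfolio, stock priced at $200, nothing held yet, 25% target
buy = shares_to_order(100_000, 200.0, 0, 0.25)

# Same target when 150 shares are already held: the position must shrink
adjust = shares_to_order(100_000, 200.0, 150, 0.25)
```

Because the order is expressed as a target rather than a quantity, calling it every minute only trades the difference between the current position and the 25% target.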
Let's run a quick one-month backtest, since this is obviously not going to generate anything interesting.
Now that we have the basics, let's look at checking historical data and scheduling functions in the IDE.
data.history()
- The data.history() method lets you request historical information on equities. The data returned is adjusted for splits, mergers, and dividends as of the current date.
- You pass in a single asset or a list of assets, the fields you want, the bar_count, and the frequency.
As an example, let's get the last 10 minutes of price data for FB:
# get 10m price data for FB
def handle_data(context, data):
    price_history = data.history(context.fb, fields='price', bar_count=10, frequency='1m')
schedule_function()
- Lets you schedule functions to do things like opening and closing positions
- It takes in the func you're scheduling, the date_rule, and the time_rule
Let's schedule a function to first open a position in FB worth 20% of the portfolio at the start of each month, at the market open.
Then we will close the position at the end of each month, at the close:
# initialize
def initialize(context):
    context.fb = sid(42950)
    context.amzn = sid(16841)
    context.aapl = sid(24)
    context.tsla = sid(39840)
    # schedule open_position, start of each month, at the open
    schedule_function(open_position, date_rules.month_start(), time_rules.market_open())
    # schedule close_position, end of each month, at the close
    schedule_function(close_position, date_rules.month_end(), time_rules.market_close())

# open 20% position in FB
def open_position(context, data):
    order_target_percent(context.fb, 0.2)

# close FB position to 0
def close_position(context, data):
    order_target_percent(context.fb, 0)
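To see which dates a month-start and month-end schedule would pick, the idea can be reproduced outside the platform with pandas. This is only an approximation: business days stand in for trading days, so market holidays are ignored, and the variable names are made up for this sketch.

```python
import pandas as pd

# Business days stand in for trading days; market holidays are ignored in this sketch.
days = pd.bdate_range("2019-01-01", "2019-03-31")
calendar = pd.Series(days, index=days)

# Group the sessions by calendar month and take the first and last of each.
by_month = calendar.groupby(days.to_period("M"))
open_days = by_month.min()    # first session of each month, where open_position would run
close_days = by_month.max()   # last session of each month, where close_position would run
```

Each month yields exactly one opening date and one closing date, so the FB position would be held for roughly the length of each calendar month.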
4. Pairs Trading Algorithm
Now that we understand a few of the basic functions, let's build a trading algorithm. In particular, we're going to quickly look at a pairs trading algorithm, which Quantopian provides as an example.
Pairs trading is a form of mean reversion that has a distinct advantage of being hedged against directional market movements.
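One common way to express the mean-reversion idea is a rolling z-score of the price spread between the two assets. The sketch below is a generic illustration with made-up prices; the function names and the entry threshold are assumptions, and it is not the code of Quantopian's example algorithm.

```python
import pandas as pd


def spread_zscore(price_a, price_b, window):
    """Rolling z-score of the price spread between two assets."""
    spread = price_a - price_b
    rolling_mean = spread.rolling(window).mean()
    rolling_std = spread.rolling(window).std()
    return (spread - rolling_mean) / rolling_std


def pair_signal(zscore, entry=1.0):
    """Map one z-score value to a position in the spread."""
    if zscore > entry:
        return "short_spread"  # spread unusually wide: short A, long B
    if zscore < -entry:
        return "long_spread"   # spread unusually narrow: long A, short B
    return "flat"


# Made-up prices: A jumps on the last day while B stays put.
price_a = pd.Series([10.0, 10.0, 10.0, 10.0, 13.0])
price_b = pd.Series([9.0, 9.0, 9.0, 9.0, 9.0])

z = spread_zscore(price_a, price_b, window=5)
signal = pair_signal(z.iloc[-1])
```

Because every position is long one asset and short the other, a move that lifts or drops both prices together leaves the spread, and therefore the signal, largely unchanged.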
The pairs strategy trades ABGB and FSLR. We won't go into the code here, but we get the following results after running it for the year 2014:
5. Quantopian Pipelines
Quantopian Pipelines are useful for algorithms that follow a set structure.
Let's review that structure. A Pipeline-based algorithm typically follows four key steps:
- Compute some scalar value for all assets, for example the 30 day rolling mean
- Select a smaller group of tradable assets by filtering assets based off that scalar value
- Set the desired portfolio weights of filtered assets
- Place orders on assets to reflect desired portfolio weights
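The four steps above can be sketched on a tiny, made-up price table with pandas. This is a platform-independent illustration rather than the Pipeline API: the tickers, the 3-day window, the selection rule, and the $10,000 portfolio are all assumptions chosen to keep the example small.

```python
import pandas as pd

# Made-up closing prices for three hypothetical tickers (rows are consecutive days).
prices = pd.DataFrame({
    "AAA": [10.0, 11.0, 12.0, 13.0],
    "BBB": [20.0, 19.0, 18.0, 17.0],
    "CCC": [5.0, 5.0, 6.0, 7.0],
})

# Step 1: compute a scalar per asset (a 3-day rolling mean instead of 30 days).
rolling_mean = prices.rolling(window=3).mean().iloc[-1]

# Step 2: keep only the assets whose latest price is above that rolling mean.
latest = prices.iloc[-1]
selected = latest[latest > rolling_mean].index

# Step 3: give each selected asset an equal portfolio weight.
weights = pd.Series(1.0 / len(selected), index=selected)

# Step 4: turn the weights into dollar targets that orders would aim for.
portfolio_value = 10_000.0
target_dollars = weights * portfolio_value
```

With three assets this is trivial, but running the same steps over thousands of equities on every day of a backtest is where the technical challenges come from.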
There are several technical challenges with following all of these steps on a large group of assets, for example performing computations on all U.S. equities.
Quantopian introduced the Pipeline system to solve these challenges by providing a uniform API.
Let's start with a few key functions in our Pipeline.
Classifiers & Factors
A classifier is a function that transforms the input of an asset and a timestamp to a categorical output.
For example, if we pass in FB we could get back that it is in the technology sector.
A factor is similar, except it returns a numerical value—for example, we could get back a 20 day moving average.
Let's create a Pipeline that returns all equities available at a timestamp. We're then going to use Quantopian's USEquityPricing dataset to match prices to the SIDs.
First we're going to import the Pipeline object:
from quantopian.pipeline import Pipeline
We're then going to define a function called make_pipeline() that returns an instantiation of Pipeline.
def make_pipeline():
    return Pipeline()
Next we're going to create a pipeline object:
pipe = make_pipeline()
Next we're going to import run_pipeline, which allows us to run the pipeline:
from quantopian.research import run_pipeline
We're then going to set result equal to run_pipeline(), which takes in pipeline, start_date, and end_date. We're just going to get data for the first trading day in 2019:
result = run_pipeline(pipe, '2019-01-01', '2019-01-01')
This returns a DataFrame.
If we use result.info(), we see it has a MultiIndex with 8696 entries and that it is an empty DataFrame (it has no columns yet). What this represents is every SID at that moment in time.
To actually get the values we need to use factors.
In order to get the price information for all the equities, we first need to import USEquityPricing:
from quantopian.pipeline.data.builtin import USEquityPricing
Now we're going to create a factor that can take in the SID and a timestamp and calculate a numerical value.
Let's look at the SimpleMovingAverage factor:
from quantopian.pipeline.factors import SimpleMovingAverage
We're now going to use this with our make_pipeline() function to get the 30 day SMA.
We also pass columns to the returned Pipeline, which takes a dictionary with a key for the new column name and the factor itself as the value:
def make_pipeline():
    return Pipeline(columns={'30 Day Mean Close': SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30)})
We're then going to use run_pipeline() and store the output in results:
results = run_pipeline(make_pipeline(), '2019-01-01', '2019-01-01')
To summarize, what we're doing is calculating the 30 day SMA for every US equity, and the SMA is a factor with Quantopian Pipelines.
If we want, we can also pass multiple factors into our Pipeline.
Filters and Screens
Let's continue this Quantopian Pipeline tutorial by discussing Filters and Screens.
Filters take in an asset and a timestamp and return a Boolean.
Screens allow you to execute those filters in your pipeline.
To understand filters, let's code them out with our make_pipeline() function.
In this function we have a few factors such as:
- The 10 day SMA
- The 30 day SMA
- The latest close
- The percent difference between 10 and 30 day SMA
def make_pipeline():
    mean_close_10 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=10)
    mean_close_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30)
    latest_close = USEquityPricing.close.latest
    perc_diff = (mean_close_10 - mean_close_30) / mean_close_30
    return Pipeline(columns={
        'Percent Difference': perc_diff,
        '30 Day Mean Close': mean_close_30,
        'Latest Close': latest_close,
    })

results = run_pipeline(make_pipeline(), '2018-01-01', '2019-01-01')
Let's create a filter from this. A filter is essentially a comparison operation against one of the factors we created.
For our filter, let's flag the rows where the Percent Difference is negative with perc_filter = perc_diff < 0:
def make_pipeline():
We can see this filter returns a boolean.
Let's move on to screens. These allow you to actually execute the filter instead of just creating a new column.
To do this we're going to add an argument inside our Pipeline called screen:
def make_pipeline():
Now we only get back the rows where the filter was True.
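The difference between a filter kept as a column and a filter applied as a screen can be shown with a small pandas table. The tickers and factor values below are made up, and the column name "Percent Filter" is an assumption for this sketch. It mirrors the idea rather than calling the Pipeline API.

```python
import pandas as pd

# Made-up factor values for four hypothetical tickers.
results = pd.DataFrame(
    {
        "10 Day Mean Close": [10.5, 19.0, 7.2, 50.0],
        "30 Day Mean Close": [10.0, 20.0, 8.0, 50.0],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)
results["Percent Difference"] = (
    results["10 Day Mean Close"] - results["30 Day Mean Close"]
) / results["30 Day Mean Close"]

# Filter: one boolean per asset, stored as just another column.
results["Percent Filter"] = results["Percent Difference"] < 0

# Screen: actually drop the rows where the filter is False.
screened = results[results["Percent Filter"]]
```

The unscreened table still has all four rows with True or False in the filter column, while the screened table keeps only the two assets whose 10 day mean is below their 30 day mean.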
Masking and Classifiers
Masking allows you to tell the pipeline to ignore assets altogether, before the factors or filters are even computed.
We can pass the mask parameter to both factors and filters.
Classifiers take in an asset and a timestamp and return a categorical value, such as a sector or an exchange.
Let's look at our make_pipeline() function from earlier and use a filter as a mask for the 10 and 30 day SMAs.
To do this, we first calculate the latest_close price, then define small_price as a filter that is True where latest_close is less than 5.
We then pass the parameter mask=small_price to our factors.
What this does is apply the filter first, before calculating the SMAs, which saves a lot of computation:
def make_pipeline():
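The effect of masking can be illustrated outside the platform with pandas: the boolean filter is evaluated first, and the moving average is then computed only for the assets that pass. The prices, tickers, and 3-day window are made up for this sketch, which does not use the Pipeline API.

```python
import pandas as pd

# Made-up closing prices for three hypothetical tickers.
prices = pd.DataFrame({
    "AAA": [3.0, 4.0, 4.5, 4.0],
    "BBB": [50.0, 52.0, 51.0, 53.0],
    "CCC": [2.0, 2.0, 3.0, 4.0],
})

latest_close = prices.iloc[-1]
small_price = latest_close < 5  # boolean filter, one value per asset

# Apply the mask first, so the rolling mean is never computed for BBB.
masked_prices = prices.loc[:, small_price]
mean_close_3 = masked_prices.rolling(window=3).mean().iloc[-1]
```

Only AAA and CCC reach the rolling-mean step here, which is where the savings come from when the universe is every US equity.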
Let's now look at classifiers. We first need a few imports:
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.classifiers.morningstar import Sector
We'll then set morningstar_sector to an instance of Sector:
morningstar_sector = Sector()
Let's now get the exchange_id from Morningstar:
exchange = morningstar.share_class_reference.exchange_id.latest
Let's now build a filter from this classifier that checks whether an asset is listed on the NYSE:
nyse_filter = exchange.eq('NYS')
We're then going to pass this into our make_pipeline() function:
def make_pipeline():
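Turning a classifier into a filter is just an equality test on a categorical value. A pandas sketch with made-up data shows the idea: the tickers and prices are hypothetical, 'NYS' follows the code used above, and the other exchange codes are placeholders.

```python
import pandas as pd

# Made-up exchange codes for four hypothetical tickers.
exchange = pd.Series(
    ["NYS", "NAS", "NYS", "ASE"],
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Classifier to filter: compare the categorical value to get a boolean per asset.
nyse_filter = exchange.eq("NYS")

# The boolean result can then select rows of any other per-asset values.
latest_close = pd.Series([12.0, 30.0, 4.5, 8.0], index=exchange.index)
nyse_closes = latest_close[nyse_filter]
```

The result of the comparison is a boolean per asset, so it can be used anywhere a filter is accepted.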
Now that we have an understanding of Quantopian Pipelines, in the next article we'll start building our own trading algorithms.
Summary: Algorithmic Trading with Quantopian
In this article, we saw how to use the Quantopian trading platform to develop a trading algorithm and backtest it to see how it performs on historical data.
We put the skills together that we learnt from our earlier Python for Finance articles and used them for Quantopian Research, Quantopian Algorithms, a Pairs Trading Algorithm, and Quantopian Pipelines.