Neural networks for algorithmic trading. Simple time series forecasting (2024)

Alex Honchar

7 min read

Jun 18, 2016

I want to implement trading system from scratch based only on deep learning approaches, so for any problem we have here (price prediction, trading strategy, risk management) we gonna use different variations of artificial neural networks (ANNs) and check how well they can handle this.

Now I plan to work on next sections:

Simple time series forecasting (and mistakes done)
Correct 1D time series forecasting + backtesting
Multivariate time series forecasting
Volatility forecasting and custom losses
Multitask and multimodal learning
Hyperparameters optimization
Enhancing classical strategies with neural nets
Probabilistic programming and Pyro forecasts

I highly recommend you to check out code and IPython Notebook in this repository.

In this, first part, I want to show how MLPs, CNNs and RNNs can be used for financial time series prediction. In this part we are not going to use any feature engineering. Let’s just consider historical dataset of S&P 500 index price movements. We have information from 1950 to 2016 about open, close, high, low prices for every day in the year and volume of trades. First, we will try just to predict close price in the end of the next day, second, we will try to predict return (close price — open price). Download the dataset from Yahoo Finance or from this repository.

Neural networks for algorithmic trading. Simple time series forecasting (3)

We will consider our problem as 1) regression problem (trying to forecast exactly close price or return next day) 2) binary classification problem (price will go up [1; 0] or down [0; 1]).

For training NNs we gonna use framework Keras.

First let’s prepare our data for training. We want to predict t+1 value based on N previous days information. For example, having close prices from past 30 days on the market we want to predict, what price will be tomorrow, on the 31st day.

We use first 90% of time series as training set (consider it as historical data) and last 10% as testing set for model evaluation.

Here is example of loading, splitting into training samples and preprocessing of raw input data:

It will be just 2-hidden layer perceptron. Number of hidden neurons is chosen empirically, we will work on hyperparameters optimization in next sections. Between two hidden layers we add one Dropout layer to prevent overfitting.

Important thing is Dense(1), Activation(‘linear’) and ‘mse’ in compile section. We want one output that can be in any range (we predict real value) and our loss function is defined as mean squared error.

Let’s see what happens if we just pass chunks of 20-days close prices and predict price on 21st day. Final MSE= 46.3635263557, but it’s not very representative information. Below is plot of predictions for first 150 points of test dataset. Black line is actual data, blue one — predicted. We can clearly see that our algorithm is not even close by value, but can learn the trend.

Neural networks for algorithmic trading. Simple time series forecasting (4)

Let’s scale our data using sklearn’s method preprocessing.scale() to have our time series zero mean and unit variance and train the same MLP. Now we have MSE = 0.0040424330518 (but it is on scaled data). On the plot below you can see actual scaled time series (black)and our forecast (blue) for it:

Neural networks for algorithmic trading. Simple time series forecasting (5)

For using this model in real world we should return back to unscaled time series. We can do it, by multiplying or prediction by standard deviation of time series we used to make prediction (20 unscaled time steps) and add it’s mean value:

MSE in this case equals 937.963649937. Here is the plot of restored predictions (red) and real data (green):

Neural networks for algorithmic trading. Simple time series forecasting (6)

Not bad, isn’t it? But let’s try more sophisticated algorithms for this problem!

I am not going to dive into theory of convolutional neural networks, you can check out this amazing resourses:

cs231n.github.io — Stanford CNNs for Computer Vision course
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/ — CNNs for text recognition, can be useful for understanding how it works for 1D data

Let’s define 2-layer convolutional neural network (combination of convolution and max-pooling layers) with one fully-connected layer and the same output as earlier:

Let’s check out results. MSEs for scaled and restored data are: 0.227074542433; 935.520550172. Plots are below:

Neural networks for algorithmic trading. Simple time series forecasting (7)

Neural networks for algorithmic trading. Simple time series forecasting (8)

Even looking on MSE on scaled data, this network learned much worse. Most probably, deeper architecture needs more data for training, or it just overfitted due to too high number of filters or layers. We will consider this issue later.

As recurrent architecture I want to use two stacked LSTM layers (read more about LSTMs here).

Plots of forecasts are below, MSEs = 0.0246238639582; 939.948636707.

Neural networks for algorithmic trading. Simple time series forecasting (9)

Neural networks for algorithmic trading. Simple time series forecasting (10)

RNN forecasting looks more like moving average model, it can’t learn and predict all fluctuations.

So, it’s a bit unexpectable result, but we can see, that MLPs work better for this time series forecasting. Let’s check out what will happen if we swith from regression to classification problem. Now we will use not close prices, but daily return (close price-open price) and we want to predict if close price is higher or lower than open price based on last 20 days returns.

Neural networks for algorithmic trading. Simple time series forecasting (11)

Code is changed just a bit — we change our last Dense layer to have output [0; 1] or [1; 0] and add softmax output to expect probabilistic output.

To load binary outputs, change in the code following line:

split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=False, scale=True)split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=True, scale=True)

Also we change loss function to binary cross-entopy and add accuracy metrics.

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 2s - loss: 0.1960 - acc: 0.6461 - val_loss: 0.2042 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 2s - loss: 0.1944 - acc: 0.6547 - val_loss: 0.2049 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 1s - loss: 0.1924 - acc: 0.6656 - val_loss: 0.2064 - val_acc: 0.6019
Epoch 4/5
13513/13513 [==============================] - 1s - loss: 0.1897 - acc: 0.6738 - val_loss: 0.2051 - val_acc: 0.6039
Epoch 5/5
13513/13513 [==============================] - 1s - loss: 0.1881 - acc: 0.6808 - val_loss: 0.2072 - val_acc: 0.6052
1669/1669 [==============================] - 0s  Test loss and accuracy: [0.25924376667510113, 0.50209706411917387]

Oh, it’s not better than random guessing (50% accuracy), let’s try something better. Check out the results below.

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 3s - loss: 0.2102 - acc: 0.6042 - val_loss: 0.2002 - val_acc: 0.5979
Epoch 2/5
13513/13513 [==============================] - 3s - loss: 0.2006 - acc: 0.6089 - val_loss: 0.2022 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 4s - loss: 0.1999 - acc: 0.6186 - val_loss: 0.2006 - val_acc: 0.5979
Epoch 4/5
13513/13513 [==============================] - 3s - loss: 0.1999 - acc: 0.6176 - val_loss: 0.1999 - val_acc: 0.5932
Epoch 5/5
13513/13513 [==============================] - 4s - loss: 0.1998 - acc: 0.6173 - val_loss: 0.2015 - val_acc: 0.5999
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24841217570779137, 0.54463750750737105]

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 18s - loss: 0.2130 - acc: 0.5988 - val_loss: 0.2021 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 18s - loss: 0.2004 - acc: 0.6142 - val_loss: 0.2010 - val_acc: 0.5959
Epoch 3/5
13513/13513 [==============================] - 21s - loss: 0.1998 - acc: 0.6183 - val_loss: 0.2013 - val_acc: 0.5959
Epoch 4/5
13513/13513 [==============================] - 17s - loss: 0.1995 - acc: 0.6221 - val_loss: 0.2012 - val_acc: 0.5965
Epoch 5/5
13513/13513 [==============================] - 18s - loss: 0.1996 - acc: 0.6160 - val_loss: 0.2017 - val_acc: 0.5965
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24823409688551315, 0.54523666868172693]

We can see, that treating financial time series prediction as regression problem is better approach, it can learn the trend and prices close to the actual.

What was surprising for me, that MLPs are treating sequence data better as CNNs or RNNs which are supposed to work better with time series. I explain it with pretty small dataset (~16k time stamps) and dummy hyperparameters choice.

You can reproduce results and get better using code from repository.

I think we can get better results both in regression and classification using different features (not only scaled time series) like some technical indicators, volume of sales. Also we can try more frequent data, let’s say minute-by-minute ticks to have more training data. All these things I’m going to do later, so stay tuned :)

P.S.
Follow me also in Facebook for AI articles that are too short for Medium, Instagram for personal stuff and Linkedin!

Neural networks for algorithmic trading. Simple time series forecasting (2024)

FAQs

What is the best neural network for time series forecasting? ›

Recurrent Neural Networks (RNNs) The Recurrent Neural Network (RNN) is one of the promising ANNs that has shown accurate results for time series forecasting. It is made up of a series of interconnected neural networks at different time intervals or time steps.

Read On ›

Can neural networks be used for forecasting? ›

This gives them a self-training ability, the ability to formalize unclassified information and—most importantly—the ability to make forecasts based on available historical information. Neural networks are used increasingly in a variety of business applications, including forecasting and marketing research.

Discover More Details ›

What neural network is used for time series processing? ›

The Time-series Neural Network (TNN) proposed in this study is a deep learning model designed for time-series prediction tasks. The design of this model fully considers the characteristics of time-series data, including time order, continuity, and periodicity, as shown in Figure 1.

Is RNN good for time series forecasting? ›

Recurrent Neural Networks (RNNs) are deep learning models that can be utilized for time series analysis, with recurrent connections that allow them to retain information from previous time steps.

See Details ›

What is better than LSTM for time series? ›

More specifically, it was observed that BiLSTM models provide better predictions compared to ARIMA and LSTM models. It was also observed that BiLSTM models reach the equilibrium much slower than LSTM-based models.

Find Out More ›

Which is better LSTM or ARIMA for time series forecasting? ›

The longer the data window period, the better ARIMA performs, and the worse LSTM performs. The comparison of the models was made by comparing the values of the MAPE error. When predicting 30 days, ARIMA is about 3.4 times better than LSTM. When predicting an averaged 3 months, ARIMA is about 1.8 times better than LSTM.

Tell Me More ›

Can CNN be used for time series forecasting? ›

Key Advantages of CNNs for Time Series Forecasting:

Local Connectivity: CNNs employ convolutional layers that focus on local regions of the input data. This characteristic enables them to capture short-term patterns effectively, which is crucial in time series forecasting.

Show Me More ›

What is the best algorithm for forecasting? ›

Top 5 Common Time Series Forecasting Algorithms

Autoregressive (AR)
Moving Average (MA)
Autoregressive Moving Average (ARMA)
Autoregressive Integrated Moving Average (ARIMA)
Exponential Smoothing (ES)

Explore More ›

Is Tesla using neural network? ›

Unlike the earlier versions, where the car's reactions were predetermined for specific situations, v12 draws on the learning from "end-to-end neural networks." These networks are extensively trained using video clips from real driving situations, allowing the car to make decisions that feel more human-like.

How to use neural network for time series forecasting? ›

To train an LSTM neural network for time series forecasting, train a regression LSTM neural network with sequence output, where the responses (targets) are the training sequences with values shifted by one time step.

Show Me More ›

Which algorithm is used for time series? ›

Some common algorithms used for time-series forecasting: ARIMA: It stands for Autoregressive-Integrated-Moving Average. It utilizes the combination of Autoregressive and moving averages to predict future values. Read more about it here.

Read The Full Story ›

Why use LSTM for time series? ›

Using LSTM, time series forecasting models can predict future values based on previous, sequential data. This provides greater accuracy for demand forecasters which results in better decision making for the business.

See Details ›

Which machine learning algorithm is best for time series forecasting? ›

The Best Time Series forecasting algorithms are the following: Autoregressive. Exponential Smoothing (ES) Autoregressive Moving Average (ARMA)

Get More Info Here ›

Which deep learning algorithm is best for time series forecasting? ›

LSTM (Long Short-Term Memory)

This makes LSTM particularly effective in capturing long-term dependencies in time series data. LSTM models can learn complex patterns and relationships within the data, making them suitable for deep learning time series in the fields like: Speech recognition. Natural language processing.

Why LSTM is better than RNN in time series? ›

The main difference between LSTM and RNN lies in their ability to handle and learn from sequential data. LSTMs are more sophisticated and capable of handling long-term dependencies, making them the preferred choice for many sequential data tasks.

Which algorithm is best for time series forecasting? ›

ARIMA happens to be one of the most used algorithms in Time Series forecasting. While other models describe the trend and seasonality of the data points, ARIMA aims to explain the autocorrelation between the data points.

View Details ›

Which model is best for time series forecasting machine learning? ›

ARIMA models are great for forecasting stationary time series data. This implies that the data does not contain any seasonal or temporary trends and the statistical properties of the source of the time series data, like the mean and variance, do not change over time.

Is CNN good for time series forecasting? ›

Convolutional Neural Networks have evolved beyond image analysis and have proven to be formidable tools for time series forecasting. They excel at learning intricate patterns, both short-term and long-term, and can adapt to various domains, making them a valuable addition to the time series forecasting toolkit.

Learn More ›

Is LSTM good for time series forecasting? ›

Using LSTM, time series forecasting models can predict future values based on previous, sequential data. This provides greater accuracy for demand forecasters which results in better decision making for the business.

Discover More Details ›