1 Introduction

When deep learning in the financial sector is mentioned, many people may immediately think of predicting stock price movements. That is indeed a high-profile research direction, but the practical applications of deep learning in finance are far more extensive.

The capacity for sustainable development and profit is important to investors. Tam and Kiang [1] applied artificial neural networks (ANNs) to predict bank bankruptcy. They used a back propagation (BP) neural network with a single hidden layer to predict bankruptcy for Texas banks between 1985 and 1987. The results showed that even such a simple neural network is more accurate and stable than the multivariate discriminant analysis, logistic regression, k-nearest neighbors (KNN), and decision tree models.

Hutchinson et al. [2] also used ANNs to analyze option prices and hedge S&P 500 futures options from 1987 to 1991, and verified that the results are better than those of the Black-Scholes-Merton model, the classic option pricing model in financial studies.

Due to the limitations of linear regression, the exchange rate was long considered unpredictable, but the ability of neural networks to describe nonlinear relationships opened a new way to forecast it. Lee and Chen [3] used recurrent neural networks (RNNs) to predict the trends of five currencies against the US dollar, and demonstrated significant forecasting ability for the yen and the British pound against the US dollar. This reveals the potential of neural networks to describe nonlinear relationships.

In 2006, Hinton [4] published a paper in Science that proposed the Deep Belief Network (DBN), breaking through the vanishing-gradient bottleneck in multilayer network training and bringing the concept of deep learning into public view. Improvements in computer hardware have also made the training of multilayer networks more efficient. Therefore, more and more researchers use deep learning methods to solve complex problems where shallow neural networks fail, including problems of financial analysis.

Heaton et al. [5] gave a brief overview of the development of deep learning in finance, summarizing three deep learning models: the autoencoder, rectified neural networks, and Long Short-Term Memory (LSTM). They also introduced two ways to avoid over-fitting, namely regularization and dropout. Taking the autoencoder model as an example, they selected a small number of stocks to build two training sets (one consisting of the 10 stocks most similar to the S&P 500 trend, the other adding to it the 10 stocks least similar to the S&P 500 trend) and trained the model on the S&P 500 index between 2014 and 2015. The results showed that as long as the samples are diverse enough, deep learning can approximate the real values to almost any precision from a small number of samples.

Many researchers have devoted great effort to the study of deep neural networks and have produced excellent results. These studies let us stand on the shoulders of giants, giving us a strong basis for financial analysis using deep learning.

2 Related Research

Deep learning has developed into a sizable body of knowledge containing many models and optimization methods. The Deep Belief Network (DBN) is a classic deep learning model proposed by Hinton [4]. It is based on the Restricted Boltzmann Machine (RBM), and the entire network can be regarded as a stack of several RBMs. In a DBN, a layer-by-layer unsupervised training procedure gradually optimizes the parameters of each RBM, which is called “pre-training”. Pre-training yields parameters close to the optimal solution and, to some extent, solves the vanishing-gradient problem in training multilayer neural networks. Then the whole neural network is trained with the BP algorithm, which is called “fine-tuning”. The Deep Boltzmann Machine (DBM) is another RBM-based model, proposed by Salakhutdinov and Hinton [6], which has been shown to perform well in handwritten digit recognition and object recognition. A DBM differs from an RBM in having multiple hidden layers (an RBM has only one), and it differs from a DBN in that the connections between all adjacent layers are undirected (in a DBN, only the top two layers are connected by undirected edges).
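
For concreteness, the following minimal NumPy sketch shows one contrastive-divergence (CD-1) update for a Bernoulli RBM, the building block that a DBN pre-trains layer by layer. CD is the common approximation used to train RBMs; the function and variable names here are illustrative, not taken from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_v, b_h, v0, lr=0.1):
    """One contrastive-divergence (CD-1) step for a Bernoulli RBM.
    v0: batch of visible vectors, shape (batch, n_visible)."""
    # positive phase: hidden activation probabilities given the data
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # negative phase: one Gibbs step to reconstruct visible and hidden units
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # move parameters toward the data correlations, away from the model's
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

# stacking: after one RBM is trained, its hidden probabilities p_h0
# serve as the "data" for training the next RBM in the stack
```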

The Convolutional Neural Network (CNN) is a feedforward artificial neural network model proposed by Yann LeCun et al. in 1998. Its typical structure includes convolutional layers, pooling layers, and fully connected layers, and it has been widely used in computer vision. The operation process is as follows: given a picture (a training sample) as input, multiple convolution kernels scan across the input pixels, and the scan results are passed through an activation function to obtain feature maps. A pooling operator then downsamples each feature map, and the result is fed as input to the next layer. After all the convolutional and pooling layers, a fully connected network performs further processing, and the output layer gives the final result.
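
The following NumPy sketch illustrates the two core operations; the kernel size, image size, and names are illustrative assumptions.

```python
import numpy as np

def conv2d(image, kernel):
    """Scan one convolution kernel over a single-channel image ("valid" mode)."""
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Downsample a feature map by taking the max over non-overlapping blocks."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)                                  # a toy "picture"
fmap = np.maximum(conv2d(image, np.random.randn(3, 3)), 0.0)  # ReLU activation
pooled = max_pool(fmap)           # 3x3 downsampled map fed to the next layer
```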

The Recurrent Neural Network (RNN) is a type of neural network that is efficient for processing time series data. In a traditional neural network, the signals of neurons in each layer propagate only to the next layer, and data processing at each moment is independent. An RNN, however, memorizes previous information and applies it to the current computation: the hidden layer receives not only the output of the layer below, but also its own output from the previous time step. Saad et al. [7] applied Time-Delay Neural Network, Probabilistic Neural Network, and RNN models to predict stock closing prices, and their experiments showed that all three models have predictive ability. However, RNNs suffer from vanishing gradients: the gradient generated at a certain moment fades away after several propagations along the time axis, so long-range information cannot be memorized effectively. Long Short-Term Memory (LSTM) was developed to overcome this weakness. LSTM introduces the concept of a cell state, which allows information to be added or forgotten through gates, thereby enabling the network to memorize long-range information. Chen et al. [8] used LSTM to predict stock returns in China’s stock market; their experiments show that, compared with a random prediction method, the LSTM model increased the prediction accuracy for stock returns from 14.3 to 27.2%.
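
The recurrence that gives an RNN its memory can be written in a few lines of NumPy; this is an illustrative sketch, and the weight names are our own.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Simple (Elman) RNN: at each step the hidden layer combines the
    current input with its own output from the previous time step."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:                               # xs: sequence of input vectors
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.array(states)                    # hidden state at every step
```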

The autoencoder is a neural network that reproduces its input signal as closely as possible. Its learning process first encodes the input and then decodes the code to reconstruct the input as output. This approach is often used for feature extraction, denoising, and so on. “Autoencoder” generally refers to a network model with only one hidden layer, while a deep autoencoder has multiple hidden layers. The training of a deep autoencoder is similar to that of a DBN: pre-training is performed between every two layers using RBMs, and the parameters are then fine-tuned by the BP algorithm.
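
A single-hidden-layer autoencoder can be sketched in Keras as follows; the 20-dimensional input and 5-unit code are illustrative assumptions, and the cited works do not specify a framework.

```python
from tensorflow import keras

# encode a 20-dimensional input into a 5-unit code, then decode it back
inputs = keras.Input(shape=(20,))
code = keras.layers.Dense(5, activation="sigmoid", name="encoder")(inputs)
recon = keras.layers.Dense(20, name="decoder")(code)

autoencoder = keras.Model(inputs, recon)
autoencoder.compile(optimizer="adam", loss="mse")  # loss: reproduce the input
```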

Because of the large number of layers in deep learning, the generated models are usually very complex, and over-fitting is prone to occur when the training sample set is not large enough. Therefore, some techniques are needed to control the training process and to reduce or even prevent serious over-fitting. Commonly used techniques include early stopping, regularization, and dropout.

Early stopping prevents over-fitting by controlling the number of training epochs: it stops the iteration at the right time. Prechelt [9] studied this method in depth. Regularization adds a regular term on the weights to the loss function, preventing overly complex weights from producing an over-fitted model. The main idea of dropout is to randomly remove certain connections between the hidden layers during training to reduce the complexity of the model. The paper by Heaton et al. [5] introduces practical applications of regularization and dropout.
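
All three techniques are available as one-liners in modern frameworks; the Keras sketch below shows an L2 regular term, a dropout layer, and an early-stopping callback (layer sizes and thresholds are illustrative).

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu",
                       kernel_regularizer=keras.regularizers.l2(1e-4)),  # L2 term
    keras.layers.Dropout(0.5),  # randomly zero half the activations in training
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")
# stop when the validation loss has not improved for 5 consecutive epochs
stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)
```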

3 Case Study: Short-Term Prediction of the Shanghai Composite Index Using RNN

In this example, an RNN is applied to make short-term predictions of the Shanghai Composite Index. The prediction method uses the opening price, highest price, lowest price, closing price, and trading volume of the previous 10 days to predict the highest price of the next day. The input feature vector is

$$ X = \left( \text{opening price},\ \text{highest price},\ \text{lowest price},\ \text{closing price},\ \text{trading volume} \right) \tag{1} $$

Then the time series inputted is

$$ \left( X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}, X_{8}, X_{9}, X_{10} \right) \tag{2} $$

The expected output is

$$ Y_{11}^{*} = \left( \text{highest price} \right)_{t = 11} \tag{3} $$

3.1 Data Collection and Processing

First, we collected transaction data of the Shanghai Composite Index from January 4, 2005 to June 30, 2017, a total of 3,034 trading days, including date, opening price, highest price, lowest price, closing price, and trading volume. The data from January 4, 2005 to December 31, 2015, a total of 2,671 trading days, were used as the training set. The data from January 4, 2016 to June 30, 2017, a total of 363 trading days, were used as the test set.
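
In code, the split might look like the following pandas sketch; the file name and column names are hypothetical.

```python
import pandas as pd

# hypothetical CSV with columns: date, open, high, low, close, volume
df = pd.read_csv("shanghai_index.csv", parse_dates=["date"])
train = df[(df["date"] >= "2005-01-04") & (df["date"] <= "2015-12-31")]
test = df[(df["date"] >= "2016-01-04") & (df["date"] <= "2017-06-30")]
```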

As input, the transaction data of the previous 10 days are processed as a moving time series, and the highest price of the 11th day is taken as the expected output, namely:

$$ \left( X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}, X_{8}, X_{9}, X_{10} \right) \Rightarrow Y_{11}^{*} \tag{4} $$
$$ \left( X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}, X_{8}, X_{9}, X_{10}, X_{11} \right) \Rightarrow Y_{12}^{*} \tag{5} $$

After building the moving time series, the input data were normalized into [0, 1]. The data were then ready to be fed into the model.
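
A NumPy sketch of the windowing of Eqs. (4)-(5) and the normalization is given below; the array layout and function names are our own.

```python
import numpy as np

def make_windows(data, window=10, target_col=1):
    """Build the moving series of Eqs. (4)-(5): 10-day input sequences
    and the 11th day's highest price as the expected output.
    `data` has one row per day with columns
    [open, high, low, close, volume]; column 1 is the highest price."""
    X, y = [], []
    for t in range(len(data) - window):
        X.append(data[t:t + window])            # days t .. t+9
        y.append(data[t + window, target_col])  # highest price of day t+10
    return np.array(X), np.array(y)

def min_max_scale(a, lo, hi):
    """Normalize features into [0, 1]; lo/hi are taken from the training data."""
    return (a - lo) / (hi - lo)

# illustrative use on synthetic data shaped like the real table (days x 5)
raw = np.random.rand(2671, 5)
X_train, y_train = make_windows(raw)            # yields 2,661 training samples
lo, hi = raw.min(axis=0), raw.max(axis=0)
X_train = min_max_scale(X_train, lo, hi)
```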

3.2 Model Structure and Parameters

The neural network has an RNN layer as its hidden layer. This hidden layer contains 32 neurons and is unrolled over 10 time steps. The hidden layer activation function is tanh. Because this is a regression problem, the output layer performs no nonlinear transformation; the linear unit y = x is used directly. The mean square error (MSE) is chosen as the loss function, namely:

$$ MSE = \frac{1}{n}\sum\limits_{i = 1}^{n} \left( y_{i}^{*} - y_{i} \right)^{2} \tag{6} $$

where n is the total number of training samples, y* is the expected output, namely the real highest price of the 11th day, and y is the output of the model.

The optimization strategy was gradient descent with a learning rate of 0.01. The early stopping technique was used to prevent over-fitting: when the value of the loss function did not decrease for 5 consecutive epochs, training stopped. The preset maximum was 500 epochs, and training actually stopped after 407 epochs.
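
The paper does not name a framework; a minimal Keras sketch consistent with the stated structure and hyperparameters is:

```python
import numpy as np
from tensorflow import keras

# placeholder data in the shapes produced in Sect. 3.1:
# (samples, 10 days, 5 features) and one target price per sample
X_train = np.random.rand(2661, 10, 5).astype("float32")
y_train = np.random.rand(2661).astype("float32")

model = keras.Sequential([
    keras.layers.SimpleRNN(32, activation="tanh", input_shape=(10, 5)),
    keras.layers.Dense(1),  # linear output unit y = x, no nonlinear transform
])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01), loss="mse")

# stop when the loss has not decreased for 5 consecutive epochs (max 500)
stop = keras.callbacks.EarlyStopping(monitor="loss", patience=5)
model.fit(X_train, y_train, epochs=500, callbacks=[stop], verbose=0)
```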

3.3 Analysis of Results

The comparison of the predicted curve of the model on the test set with the real curve is shown in Fig. 1.

Fig. 1 The predicted curve of the highest price of the Shanghai Composite Index generated by RNN

It can be seen that the predicted curve basically follows the trend of the actual curve, though with some local deviations. The daily error rate (DER) is used to further measure the accuracy of the prediction. The calculation formula is

$$ DER = \frac{\left| y^{*} - y \right|}{y^{*}} \times 100\% \tag{7} $$

where |·| denotes the absolute value. The daily error rates of the 363 trading-day predictions on the test set are shown in Table 1.

Table 1 Statistics on the daily error rate of the Shanghai Composite Index prediction
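
The statistics in Table 1 can be reproduced from the model's outputs with a few lines; the array names and toy values below are illustrative.

```python
import numpy as np

def daily_error_rate(y_star, y):
    """Eq. (7): DER = |y* - y| / y* x 100%, with y* the real highest price."""
    return np.abs(y_star - y) / y_star * 100.0

# illustrative use on toy values for three trading days
y_star = np.array([3100.0, 3150.0, 3080.0])   # real highest prices
y_pred = np.array([3125.0, 3140.0, 3010.0])   # model outputs
der = daily_error_rate(y_star, y_pred)
print((der <= 1).sum(), "days with DER <= 1%")
print(((der > 1) & (der <= 2)).sum(), "days with DER in (1%, 2%]")
```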

In the 363 trading days of the test set, the number of days with a predicted daily error rate less than or equal to 1% was 224, accounting for 61.71%. Daily error rates between 1 and 2% occurred on 95 days, accounting for 26.17%. In general, the generalization performance on the test set was good, and the model can be considered to have practical value. We analyzed the 4 days with the highest prediction error rates and found that they were January 5, 12, and 27, and February 26, 2016. On these 4 days the Shanghai Composite Index fell sharply compared with other days. According to securities news, China’s stock market experienced a serious crash in 2015, so a circuit breaker was officially implemented on January 4, 2016. However, the circuit breaker was repeatedly triggered by continuously falling stock prices, and the mechanism was in effect for only 4 days. Even so, January 2016 saw the biggest single-month drop in the previous 8 years (the last comparable drop was during the global financial crisis of 2008). This explains the large deviations and shows that RNN prediction of a stock index, as a technical analysis method, has the disadvantage of being insensitive to bad news in the market (Fig. 2).

Fig. 2 Daily error rate histogram of predicted Shanghai Composite Index

4 Conclusion

This paper studies the applications of deep learning in finance and focuses on applying a recurrent neural network to stock price prediction. The experimental results suggest that, in the context of big data, deep learning is well suited to financial analysis and can achieve high performance. It is therefore very promising to apply deep learning to more financial problems.