AComNN: Attention enhanced Compound Neural Network for financial time-series forecasting with cross-regional features

doi:10.1016/j.asoc.2021.107649

Applied Soft Computing

Volume 111, November 2021, 107649

https://doi.org/10.1016/j.asoc.2021.107649

Highlights

• Present a method for learning the hidden patterns of financial time-series.
• Adopt cross-regional features to mitigate information insufficiency.
• Apply the method to Hong Kong Hang Seng Index trend prediction as a real-life case study.
• Show high practicality and competitiveness in financial time-series forecasting.

Abstract

In recent years, many works have adopted forecast-based approaches to support investment decisions in the financial market. Nevertheless, most of them do not mine the hidden patterns in cross-regional financial time-series, even though fluctuations in financial markets are affected by the global economy rather than by a single market. To overcome this issue, this article proposes an Attention enhanced Compound Neural Network (AComNN) that can be applied to features from multiple sources, including different financial markets and economic entities. The proposed approach combines an Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and self-attention to progressively capture the time-zone-dependent context behind the financial time-series across regions with multiple filters, and thereby provides trading signals to support financial investment decisions. The proposed AComNN has been applied to Hong Kong Hang Seng Index (HSI) trend prediction based on various initial features across regions. The experimental results demonstrate that the AComNN achieves its highest average accuracy, over 60%, for one-day-ahead trend prediction. Moreover, its forecasting capability improves on the baselines by 13.36% on average. We therefore encourage practitioners to adopt the proposed method, which offers a new perspective on financial time-series forecasting by considering the analysis of cross-regional features.

Introduction

Due to the volatile, nonlinear, complicated, and chaotic characteristics of the financial market, accurately forecasting the trend of financial time-series has always been challenging [1]. In recent years, a series of well-designed machine-learning-based trading systems have emerged to assist investors or speculators in identifying financially rewarding stocks and exercising their ownership [2], [3].

Traditional studies mainly adopt univariate time-series models for prediction in the financial market, such as the AutoRegressive Moving Average model (ARMA) [4], the AutoRegressive Integrated Moving Average model (ARIMA) [5], [6], [7], and the Generalized AutoRegressive Conditional Heteroskedasticity model (GARCH) [8]. However, these models consider only the influence of a price's historical behavior on its future movements; this univariate structure and their simple assumptions about market patterns lead to low forecasting capability in practical applications.

Apart from traditional time-series models, machine learning models have also been adopted in financial time-series forecasting for years because of their stronger learning capability, ease of interpretation, and freedom from prior assumptions. Examples include Support Vector Machine (SVM) [9], [10], Support Vector Regression (SVR) [11], Logistic Regression (LR) [12], Random Forest (RF) [13], eXtreme Gradient Boosting (XGBoost) [14], and Decision Tree (DT) [15], as well as ensemble models based on stacking [16] and bagging [17].

In recent years, deep learning has been widely and successfully applied to research fields such as pattern recognition, image classification, and autopilot. Because of the robust fitting and nonlinear mapping capability of deep learning models, researchers have also designed various such models for forecasting in the financial market, such as Long Short-Term Memory (LSTM) [2], [18], Convolutional Neural Network (CNN) [19], Artificial Neural Networks (ANN) [20], Graph Convolutional Neural Network (GCNN) [21], and other hybrid neural networks [22], [23], [24], [25].

Nevertheless, two issues limit the forecasting capability of the above techniques. (1) Information insufficiency: most previous studies of financial market prediction restrict their features to inter-market relationships within one region, or even to a single target market, which blocks the transmission of crucial information from outside markets. (2) Structural deficiency: their models cannot capture the crucial hidden patterns behind financial time-series across areas, because they are not adapted to multi-regional features.

Therefore, in this work, we take the Hang Seng Index (HSI) trend prediction task as an example to address the above two issues. To address information insufficiency, we adopt cross-regional features as the initial input from two perspectives. On the one hand, we collect the technical indicators extracted from the Financial Times Stock Exchange 100 Index (FTSE 100), Standard & Poor’s 500 (S&P 500), and HSI, in London, New York, and Hong Kong respectively. On the other hand, we collect other highly associated economic indicators, such as macro-economic indicators, commodity indicators, and currency exchange indicators, from the U.K., the U.S., and China.

For the structural deficiency issue, we propose a novel Attention enhanced Compound Neural Network (AComNN) for extracting features from multiple sources, which consists of ANN, LSTM, and self-attention steps applied in that order. The ANN step preliminarily extracts semantics from each region and unifies their feature dimensions. The LSTM step then transfers the refined semantics among regions according to their time-series relation across time zones. Finally, the self-attention step dynamically focuses on the decisive parts of the regions when allocating weights. In this way, the model progressively captures the time-zone-dependent context behind the financial time-series across regions with multiple filters.

In the experimental stage, we evaluate the AComNN on HSI prediction with cross-regional features collected from Apr. 2003 to Dec. 2019. The experimental results demonstrate that the highest average accuracy for one-day-ahead HSI trend prediction reaches 60.81%. Compared with the state-of-the-art baselines, our AComNN outperforms them by over 13% on average, with a relatively low standard deviation of 0.0355. Additionally, we implement a trading simulation based on the trading signals provided by the AComNN. The final accumulative return during the simulation reaches 35.04% on average, showing that the performance of the AComNN-based investment advisory system is highly competitive and practical in the real world.

The main contributions of our work are as follows:

1. We mitigate the information insufficiency by integrating the cross-regional features of multiple stock markets and economic entities that constitute the raw features.
2. We propose a carefully designed Attention enhanced Compound Neural Network (AComNN), which can progressively capture the time-series relation characteristics and dynamically allocate attention to cross-regional features in each time zone.
3. We explore the performance of AComNN under different feature smoothing and forecasting window configurations. We also re-implement three state-of-the-art financial forecasting models [16], [19], [22] for comparison, and the experimental results show that the proposed AComNN achieves encouraging results relative to these baselines.

The rest of the paper is organized as follows: Section 2 summarizes the related literature on financial time-series forecasting. Section 3 presents the associated data collection and preparation. Section 4 elaborates on the model construction method. Section 5 describes the experimental design. Section 6 reports the experimental results. Section 7 discusses the whole experiment in terms of forecasting capability and trading simulation. Section 8 discusses the threats to validity of this work. Finally, Section 9 concludes the paper and outlines future work.

Section snippets

Feature study in financial markets

Since price variation in the financial market is highly volatile and full of unexpected noise, feature studies, including determinant factor selection and dimension reduction, play a critical role in boosting the accuracy of financial market prediction and mitigating overfitting in training.

In [26], Garefalakis et al. study the determinant factors that influence HSI tendency and conclude that S&P 500 in the U.S. stock exchanges, gold, and crude oil prices play a substantial

Data collection and preparation

Regionally, our cross-regional features can be divided into three parts, i.e., from the U.K., the U.S., and China. The data comprises both technical indicators and other economic indicators in each region.

Technical indicators are the same in each region, extracted from FTSE 100, S&P 500, or HSI. The FTSE 100 is composed of 100 constituent stocks on the London stock exchange. The S&P 500 consists of 500 constituent stocks on the NYSE and NASDAQ exchanges, while the HSI consists of 50

Attention enhanced Compound Neural Network

In this section, we introduce the construction of our proposed Attention enhanced Compound Neural Network (AComNN). The whole framework includes three main steps: ANN Step, LSTM Step, and Self-Attention Step. Fig. 1 depicts the general framework of the AComNN.
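To make the three-step structure concrete, the following is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the region order `REGION_ORDER`, the hidden size, the random placeholder weights from `init_params`, the identity query/key/value maps in `self_attention`, and the mean-pooled sigmoid output are all assumptions chosen for illustration, and biases are omitted for brevity.

```python
import math
import random

# Assumed processing order of regions; this excerpt does not state it explicitly.
REGION_ORDER = ("UK", "US", "CN")

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(feature_dims, hidden, seed=0):
    """Random placeholder weights (hypothetical); a real model would learn them."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "ann": {r: mat(hidden, feature_dims[r]) for r in REGION_ORDER},
        "lstm": {gate: mat(hidden, 2 * hidden) for gate in "ifog"},
        "out": mat(1, hidden)[0],
    }

def ann_step(features, weights):
    """Project one region's raw features to the shared hidden size."""
    return [math.tanh(z) for z in matvec(weights, features)]

def lstm_step(x, h, c, gates):
    """One standard LSTM cell update on the concatenation [x; h]."""
    xh = x + h
    i = [sigmoid(z) for z in matvec(gates["i"], xh)]
    f = [sigmoid(z) for z in matvec(gates["f"], xh)]
    o = [sigmoid(z) for z in matvec(gates["o"], xh)]
    g = [math.tanh(z) for z in matvec(gates["g"], xh)]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def self_attention(states):
    """Scaled dot-product self-attention with identity query/key/value maps."""
    dim = len(states[0])
    weights, outputs = [], []
    for query in states:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in states]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(w * s[j] for w, s in zip(row, states))
                        for j in range(dim)])
    return outputs, weights

def forward(region_features, params):
    """Return (probability of an upward move, attention weight matrix)."""
    hidden = len(params["out"])
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for region in REGION_ORDER:
        # ANN step: unify each region's feature dimension.
        x = ann_step(region_features[region], params["ann"][region])
        # LSTM step: pass context from one region to the next.
        h, c = lstm_step(x, h, c, params["lstm"])
        states.append(h)
    # Self-attention step: weight the regions against each other.
    attended, weights = self_attention(states)
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    logit = sum(w * p for w, p in zip(params["out"], pooled))
    return sigmoid(logit), weights

# Toy usage with made-up feature vectors of different lengths per region.
params = init_params({"UK": 5, "US": 6, "CN": 4}, hidden=4)
features = {"UK": [0.1] * 5, "US": [0.2] * 6, "CN": [0.3] * 4}
prob_up, attn = forward(features, params)
```

Each row of `attn` sums to one and indicates how strongly one region's state attends to the others. With untrained weights the output probability carries no predictive meaning; the sketch only illustrates the data flow through the three steps.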

Evaluation metrics

As we mentioned in Section 3.3, we conduct five consecutive back-testing experiments under each combination of $α$ and $ws$. To obtain a stable evaluation for the AComNN performance (including the forecasting capability and stability) under each combination of $α$ and $ws$, we define Average Accuracy (Avg. Acc.) and Standard Deviation (Std. Dev.) respectively as our evaluation metrics. The Avg. Acc. and Std. Dev. are defined in Eqs. (20), (21). The $Accuracy_{i}$ represents the accuracy in the $i$-th
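As a small illustration of these two metrics, the sketch below aggregates the accuracies of five back-tests into an average and a standard deviation. Eqs. (20), (21) are not reproduced in this excerpt, so whether the paper uses the population or the sample form of the standard deviation is an assumption; the helper `summarize_backtests` and the accuracy values are hypothetical.

```python
import statistics

def summarize_backtests(accuracies, sample=False):
    """Return (average accuracy, standard deviation) over back-test runs.

    sample=False uses the population form (divide by n); sample=True uses
    the sample form (divide by n - 1). Which one the paper uses is assumed.
    """
    avg_acc = statistics.fmean(accuracies)
    if sample:
        std_dev = statistics.stdev(accuracies)
    else:
        std_dev = statistics.pstdev(accuracies)
    return avg_acc, std_dev

# Hypothetical accuracies from five consecutive back-tests.
runs = [0.60, 0.62, 0.58, 0.61, 0.59]
avg_acc, std_dev = summarize_backtests(runs)
```

A low standard deviation across back-tests indicates that the accuracy is stable over the different test periods, which is why the two metrics are reported together.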

AComNN performance results

For this part, we discuss the performance of our AComNN with different forecasting windows and smoothing factors. After we finish the first part of the experimental procedure in Fig. 2, we present Table 4 to demonstrate the best model’s accuracy on the test set in each back-testing experiment, and we also list the Avg. Acc. and Std. Dev. among five backtests on the right side of the table. It is obvious that the AComNN predicting for the datasets with $ws = 1$ and $α = 0.5$ obtains the highest Avg.

Discussion

The above experiments illustrate the predictive and profit capability of our proposed AComNN and the comparison with other baseline models.

In terms of forecasting capability, our proposed AComNN outperforms the other three baselines because we consider the major global stock markets and economic entities for cross-regional feature extraction. As each stock market opens and closes, its influence transfers from the western hemisphere to the eastern hemisphere and finally affects the

Threats to validity

Some of our re-implementations of the original baseline papers, made for fair comparison, may pose threats to validity. We list them below:

For the MFNN, according to Section 3.1 of [22], samples are labeled −1, 0, and 1 for Decrease, No change (when the return fluctuation does not exceed a certain threshold), and Increase, which account for 10%, 80%, and 10% of the samples respectively in the best configuration. In order to avoid the class imbalance problem, MFNN stochastically deletes instances in the class

Conclusion and future work

In this paper, we propose a novel Attention enhanced Compound Neural Network (AComNN) as the prediction engine for the financial trading system to fully exploit the time-series interrelation of features across regions. Further, we explore its performance with four different forecasting windows and another four different smoothing factors. In addition, we verify its robustness and performance by five consecutive back-testing experiments under each window size and smoothing factor combination.

The

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported in part by the General Research Fund of the Research Grants Council of Hong Kong (No. 11208017), the research funds of City University of Hong Kong (7005028, 7005217), the Research Support Fund by Intel (9220097), and funding support from other industry partners (9678149, 9440227, 9440180, 9220103 and 9229029).

References (51)

Kraus M. et al. Decision support from financial disclosures with deep neural networks and transfer learning. Decis. Support Syst. (2017)
Brasileiro R.C. et al. Automatic trading method based on piecewise aggregate approximation and multi-swarm of improved self-adaptive particle swarm optimization with validation. Decis. Support Syst. (2017)
Kocak C. ARMA(p, q) type high order fuzzy time series forecast method based on fuzzy logic relations. Appl. Soft Comput. (2017)
Lin Z. Modelling and forecasting the stock market volatility of SSE composite index using GARCH models. Future Gener. Comput. Syst. (2018)
Luo L. et al. Improving the integration of piece wise linear representation and weighted support vector machine for stock trading signal prediction. Appl. Soft Comput. (2017)
Henrique B.M. et al. Stock price prediction using support vector regression on daily and up to the minute prices. J. Financ. Data Sci. (2018)
Basak S. et al. Predicting the direction of stock market prices using tree-based classifiers. North Amer. J. Econ. Financ. (2019)
Jiang M. et al. An improved stacking framework for stock index prediction by leveraging tree-based ensemble models and deep learning algorithms. Physica A (2020)
Hoseinzade E. et al. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl. (2019)
Sezer O.B. et al. A deep neural-network based stock trading system based on evolutionary optimized technical analysis parameters. Procedia Comput. Sci. (2017)
Long W. et al. Deep learning-based feature engineering for stock price movement prediction. Knowl.-Based Syst. (2019)
Weng B. et al. Macroeconomic indicators alone can predict the monthly closing price of major U.S. indices: Insights from artificial intelligence, time-series analysis and hybrid models. Appl. Soft Comput. (2018)
Kara Y. et al. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul stock exchange. Expert Syst. Appl. (2011)
Tanaka-Yamawaki M. et al. Adaptive use of technical indicators for the prediction of intra-day stock prices. Physica A (2007)
Ülkü N. et al. Stock market’s response to real output shocks in eastern European frontier markets: A varwal model. Emerg. Mark. Rev. (2017)
Graves A. et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. (2005)
Abu-Mostafa Y.S. et al. Introduction to financial forecasting. Appl. Intell. (1996)
Adebiyi A.A. et al. Comparison of ARIMA and artificial neural networks models for stock price prediction. J. Appl. Math. (2014)
Ariyo A.A. et al. Stock price prediction using the ARIMA model
Jarrett J.E. et al. ARIMA modeling with intervention to forecast and analyze Chinese stock prices. Int. J. Eng. Bus. Manag. (2011)
Lin Y. et al. An SVM-based approach for stock market trend prediction
Dutta A. et al. Prediction of stock performance in Indian stock market using logistic regression. Int. J. Bus. Inf. (2012)
Nti K.O. et al. Random forest based feature selection of macroeconomic variables for stock market prediction. Am. J. Appl. Sci. (2019)
Nair B. et al. A decision tree-rough set hybrid system for stock market trend prediction. Int. J. Comput. Appl. (2010)
Wang H. et al. Stock return prediction based on bagging-decision tree

