Abstract
Forecasting trends in the financial market is a classic and challenging problem that attracts the attention of both economists and computer scientists. This research area, characterized by its dynamic, chaotic, and nonlinear nature, is further complicated by the overarching influence of the efficient market hypothesis (EMH). The EMH posits that all available information, including historical prices and public news, is already reflected in current asset prices. It therefore suggests that gaining a consistent predictive advantage from such information is difficult. This paper evaluates different machine learning models for identifying relevant news based on oscillations in a financial time series. Specifically, we explore state-of-the-art graph neural networks, which have the advantage of combining different representations of time series and textual data. As a result, we introduce three approaches to classify news as relevant or irrelevant and to model textual data and time series through graphs, taking into account the implications of the EMH. These approaches include text and time-series clusters with daily data, data occurring at perceptually important points in the time series, and data from moments when more than 70% of the news is classified as relevant. We find that, just as it is difficult to use news to improve the prediction of financial series, the reverse task is also difficult: identifying relevant news that potentially affects commodity price fluctuations.
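The perceptually important points (PIPs) mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the two endpoints are the first PIPs and each subsequent PIP is the point with the largest vertical distance to the line joining its neighbouring PIPs. The function name, the choice of vertical distance, and the toy price list are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def perceptually_important_points(series: Sequence[float], k: int) -> List[int]:
    """Return the sorted indices of up to k PIPs (vertical-distance variant)."""
    n = len(series)
    if n < 2 or k < 2:
        raise ValueError("need at least two points and k >= 2")
    pips = [0, n - 1]  # the endpoints are always the first two PIPs
    while len(pips) < min(k, n):
        best_idx, best_dist = -1, -1.0
        # Scan every gap between adjacent PIPs.
        for left, right in zip(pips, pips[1:]):
            slope = (series[right] - series[left]) / (right - left)
            for i in range(left + 1, right):
                # Vertical distance from point i to the chord left -> right.
                dist = abs(series[left] + slope * (i - left) - series[i])
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx < 0:  # no interior points left to add
            break
        pips.append(best_idx)
        pips.sort()
    return pips


# Hypothetical toy prices, used only to show the call.
prices = [10.0, 10.5, 12.0, 11.0, 9.0, 9.5, 13.0, 12.5]
print(perceptually_important_points(prices, 4))  # [0, 2, 4, 7]
```

In this toy series the selected indices mark the local peak at position 2 and the trough at position 4, which is the kind of turning point at which news could then be sampled.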






Data availability
The dataset used in this study is publicly available via the following link: https://data.mendeley.com/datasets/f8fdmpp6yh/2
Algorithm availability
The method proposed in this study is publicly available via the following link: https://github.com/ivanfilhoreis/GNN_text_ts
Acknowledgements
This work was carried out at the Center for Artificial Intelligence (C4AI-USP) and partially supported by the São Paulo Research Foundation (FAPESP) (grant #2019/07665-4) and the IBM Corporation. The authors of this paper thank FAPESP (Process 2019/25010-5) and the National Council for Scientific and Technological Development (CNPq) (Process 309575/2021-4). The corresponding author thanks the Minas Gerais State Research Support Foundation (FAPEMIG) (Process PCRH BPG-00054-210).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Filho, I.J.R., Gôlo, M.P.S., Marcacini, R.M. et al. How do financial time series enhance the detection of news significance in market movements? A study using graph neural networks with heterogeneous representations. Neural Comput & Applic 37, 1307–1319 (2025). https://doi.org/10.1007/s00521-024-10418-5
DOI: https://doi.org/10.1007/s00521-024-10418-5