News that Moves the Market: DSEX-News Dataset for Forecasting DSE Using BERT

Khan, Md. Nabil Rahman; Salsabil, Most. Sadia; Hasib, Khan Md.; Islam, Md Rafiqul; Alam, Mohammad Shafiul; Sanin, Cesar; Szczerbicki, Edward

doi:10.1007/978-981-97-5934-7_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2145)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems


Abstract

The stock market is a complex, dynamic system whose unpredictability has always challenged stakeholders and investors, which motivates the need for more accurate prediction models. Traditional prediction models handle the market's dynamic nature poorly, and previous methods have relied on less relevant data, leading to suboptimal performance. This study proposes using Bidirectional Encoder Representations from Transformers (BERT), a pre-trained Large Language Model (LLM), to predict Dhaka Stock Exchange (DSE) market movements. We also introduce a new dataset designed specifically for this problem, which captures important characteristics and patterns missing from other datasets. We evaluate a range of machine learning techniques on this dataset of news headlines and stock market indexes and compare their predictive capabilities: Decision Tree (DT), Logistic Regression (LR), K-Nearest Neighbors (KNN), Random Forest (RF), Linear Support Vector Machine (LSVM), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), Bidirectional Long Short-Term Memory (Bi-LSTM), BERT, Financial Bidirectional Encoder Representations from Transformers (FinBERT), and RoBERTa. Our proposed model achieves 99.83% accuracy on the training set and 99.78% accuracy on the test set, outperforming previous methods.
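The abstract does not say how headlines are paired with index movements or how a baseline is scored, so the following is only a minimal sketch under stated assumptions, not the authors' pipeline. It assumes each headline is labeled 1 when the index closed higher than in the previous session and 0 otherwise, and it scores a trivial majority-class baseline with plain accuracy. The names `LabeledHeadline`, `label_by_index_move`, `majority_baseline`, and `accuracy` are hypothetical, and the sample headlines and index values are invented for illustration rather than taken from the DSEX-News dataset.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class LabeledHeadline:
    date: str
    headline: str
    label: int  # assumed encoding: 1 = index closed higher than previous session, 0 = otherwise


def label_by_index_move(
    headlines: Sequence[Tuple[str, str]],
    closes: Sequence[Tuple[str, float]],
) -> List[LabeledHeadline]:
    """Attach an up/down label to each headline from the same-day index move.

    `closes` must be sorted by date in ascending order. Headlines whose date
    has no previous close to compare against are dropped.
    """
    direction: Dict[str, int] = {}
    for (_, prev_close), (date, close) in zip(closes, closes[1:]):
        direction[date] = 1 if close > prev_close else 0
    return [
        LabeledHeadline(date, text, direction[date])
        for date, text in headlines
        if date in direction
    ]


def majority_baseline(train_labels: Sequence[int]) -> int:
    """Return the most frequent training label (ties resolve to 1)."""
    return 1 if 2 * sum(train_labels) >= len(train_labels) else 0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Invented values for illustration only; not real DSEX data.
    closes = [("2023-01-01", 6200.0), ("2023-01-02", 6215.5), ("2023-01-03", 6190.2)]
    headlines = [
        ("2023-01-02", "Banks lead gains as turnover rises"),
        ("2023-01-03", "Index slips on profit-taking"),
    ]
    rows = label_by_index_move(headlines, closes)
    guess = majority_baseline([row.label for row in rows])
    print(accuracy([row.label for row in rows], [guess] * len(rows)))
```

A majority-class score like this gives a floor against which the accuracy of any trained classifier on the same split can be read.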



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Md. Nabil Rahman Khan, Most. Sadia Salsabil & Mohammad Shafiul Alam
Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
Khan Md. Hasib
Business Information Systems, Australian Institute of Higher Education, Sydney, Australia
Md Rafiqul Islam & Cesar Sanin
Faculty of Management and Economics, Gdansk University of Technology, Gdansk, Poland
Edward Szczerbicki


Corresponding author

Correspondence to Md Rafiqul Islam.

Editor information

Editors and Affiliations

Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
University of Pau and Adour Countries, Pau, France
Richard Chbeir
Open University of Cyprus, Latsia, Cyprus
Yannis Manolopoulos
Iwate Prefectural University, Takizawa, Japan
Hamido Fujita
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Japan Advanced Institute of Science and Technology, Nomi, Japan
Le Minh Nguyen
Wrocław University of Science and Technology, Wrocław, Poland
Krystian Wojtkiewicz

About this paper

Cite this paper

Khan, M.N.R. et al. (2024). News that Moves the Market: DSEX-News Dataset for Forecasting DSE Using BERT. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2024. Communications in Computer and Information Science, vol 2145. Springer, Singapore. https://doi.org/10.1007/978-981-97-5934-7_19


DOI: https://doi.org/10.1007/978-981-97-5934-7_19
Published: 13 August 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5933-0
Online ISBN: 978-981-97-5934-7
eBook Packages: Computer Science; Computer Science (R0)
