Abstract
Financial markets are highly sensitive to emerging stock-related news, so investors must continuously monitor financial events when deciding whether to buy or sell stocks. In the past, important events have been tracked mostly with rule-based methods, which are time-consuming given the huge volume of online news data. To address this issue, this paper proposes an automated event detection system that combines a novel document embedding technique based on TF-IDF and BERT with an online text clustering algorithm. The embedding technique first encodes text into vectors, and an online text clustering algorithm, SinglePass, then performs topic tracking. Experimental results show that the proposed algorithms can effectively detect and track online topics. In addition, both domestic and international events, such as the outbreak of the novel coronavirus (COVID-19) and the Sino-U.S. trade war, and their impact on the capital market in China are analyzed, which demonstrates the practical and economic value of the proposed system.
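The second stage described above, single-pass clustering over document vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity measure, the running-mean centroid update, the `threshold` value of 0.8, and the toy three-dimensional vectors (standing in for the TF-IDF and BERT embeddings) are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def single_pass(vectors, threshold=0.8):
    """Assign each vector, in arrival order, to its most similar topic,
    or open a new topic when no similarity reaches the threshold.

    Returns one topic index per input vector. The threshold and the
    running-mean centroid update are illustrative assumptions.
    """
    centroids = []  # running mean vector of each topic
    sizes = []      # number of documents assigned to each topic
    labels = []
    for vec in vectors:
        # Find the existing topic whose centroid is closest to this document.
        best_topic, best_sim = -1, -1.0
        for topic, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_topic, best_sim = topic, sim

        if best_topic >= 0 and best_sim >= threshold:
            # Join the topic and fold the document into its running mean.
            n = sizes[best_topic]
            centroids[best_topic] = [
                (c * n + x) / (n + 1)
                for c, x in zip(centroids[best_topic], vec)
            ]
            sizes[best_topic] = n + 1
            labels.append(best_topic)
        else:
            # No topic is similar enough: this document starts a new one.
            centroids.append(list(vec))
            sizes.append(1)
            labels.append(len(centroids) - 1)
    return labels


# Toy stand-ins for document embeddings, processed in arrival order.
docs = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
    [1.0, 0.0, 0.1],
]
labels = single_pass(docs, threshold=0.8)
print(labels)
```

Because each document is compared only against the current topic centroids and is never revisited, the procedure handles a stream of incoming news without reclustering earlier items. The trade-off is that the result depends on arrival order and on the chosen threshold.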
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, Y. et al. (2020). Online Topic Detection and Tracking System and Its Application on Stock Market in China. In: Dou, Z., Miao, Q., Lu, W., Mao, J., Jia, G. (eds) Information Retrieval. CCIR 2020. Lecture Notes in Computer Science, vol 12285. Springer, Cham. https://doi.org/10.1007/978-3-030-56725-5_11
DOI: https://doi.org/10.1007/978-3-030-56725-5_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-56724-8
Online ISBN: 978-3-030-56725-5
eBook Packages: Computer Science, Computer Science (R0)