A hybrid approach for stock trend prediction based on tweets embedding and historical prices

Ni, Huihui; Wang, Shuting; Cheng, Peng

doi:10.1007/s11280-021-00880-9

Published: 22 April 2021

World Wide Web, volume 24, pages 849–868 (2021)


Abstract

Recent advances in data mining and natural language processing make it possible to probe the relationship between social media and stock market volatility, and the integration of natural language processing, deep learning and finance is becoming inevitable. This paper proposes a hybrid approach for stock market prediction based on tweet embeddings and historical prices. Unlike traditional text embedding methods, our approach takes both the internal semantic features and the external structural characteristics of Twitter data into account, so that the generated tweet vectors carry more useful information. Specifically, we develop a Tweet Node algorithm that describes potential connections in Twitter data by constructing a tweet node network. Our model further supplements the Twitter representations with emotional attributes; these representations are fed, together with historical stock prices, into a deep learning model based on an attention mechanism. In addition, we design a visual interactive stock prediction tool to display the prediction results.
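The abstract gives no implementation details, so the following is only a rough illustration of the general idea of combining tweet vectors, an emotional attribute, and price features through attention weighting. It is not the authors' Tweet Node algorithm or model. The function names, the dot-product scoring, and the toy numbers are all assumptions made for this sketch.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tweet_vecs: List[List[float]], query: List[float]) -> List[float]:
    """Average tweet vectors, weighted by softmax of their dot product with `query`."""
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in tweet_vecs]
    weights = softmax(scores)
    dim = len(tweet_vecs[0])
    return [sum(w * vec[i] for w, vec in zip(weights, tweet_vecs)) for i in range(dim)]


def build_features(
    tweet_vecs: List[List[float]],
    sentiments: List[float],
    prices: List[float],
    query: List[float],
) -> List[float]:
    """Append a sentiment score to each tweet vector, pool, then append price features.

    `query` is a hypothetical attention vector; in a trained model it would be
    learned. Its length must equal the tweet vector length plus one.
    """
    enriched = [vec + [s] for vec, s in zip(tweet_vecs, sentiments)]
    pooled = attention_pool(enriched, query)
    return pooled + prices


# Toy data (assumed): two 2-dimensional tweet vectors for one trading day,
# one sentiment score per tweet, and two price-derived features.
tweet_vecs = [[1.0, 0.0], [0.0, 1.0]]
sentiments = [0.5, -0.5]
prices = [101.2, 0.013]
query = [0.3, 0.1, 0.8]

features = build_features(tweet_vecs, sentiments, prices, query)
print(len(features))
```

The resulting feature vector would then be passed to a downstream classifier. A real system would learn the attention parameters and use a sequence model over several days rather than a single pooled vector.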

Acknowledgements

This work is partially supported by the Guangdong Basic and Applied Basic Research Foundation (grant 2019B1515120048).

Author information

Authors and Affiliations

School of Software Engineering, East China Normal University, Shanghai, China
Huihui Ni & Peng Cheng
School of Computer and Information Engineering, Zhejiang Gongshang University, Zhejiang, China
Shuting Wang

Corresponding author

Correspondence to Peng Cheng.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Huihui Ni and Shuting Wang are joint first authors and contributed equally to this work.

About this article

Cite this article

Ni, H., Wang, S. & Cheng, P. A hybrid approach for stock trend prediction based on tweets embedding and historical prices. World Wide Web 24, 849–868 (2021). https://doi.org/10.1007/s11280-021-00880-9

Received: 04 November 2020
Revised: 15 March 2021
Accepted: 29 March 2021
Published: 22 April 2021
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11280-021-00880-9
