Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction

Wang, Feng; Zhang, Yongquan; Rao, Qi; Li, Kangshun; Zhang, Hao

doi:10.1007/s00500-015-2003-z

Methodologies and Application
Published: 18 January 2016

Soft Computing, volume 21, pages 3193–3205 (2017)

Feng Wang¹, Yongquan Zhang¹, Qi Rao², Kangshun Li³ & Hao Zhang⁴

Abstract

Predicting stock price volatility is regarded as one of the most attractive and meaningful research problems in financial markets. Existing research has pointed out that prediction accuracy and prediction speed are the two most important factors in stock prediction. In this paper, we focus on how to design a methodology that improves prediction accuracy while also speeding up the prediction process, and we propose a new prediction model that combines mutual information-based sentimental analysis with the extreme learning machine (ELM) to enhance prediction performance. Our work makes two major contributions. (1) Because the words in news documents are not absolutely negative or positive, and the lengths of financial news documents vary, we propose a new sentimental analysis methodology based on mutual information to improve the efficiency of feature selection; it differs from traditional sentimental analysis algorithms and also uses a new weighting scheme in the feature weighting process. (2) Because ELM is a fast learning model that has been applied successfully in many research fields, we propose a prediction model, named MISA-K-ELM, that combines mutual information-based sentimental analysis (MISA) with kernel-based ELM. This model has the benefits of both statistical sentimental analysis and ELM, and it balances the requirements of prediction accuracy and prediction speed. We conduct experiments on HKEx 2001 stock market datasets to validate the performance of the proposed MISA-K-ELM, which uses both historical market prices and market news as inputs. To test the efficiency of MISA, we first compare the prediction accuracy of an ELM model using MISA with that of an ELM model using traditional sentimental analysis. Then, we compare the proposed MISA-K-ELM with existing state-of-the-art learning algorithms, such as the Back-Propagation Neural Network (BP-NN) and the Support Vector Machine (SVM). Our experimental results show that (1) the MISA model yields higher prediction accuracy than traditional sentimental analysis models; (2) MISA-K-ELM and MISA-SVM have higher prediction accuracy than MISA-BP-NN and MISA-B-ELM; (3) both MISA-K-ELM and MISA-B-ELM achieve faster prediction speed than MISA-SVM and MISA-BP-NN in most cases; and (4) in most cases, MISA-K-ELM has higher prediction accuracy than the other three methodologies.
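To make the kernel-based ELM component of the abstract more concrete, the following is a minimal sketch of a kernel ELM classifier in its commonly cited closed form, where the output weights solve (I/C + Ω)α = t and Ω is the kernel matrix over the training samples. It is not the authors' implementation: the RBF kernel, the values of `C` and `gamma`, the class name `KernelELM`, and the two-dimensional toy features (imagined here as a sentiment score and a normalized price change) are all assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


class KernelELM:
    """Hypothetical kernel ELM for binary targets in {-1, +1}."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF width (assumed value)

    def fit(self, X, t):
        self.X = [list(x) for x in X]
        n = len(self.X)
        # Kernel matrix Omega with the ridge term I/C added to the diagonal.
        omega = [[rbf_kernel(self.X[i], self.X[j], self.gamma)
                  for j in range(n)] for i in range(n)]
        for i in range(n):
            omega[i][i] += 1.0 / self.C
        self.alpha = solve_linear(omega, [float(v) for v in t])
        return self

    def decision(self, x):
        """Real-valued output: sum_i alpha_i * K(x, x_i)."""
        return sum(a * rbf_kernel(x, xi, self.gamma)
                   for a, xi in zip(self.alpha, self.X))

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1


# Toy data: [sentiment score, normalized price change] -> down (-1) / up (+1).
X_train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 0.8]]
t_train = [-1, -1, 1, 1]
model = KernelELM(C=100.0, gamma=1.0).fit(X_train, t_train)
print(model.predict([0.05, 0.1]), model.predict([0.95, 0.9]))
```

Training reduces to a single linear solve with no iterative weight updates, which is the usual explanation for the speed advantage attributed to ELM-style models over back-propagation networks.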

Notes

The software can be downloaded from http://ictclas.org.
www.finet.hk/mainsite/index.htm.
The Levenberg–Marquardt algorithm is implemented in the MATLAB toolbox.
The notation \(\#(X)\) denotes the number of objects X.
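Counts of the form \(\#(X)\) are the quantities from which mutual-information scores are typically estimated. The sketch below uses the textbook definition of mutual information between a term's presence in a document and the document's class label. The paper's own MISA formula and weighting scheme are not reproduced in this text, so the function `mutual_information`, the token sets, and the "up"/"down" labels are illustrative assumptions only.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term):
    """Textbook MI (in bits) between presence of `term` and the class label.

    docs   -- list of token sets, one per document (hypothetical data)
    labels -- list of class labels aligned with docs
    """
    n = len(docs)
    # Counts #(presence, label), #(presence), and #(label).
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    term_counts = Counter(term in doc for doc in docs)
    label_counts = Counter(labels)
    mi = 0.0
    for (present, label), count in joint.items():
        p_joint = count / n
        p_term = term_counts[present] / n
        p_label = label_counts[label] / n
        mi += p_joint * math.log2(p_joint / (p_term * p_label))
    return mi


# Toy corpus: four tokenized news items labeled by next-day price direction.
docs = [
    {"profit", "surge"},
    {"profit", "growth"},
    {"loss", "plunge"},
    {"loss", "market"},
]
labels = ["up", "up", "down", "down"]

vocabulary = sorted(set().union(*docs))
scores = {w: mutual_information(docs, labels, w) for w in vocabulary}
# Keep the highest-scoring terms as candidate sentiment features.
top_terms = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_terms)
```

Terms whose presence perfectly separates the classes score one bit in this balanced toy corpus, while terms that never occur score zero, so ranking by this score is one simple way to select features.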

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. 51277135, 50707021, 61103125, 61373038, and 61573157); the Doctoral Fund of the Ministry of Education of China (Grant No. 20100141120046); the Natural Science Foundation of Hubei Province of China (Grant No. 2010CDB08504); the 111 Programme of Introducing Talents of Discipline to Universities (Grant No. B07037); the Wuhan University Academic Development Plan for Scholars after 1970s (“Research on Internet User Behavior”); the National High Resolution Earth Observation System (the Civil Part) Technology Projects of China; the Natural Science Foundation of Guangdong Province of China (Grant No. 2014A030313454); and the Special Funds for Projects of Basic Research and Operational Costs of the Central Universities.

Author information

Authors and Affiliations

State Key Lab of Software Engineering, Wuhan University, Wuhan, China
Feng Wang & Yongquan Zhang
Institute of Computational Linguistics, Peking University, Beijing, China
Qi Rao
College of Mathematics and Information, South China Agricultural University, Guangzhou, China
Kangshun Li
College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
Hao Zhang

Corresponding author

Correspondence to Feng Wang.

Ethics declarations

Conflict of interest

Author Feng Wang declares that she has no conflict of interest. Author Yongquan Zhang declares that he has no conflict of interest. Author Qi Rao declares that he has no conflict of interest. Author Kangshun Li declares that he has no conflict of interest. Author Hao Zhang declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Wang, F., Zhang, Y., Rao, Q. et al. Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction. Soft Comput 21, 3193–3205 (2017). https://doi.org/10.1007/s00500-015-2003-z

Published: 18 January 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s00500-015-2003-z
