Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction

  • Methodologies and Application
Soft Computing

Abstract

Stock price volatility prediction is one of the most attractive and meaningful research topics in financial markets. Existing studies point out that prediction accuracy and prediction speed are the two most important factors in stock prediction. In this paper, we address how to design a methodology that improves prediction accuracy while also speeding up the prediction process, and we propose a new prediction model that combines mutual information-based sentimental analysis with an extreme learning machine (ELM) to enhance prediction performance. Our work makes two major contributions. (1) Because the words in news documents are not absolutely negative or positive, and financial news documents vary in length, we propose a new sentimental analysis methodology based on mutual information to improve the efficiency of feature selection. It differs from traditional sentimental analysis algorithms and uses a new weighting scheme in the feature weighting process. (2) Because ELM is a fast learning model that has been applied successfully in many research fields, we propose a prediction model, named MISA-K-ELM, that combines mutual information-based sentimental analysis (MISA) with kernel-based ELM. This model has the benefits of both statistical sentimental analysis and ELM, and it balances the requirements of prediction accuracy and prediction speed. We conduct experiments on HKEx 2001 stock market datasets to validate the performance of the proposed MISA-K-ELM. Both historical market prices and market news are used in MISA-K-ELM. To test the effectiveness of MISA, we first compare the prediction accuracy of an ELM model using MISA with that of an ELM model using traditional sentimental analysis. Then, we compare the proposed MISA-K-ELM with existing state-of-the-art learning algorithms, such as the Back-Propagation Neural Network (BP-NN) and the Support Vector Machine (SVM). Our experimental results show that (1) the MISA model yields higher prediction accuracy than traditional sentimental analysis models; (2) MISA-K-ELM and MISA-SVM have higher prediction accuracy than MISA-BP-NN and MISA-B-ELM; (3) both MISA-K-ELM and MISA-B-ELM achieve faster prediction speed than MISA-SVM and MISA-BP-NN in most cases; (4) in most cases, MISA-K-ELM has higher prediction accuracy than the other three methodologies.
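To make the kernel-based ELM component more concrete, the following is a minimal sketch of a generic kernel ELM classifier, not the authors' implementation. It assumes the standard closed-form solution \(\beta = (I/C + K)^{-1} T\) from the ELM literature with an RBF kernel. The class name `KernelELM`, the parameters `C` and `gamma`, and the synthetic two-cluster data are all hypothetical and stand in for the MISA news features and price features, which the abstract does not specify.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KernelELM:
    """Generic kernel ELM classifier (illustrative; names are assumptions)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        labels = np.asarray(y)
        self.classes_ = np.unique(labels)
        # Encode targets as +1 for the true class and -1 elsewhere.
        T = np.where(labels[:, None] == self.classes_[None, :], 1.0, -1.0)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: solve (I/C + K) beta = T.
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, T)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        # Pick the class whose output score is largest.
        return self.classes_[np.argmax(K @ self.beta, axis=1)]


# Synthetic stand-in data: two well-separated clusters labeled "down" and "up".
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.3, (20, 2)), rng.normal(2.0, 0.3, (20, 2))])
y = np.array(["down"] * 20 + ["up"] * 20)

model = KernelELM(C=100.0, gamma=1.0).fit(X, y)
pred = model.predict(np.array([[-2.0, -2.0], [2.0, 2.0]]))
print(pred)
```

Training reduces to a single linear solve, with no iterative weight updates, which is the usual explanation for ELM's speed relative to back-propagation.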

Notes

  1. The software can be downloaded from http://ictclas.org.

  2. www.finet.hk/mainsite/index.htm.

  3. An implementation of the Levenberg–Marquardt algorithm is available in the MATLAB toolbox.

  4. The notation \(\#(X)\) denotes the number of objects X.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. 51277135, 50707021, 61103125, 61373038, and 61573157); the Doctoral Fund of Ministry of Education of China (Grant No. 20100141120046); the Natural Science Foundation of Hubei Province of China (Grant No. 2010CDB08504); the 111 Programme of Introducing Talents of Discipline to Universities (Grant No. B07037); the Wuhan University Academic Development Plan for Scholars after 1970s (“Research on Internet User Behavior”); the National High Resolution Earth Observation System (the Civil Part) Technology Projects of China; the Natural Science Foundation of Guangdong Province of China (Grant No. 2014A030313454); and the Special Funds for Projects of Basic Research and Operational Costs of the Central Universities.

Author information

Corresponding author

Correspondence to Feng Wang.

Ethics declarations

Conflict of interest

Author Feng Wang declares that she has no conflict of interest. Author Yongquan Zhang declares that he has no conflict of interest. Author Qi Rao declares that he has no conflict of interest. Author Kangshun Li declares that he has no conflict of interest. Author Hao Zhang declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Wang, F., Zhang, Y., Rao, Q. et al. Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction. Soft Comput 21, 3193–3205 (2017). https://doi.org/10.1007/s00500-015-2003-z

  • DOI: https://doi.org/10.1007/s00500-015-2003-z
