Abstract

The stock market (SM) is a significant sector of a country's economy and plays a crucial role in the growth of its commerce and industry. Hence, discovering efficient ways to analyse and visualise stock market data is considered a significant issue in modern finance. The use of Data Mining (DM) techniques to predict the stock market has been studied extensively using historical market prices, but such approaches can only make assessments within the scope of existing information, and thus they cannot model random behaviour of the stock market or explain the causes behind events. Textual data is one area that has so far seen limited success in stock market prediction, yet it is a rich source of information, and analysing it may provide a better understanding of the market's random behaviour. Text Mining (TM) combined with the Random Forest (RF) algorithm offers a novel approach to studying the critical indicators that contribute to the prediction of abnormal stock market movements. A Stock Market Random Forest-Text Mining system (SMRF-TM) is developed to mine the critical indicators related to the 2009 Dubai stock market debt standstill. Random Forest is applied to classify the extracted features into a set of semantic classes, extending current approaches from three to eight classes: critical down, down, neutral, up, critical up, economic, social and political. The study demonstrates that Random Forest outperformed the other classifiers tested and achieved the best accuracy in classifying the bigram features extracted from the corpus.
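The abstract refers to bigram features extracted from a news corpus. As a rough illustration of what such features look like, the following minimal Python sketch counts adjacent word pairs across documents. It is not the SMRF-TM pipeline: the tokenisation rule, the function names and the two sample headlines are assumptions made purely for illustration, and the Random Forest classification step is not shown.

```python
import re
from collections import Counter

# The eight semantic classes named in the abstract.
CLASSES = ["critical down", "down", "neutral", "up", "critical up",
           "economic", "social", "political"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    This simple rule is an assumption, not the paper's preprocessing.
    """
    return re.findall(r"[a-z]+", text.lower())


def bigrams(tokens):
    """Return adjacent token pairs in order of appearance."""
    return list(zip(tokens, tokens[1:]))


def bigram_counts(documents):
    """Count bigrams across documents; pairs never span two documents."""
    counts = Counter()
    for doc in documents:
        counts.update(bigrams(tokenize(doc)))
    return counts


# Hypothetical headlines, invented for illustration only.
docs = [
    "Dubai World seeks debt standstill",
    "Markets fall after debt standstill request",
]
counts = bigram_counts(docs)
```

In a setup like the one the abstract describes, counts of this kind would form the feature vectors that a classifier assigns to one of the classes in `CLASSES`.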

Author information

Corresponding author

Correspondence to Mazen Nabil Elagamy.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Elagamy, M.N., Stanier, C., Sharp, B. (2018). Text Mining Approach to Analyse Stock Market Movement. In: Hassanien, A., Tolba, M., Elhoseny, M., Mostafa, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018). AMLTA 2018. Advances in Intelligent Systems and Computing, vol 723. Springer, Cham. https://doi.org/10.1007/978-3-319-74690-6_65

  • DOI: https://doi.org/10.1007/978-3-319-74690-6_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74689-0

  • Online ISBN: 978-3-319-74690-6

  • eBook Packages: Engineering, Engineering (R0)
