
A subjectivity classification framework for sports articles using improved cortical algorithms

  • Original Article
Neural Computing and Applications

Abstract

The enormous number of articles published daily on the Internet by a diverse array of authors often contains misleading or unwanted information, which makes activities such as sports betting riskier. As a result, extracting meaningful and reliable information from these sources becomes a time-consuming and nearly impossible task. In this context, labeling articles as objective or subjective is not a simple natural language processing task, because subjectivity can take several forms. With the rise of online sports betting driven by the revolution in Internet and mobile technology, an automated system capable of sifting through all these data and finding relevant sources in a reasonable amount of time is a desirable and marketable product. In this work, we present a framework for the classification of sports articles composed of three stages. The first stage extracts articles from web pages using text extraction libraries, parses the text, and then tags words using Stanford’s part-of-speech tagger. The second stage extracts unique syntactic and semantic features and reduces them using our modified cortical algorithm (CA), hereafter CA*. The third stage classifies these texts as objective or subjective. Our framework was tested on a database containing 1000 articles, manually labeled using Amazon’s crowdsourcing tool, Mechanical Turk, and results are reported using CA, CA*, support vector machines, and one of their soft computing variants (LMSVM) as classifiers. With CA* trained using an entropy weight update rule and a cross-entropy cost function, a testing accuracy of 85.6% was achieved on a fourfold cross-validation with a 40% reduction in features.
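The evaluation protocol named in the abstract (fourfold cross-validation, cross-entropy cost) can be illustrated with a minimal Python sketch. It is not the authors' CA* implementation: the function names `binary_cross_entropy` and `k_fold_indices` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the encoding of objective as 0 and subjective as 1 is an assumption made only for this example.

```python
import math
from typing import List, Sequence, Tuple


def binary_cross_entropy(y_true: Sequence[int],
                         y_prob: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def k_fold_indices(n_samples: int, k: int = 4) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices.

    Assumption: no shuffling; a real experiment would normally shuffle first.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first folds
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# With 1000 articles and k = 4, each fold holds out 250 articles for testing.
folds = k_fold_indices(1000, k=4)
fold_sizes = [(len(train), len(test)) for train, test in folds]

# Hypothetical labels (0 = objective, 1 = subjective) and predicted probabilities.
example_cost = binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6])
```

Under these assumptions, each of the four folds trains on 750 articles and tests on the remaining 250, and the reported accuracy would be the average over the four held-out folds.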




Acknowledgements

This work was partially funded by Intel and the University Research Board at the American University of Beirut. We would also like to acknowledge the help of Professor Lina Choueiri from the Department of English at the American University of Beirut.

Author information

Corresponding author

Correspondence to Mariette Awad.


About this article


Cite this article

Hajj, N., Rizk, Y. & Awad, M. A subjectivity classification framework for sports articles using improved cortical algorithms. Neural Comput & Applic 31, 8069–8085 (2019). https://doi.org/10.1007/s00521-018-3549-3


  • DOI: https://doi.org/10.1007/s00521-018-3549-3
