Improving reading comprehension step by step using Online-Boost text readability classification system

La, Lei; Wang, Nan; Zhou, Dong-ping

doi:10.1007/s00521-014-1770-2

Original Article
Published: 19 November 2014

Volume 26, pages 929–939, (2015)

Neural Computing and Applications

Abstract

Online reading exercises have become a standard tool in a wide variety of second language learning systems. Readability sorting is a key step in presenting suitable reading materials to learners. Traditional text readability classification techniques do not fully meet the needs of online learning, because they cannot classify in real time and have no information about learners’ language levels. This paper presents a novel framework for online reading exercises based on the Online-Boost text readability classification algorithm. We first modify the multinomial Naïve Bayes model to assign the reading materials an initial readability. We then propose an Online-Boost algorithm that updates text readability and evaluates learners’ reading comprehension according to the rate of correct answers learners give on each text. Finally, the system delivers reading materials of different difficulty to learners with different levels of reading ability in real time. The experimental results show that the method is easy to use and can significantly improve the performance of second language learners.
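The pipeline outlined in the abstract can be illustrated with a rough sketch. The abstract does not specify the authors' modification to multinomial Naïve Bayes or the Online-Boost update itself, so the code below is only an assumption-laden stand-in: it uses textbook multinomial Naïve Bayes with Laplace smoothing to assign an initial readability level, and a hypothetical threshold rule (`update_level`, with made-up thresholds `low` and `high`) to adjust that level from learners' correct-answer rate. All function names, parameters, and the toy data are illustrative and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit textbook multinomial Naive Bayes with Laplace smoothing.

    docs: list of token lists; labels: readability level per document.
    This is the standard model, NOT the paper's modified variant.
    """
    vocab = {w for d in docs for w in d}
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    model = {}
    for y in class_docs:
        total = sum(word_counts[y].values())
        log_prior = math.log(class_docs[y] / len(docs))
        log_like = {
            w: math.log((word_counts[y][w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
        model[y] = (log_prior, log_like)
    return model


def classify(model, doc):
    """Return the level with the highest log posterior; unseen words are ignored."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)


def update_level(level, correct_rate, low=0.4, high=0.8, min_level=1, max_level=3):
    """Hypothetical update rule (an assumption, not Online-Boost).

    Higher level means harder text. If learners answer most questions
    correctly, the text is treated as easier than labelled; if they answer
    few correctly, it is treated as harder.
    """
    if correct_rate >= high:
        return max(min_level, level - 1)
    if correct_rate <= low:
        return min(max_level, level + 1)
    return level


# Toy data: level 1 = easy, level 3 = hard (illustrative only).
docs = [
    ["cat", "sat", "mat"],
    ["dog", "ran", "fast"],
    ["epistemology", "paradigm", "heuristic"],
    ["ontology", "paradigm", "axiom"],
]
labels = [1, 1, 3, 3]
model = train_multinomial_nb(docs, labels)
initial = classify(model, ["paradigm", "axiom"])
updated = update_level(initial, correct_rate=0.9)
```

In this sketch the classifier supplies only the starting label, and the feedback rule moves a text one level at a time, which keeps a single unusual batch of answers from relabelling a text drastically. A real system would replace both parts with the methods described in the full article.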

Author information

Authors and Affiliations

First Research Institute of Ministry of Public Security, Beijing, People’s Republic of China
Lei La, Nan Wang & Dong-ping Zhou

Corresponding author

Correspondence to Lei La.

About this article

Cite this article

La, L., Wang, N. & Zhou, Dp. Improving reading comprehension step by step using Online-Boost text readability classification system. Neural Comput & Applic 26, 929–939 (2015). https://doi.org/10.1007/s00521-014-1770-2

Received: 28 June 2014
Accepted: 11 November 2014
Published: 19 November 2014
Issue Date: May 2015
DOI: https://doi.org/10.1007/s00521-014-1770-2
