Predicting Second Language Proficiency Level Using Linguistic Cognitive Task and Machine Learning Techniques

Wireless Personal Communications

Abstract

This paper proposes a novel method for predicting second language (L2) proficiency from linguistic cognitive ability, as measured by a linguistic cognitive response test. The method rests on the assumption that language aptitude test scores correlate with linguistic cognitive ability, and it takes the learner’s linguistic cognition aptitude data as input. Linguistic cognitive ability is measured through four types of linguistic cognition tasks: reading lexical decision tasks (LDT), listening LDT, translation recognition tasks, and semantic recognition tasks. Each type of task is related to a different linguistic function in the brain. After the learner’s linguistic cognitive aptitude is measured, the result is fed into a machine learning model, which predicts the corresponding language proficiency level. To train the proficiency classifier, we used a multi-layer perceptron, Naive Bayes, logistic regression, and a random forest. The input data set for our experiment came from 42 participants who took our cognitive aptitude tests. In the experiment, the method produced promising results: the classifier predicted proficiency level with an accuracy above 70%, and the random forest model produced the best predictive power among the models.
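
The pipeline described in the abstract (task scores in, proficiency level out) can be illustrated with a minimal sketch. It covers only one of the four model families named above, Naive Bayes, in its Gaussian form. Everything else is an assumption made for illustration and is not taken from the paper: the data are synthetic, each task is reduced to a single hypothetical score, proficiency is reduced to two classes, and leave-one-out cross-validation stands in for whatever evaluation protocol the authors used.

```python
import math
import random
from statistics import mean, pvariance

# The four task types named in the abstract; one hypothetical score per task.
TASKS = ["reading_ldt", "listening_ldt",
         "translation_recognition", "semantic_recognition"]


def make_synthetic_data(n=42, seed=0):
    """Generate synthetic scores for n hypothetical participants.

    Labels are 0 (lower proficiency) or 1 (higher proficiency).
    Assumption: higher-proficiency learners score slightly higher on each task.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = i % 2
        rows.append([rng.gauss(0.6 + 0.15 * label, 0.1) for _ in TASKS])
        labels.append(label)
    return rows, labels


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean and variance for each class."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        stats = []
        for j in range(len(TASKS)):
            column = [r[j] for r in members]
            # A small constant keeps the variance strictly positive.
            stats.append((mean(column), pvariance(column) + 1e-9))
        model[c] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one participant."""
    def log_posterior(c):
        prior, stats = model[c]
        total = math.log(prior)
        for x, (mu, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def leave_one_out_accuracy(rows, labels):
    """Train on all participants but one, test on the held-out one, and repeat."""
    hits = 0
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_gaussian_nb(train_x, train_y)
        hits += predict(model, rows[i]) == labels[i]
    return hits / len(rows)


rows, labels = make_synthetic_data()
accuracy = leave_one_out_accuracy(rows, labels)
print(f"Leave-one-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here reflects only how the synthetic data were generated and says nothing about the paper's results. With a sample as small as 42 participants, a held-out evaluation of this kind is one common way to avoid scoring a model on the data it was trained on.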

Acknowledgments

This research was supported by the ICT R&D program of MSIP/IITP. [2014, Development of distribution and diffusion service technology through individual and collective intelligence to digital contents]. This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIP) (No. NRF-2015R1A5A7037674).

Author information

Corresponding author

Correspondence to HeuiSeok Lim.

About this article

Cite this article

Yang, Y., Yu, W. & Lim, H. Predicting Second Language Proficiency Level Using Linguistic Cognitive Task and Machine Learning Techniques. Wireless Pers Commun 86, 271–285 (2016). https://doi.org/10.1007/s11277-015-3062-2

  • DOI: https://doi.org/10.1007/s11277-015-3062-2
