Abstract
Studies have shown that test-takers tend to use keyword-matching strategies when taking listening tests. Keyword-matching involves matching content words in the written modality (test items) against those heard in the audio text. However, no research has investigated the effect of such keywords in listening tests, or the impact of gazing upon these keywords on listening test scores. Thus, this study examined whether test-takers’ performance on a listening test can be explained by their gaze behaviors across three types of content words in the written modality: nouns, verbs, and adjectives. Using eye-tracking technology, this study measured the gaze behavior of 66 listening test-takers as they read content words in test item stems. Using a linear mixed-effects model, binary probit regression, and multinomial logistic regression, we found that test-takers’ performance was predicted by gaze behavioral measures on content words. Among the content words, fixating on nouns in written test items played the most significant role in predicting test performance, followed by adjectives and verbs. By shedding light on how test-takers attend to keywords in test items and on the relationship between keyword-matching and listening test performance, this study provides significant evidence for the overwhelming role of reading in listening tests. Implications for test score interpretation are discussed.
Data availability
The datasets generated during and/or analyzed during the current study are not publicly available as they are the property of [masked] University.
Abbreviations
- Adj: Adjective
- AOI: Areas of interest
- Avg: Average
- CAEL: Canadian Academic English Assessment
- CE: CAEL Computer Edition
- Dur: Duration
- Fix: Fixation
- GLM: Generalized linear model
- IELTS: International English Language Testing System
- LMM: Linear mixed effect model
- MCQ: Multiple-choice question
- MnSq: Mean square
- N: Noun
- PCAR: Principal component analysis of Rasch residuals
- Tot: Total
Acknowledgements
We would like to acknowledge the primary funding support from Paragon Testing Enterprises, Canada. The secondary funding support for this project came from Nanyang Technological University (Singapore) under the Undergraduate Research Experience on CAmpus (URECA) programme.
Ethics declarations
Conflict of interest
None
About this article
Cite this article
Kho, S.Q.E., Aryadoust, V. & Foo, S. An eye-tracking investigation of the keyword-matching strategy in listening assessment. Educ Inf Technol 28, 3739–3763 (2023). https://doi.org/10.1007/s10639-022-11322-y
DOI: https://doi.org/10.1007/s10639-022-11322-y