Abstract
Data-driven learning (DDL) has been shown to be an effective strategy for helping learners handle a range of writing-related problems. Several studies have compared the pedagogical effectiveness of DDL in English as a foreign language (EFL) writing. However, only a few studies have identified the key factors that may affect learning outcomes when designing DDL activities. To bridge this gap, the present study examined the medium-term effects of DDL activities in EFL writing. A pre-post quasi-experimental research design and semi-structured interviews were used to collect data from 64 Arab EFL undergraduate students. The DDL activities were carried out with the aid of BNCweb, and the findings were assessed by contrasting the efficiency of BNCweb with that of Sketch Engine, which EFL learners employ as a reference tool. The quantitative results showed that the experimental group, which used BNCweb, wrote more fluently and consistently in the posttest than the control group, which used Sketch Engine. However, no significant difference in writing complexity was detected between the groups. The qualitative results indicated that students had positive attitudes toward using BNCweb, despite the challenges of implementing corpora in the writing process. The study recommends integrating corpora with other types of reference sources as a viable way to overcome potential obstacles for EFL learners.
Data availability
The datasets generated during the current study are available from the corresponding author upon reasonable request.
Abbreviations
- AES: Automated Essay Scoring
- BNC: British National Corpus
- CQP: Corpus Query Processor
- C/T: Clauses/T-unit
- DDL: Data-Driven Learning
- EFL: English as a Foreign Language
- EFT/T: Error Free T-units/T-units
- ETS: Educational Testing Service
- E/W*100: Errors per 100 Words
- M: Mean no. of words per composition
- MI: Mutual Information
- POS: Part-of-Speech
- TTR: Type-Token Ratio
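Several of these abbreviations name simple ratio measures (TTR, E/W*100, C/T, EFT/T). The following is a minimal sketch of how such ratios are commonly computed, based only on the expansions listed here. The function names, the whitespace tokenization, and the example counts are illustrative assumptions, not the study's actual procedure or data.

```python
def type_token_ratio(tokens):
    """TTR: distinct word forms (types) divided by total words (tokens).

    Assumption: tokens are compared case-insensitively.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    types = {token.lower() for token in tokens}
    return len(types) / len(tokens)


def errors_per_100_words(error_count, word_count):
    """E/W*100: number of errors normalized to a 100-word span."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100 * error_count / word_count


def clauses_per_t_unit(clause_count, t_unit_count):
    """C/T: average number of clauses per T-unit."""
    if t_unit_count <= 0:
        raise ValueError("t_unit_count must be positive")
    return clause_count / t_unit_count


def error_free_t_unit_ratio(error_free_t_units, t_unit_count):
    """EFT/T: share of T-units that contain no errors."""
    if t_unit_count <= 0:
        raise ValueError("t_unit_count must be positive")
    return error_free_t_units / t_unit_count


# Hypothetical counts for one composition (not taken from the study).
sample_tokens = "the cat saw the dog".split()
print(type_token_ratio(sample_tokens))   # 4 types / 5 tokens
print(errors_per_100_words(6, 200))      # 6 errors in 200 words
print(clauses_per_t_unit(12, 8))         # 12 clauses over 8 T-units
print(error_free_t_unit_ratio(6, 8))     # 6 error-free T-units of 8
```

In practice the counts of errors, clauses, and T-units come from manual or tool-assisted annotation, so the arithmetic above is only the final step.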
Acknowledgements
The author is grateful to the participants who took part in the study.
Ethics declarations
Conflict of interests
The author declares no conflict of interest.
Ethical approval
Confidentiality and participants’ identities were protected by collecting data anonymously. All participants were assured that their responses would only be used for scientific research purposes.
Informed consent
All participants were asked to answer the questions voluntarily and without pressure, and they were free to withdraw if they wished.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Muftah, M. Data-driven learning (DDL) activities: do they truly promote EFL students’ writing skills development?. Educ Inf Technol 28, 13179–13205 (2023). https://doi.org/10.1007/s10639-023-11620-z
DOI: https://doi.org/10.1007/s10639-023-11620-z