Abstract
Information overload in MOOC discussion forums is a major problem that hinders course staff from facilitating learners effectively. To address this issue, supervised classification models have been developed to help course facilitators detect forum discussions that call for their intervention. A key issue studied in the literature is the transferability of these models to domains other than the one in which they were trained. Typically, these models employ domain-dependent features and therefore fail to transfer to other subject matters. In this study, we propose and evaluate an alternative way of building supervised models in this context: using the semantic similarities between the forum transcripts and corpora created dynamically from the MOOC environment as training features. Specifically, we analyze the case of two MOOCs, for which the models we built classify forum discussions into three categories: course logistics, content-related, and no action required. Furthermore, we evaluate the transferability of the derived models and interpret which features can be effectively transferred to unseen courses. The findings reveal the main benefits and trade-offs of the proposed approach and provide MOOC developers with insights into the main issues that inhibit the transferability of these models.
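To make the idea of similarity-based features concrete, the following is a minimal sketch and not the study's implementation. It assumes two tiny hypothetical corpora (`content` and `logistics`), made-up forum posts, and plain bag-of-words cosine similarity as a stand-in for whatever semantic similarity measure the study actually uses. Each post is mapped to one similarity score per corpus, so the feature space stays the same across courses and only the corpora change.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_features(post, corpora):
    """Return one similarity score per corpus, ordered by corpus name."""
    post_vec = Counter(tokenize(post))
    return [
        cosine(post_vec, Counter(tokenize(corpora[name])))
        for name in sorted(corpora)
    ]


# Hypothetical corpora; in practice these would be built from course material.
corpora = {
    "content": "python loop function variable list recursion syntax error",
    "logistics": "deadline certificate enrollment grade quiz submission schedule",
}

# Hypothetical forum posts, one per intended category.
posts = [
    "When is the deadline for the quiz submission?",
    "Why does my recursion function raise a syntax error?",
    "Hello everyone, greetings from Athens!",
]

# Each row is [similarity to content corpus, similarity to logistics corpus].
features = [similarity_features(p, corpora) for p in posts]
for post, row in zip(posts, features):
    print([round(x, 3) for x in row], post)
```

Feature vectors of this kind could then be passed to any supervised classifier. Because the features describe how close a post is to each corpus rather than which words it contains, a model trained on one course can in principle be applied to another once the corpora are rebuilt for that course.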
Change history
22 April 2021
A Correction to this paper has been published: https://doi.org/10.1007/s00521-021-05904-z
Acknowledgements
This research was performed in the frame of a collaboration between the University of Patras and the online platform mathesis.cup.gr. The supply of MOOC data by Mathesis is gratefully acknowledged. The doctoral scholarship “Strengthening Human Resources Research Potential via Doctorate Research – 2nd Cycle” (MIS-5000432), implemented by the State Scholarships Foundation (IKY), is also gratefully acknowledged. This research has also been partially funded by the Spanish State Research Agency (AEI) under project Grants TIN2014-53199-C3-2-R and TIN2017-85179-C3-2-R, the Regional Government of Castilla y León Grant VA082U16, and the EC Grant 588438-EPP-1-2017-1-EL-EPPKA2-KA.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original version of this article was revised due to open choice cancellation.
About this article
Cite this article
Ntourmas, A., Daskalaki, S., Dimitriadis, Y. et al. Classifying MOOC forum posts using corpora semantic similarities: a study on transferability across different courses. Neural Comput & Applic 35, 161–175 (2023). https://doi.org/10.1007/s00521-021-05750-z
DOI: https://doi.org/10.1007/s00521-021-05750-z