ABSTRACT
Supervised classification models are commonly used to classify discussions in MOOC forums. In most cases these models require a tedious process of manually labeling forum messages as training data, so new methods are needed to reduce the human effort involved in preparing such training datasets. In this study we follow an incremental approach to examine how soon after the beginning of a new course enough data has been collected to train a supervised classification model. We show that by employing features derived from a seeded topic modeling method, we obtain classifiers that perform reliably early in the course's life, thus significantly reducing the human effort. The content of the MOOC platform is used to bias the topic extraction towards discussions related to (a) course content, (b) logistics, or (c) social interactions. Then, at the start of each week, we develop a supervised model based on the topic features of all previous weeks and evaluate its performance in classifying the discussions of the rest of the course. Our approach was applied to three MOOCs of different subjects and sizes. The findings reveal that supervised models can perform reliably quite early in a MOOC's life and retain steady overall accuracy across the remaining weeks, without needing to be trained on the entire forum dataset.
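The weekly train-then-evaluate loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the seed words, the toy posts, and all function names are hypothetical, the seeded topic model is replaced by a plain seed-word share per category, and the classifier is a simple nearest-centroid rule chosen only to keep the example dependency-free.

```python
from collections import defaultdict

# Hypothetical seed words per category (assumption: the study derives its
# seeds from the MOOC platform content; these are made up for illustration).
SEEDS = {
    "content": {"regression", "variable", "model", "quiz"},
    "logistics": {"deadline", "certificate", "submit", "grade"},
    "social": {"hello", "hi", "thanks", "everyone"},
}
CATEGORIES = list(SEEDS)


def topic_features(text):
    """Share of tokens that hit each seed set (stand-in for seeded topic scores)."""
    tokens = text.lower().split()
    return [sum(t in SEEDS[c] for t in tokens) / max(len(tokens), 1)
            for c in CATEGORIES]


def fit_centroids(rows):
    """Nearest-centroid 'training': mean feature vector per label."""
    grouped = defaultdict(list)
    for feats, label in rows:
        grouped[label].append(feats)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}


def predict(centroids, feats):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(centroids[lab], feats)))


def incremental_accuracy(posts):
    """posts: list of (week, text, label).

    For each week w after the first, train on all posts from weeks < w and
    report accuracy on all posts from weeks >= w.
    """
    data = [(week, topic_features(text), label) for week, text, label in posts]
    weeks = sorted({week for week, _, _ in data})
    results = {}
    for w in weeks[1:]:
        train = [(f, lab) for wk, f, lab in data if wk < w]
        test = [(f, lab) for wk, f, lab in data if wk >= w]
        centroids = fit_centroids(train)
        hits = sum(predict(centroids, f) == lab for f, lab in test)
        results[w] = hits / len(test)
    return results


# Toy, hand-labeled posts (fabricated for the sketch, not from the study).
POSTS = [
    (1, "hello everyone thanks", "social"),
    (1, "when is the deadline to submit", "logistics"),
    (1, "how does the regression model work", "content"),
    (2, "hi everyone", "social"),
    (2, "is the certificate grade final", "logistics"),
    (3, "which variable enters the model", "content"),
]

if __name__ == "__main__":
    print(incremental_accuracy(POSTS))
```

Plotting the returned accuracy against the week number would show how early the classifier stabilizes, which is the question the incremental approach asks. With real data, the per-week accuracy on such a toy rule should not be expected to match the results reported in the study.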
Index Terms
- Classification of Discussions in MOOC Forums: An Incremental Modeling Approach