ABSTRACT
This paper presents an approach to summarizing educational videos using Transformer-based deep learning models. The approach targets educational content by summarizing the video captions and using the resulting text to summarize the videos. In tests on the EDUVSUM dataset, the approach improved upon the results reported in the original paper, achieving an accuracy of 26.53% on a multi-class problem, with a mean absolute error of 1.49 per video frame and 1.45 per video segment. Transformer techniques for automatic text summarization have proven effective in creating multimedia learning objects. The results suggest that these techniques can help produce high-quality digital educational resources more efficiently, reducing the time and effort required for their creation.
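As an illustration of the metrics reported above, the following minimal sketch shows one way accuracy and mean absolute error could be computed at the frame level and at the segment level. It assumes integer importance labels per frame and fixed-length segments scored by averaging their frames. The function names, the segment length, and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted class equals the annotated class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between annotated and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def segment_scores(frame_scores, frames_per_segment):
    """Average frame scores over consecutive fixed-length segments (an assumption)."""
    return [
        mean(frame_scores[i:i + frames_per_segment])
        for i in range(0, len(frame_scores), frames_per_segment)
    ]


# Hypothetical importance labels for six frames of one video.
true_frames = [3, 3, 5, 5, 8, 8]
pred_frames = [3, 4, 5, 7, 6, 8]

frame_acc = accuracy(true_frames, pred_frames)
frame_mae = mean_absolute_error(true_frames, pred_frames)

# Segment-level error: aggregate frames first, then compare segment scores.
segment_mae = mean_absolute_error(
    segment_scores(true_frames, 2),
    segment_scores(pred_frames, 2),
)

print(f"frame accuracy: {frame_acc:.2%}")
print(f"frame MAE: {frame_mae:.2f}")
print(f"segment MAE: {segment_mae:.2f}")
```

Accuracy counts only exact class matches, so it can stay low on a multi-class problem with many importance levels even when predictions are close to the annotations. The mean absolute error complements it by measuring how far off the predictions are on average.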