Summarization of Educational Videos with Transformers Networks

Authors:
Leandro Massetti Ribeiro Oliveira, TeleMídia@MA Lab / PPGCC, Universidade Federal do Maranhão, Brazil (ORCID: 0000-0002-0097-5161)
Li Chang Shuen, TeleMídia@MA Lab / DCCMAPI, Universidade Federal do Maranhão, Brazil (ORCID: 0000-0001-9192-6471)
Allan Kássio Beckman Soares da Cruz, TeleMídia@MA Lab / DCCMAPI, Universidade Federal do Maranhão, Brazil (ORCID: 0000-0002-2631-2032)
Carlos de Salles Soares, TeleMídia@MA Lab / DCCMAPI / PPGCC, Universidade Federal do Maranhão, Brazil (ORCID: 0000-0002-6800-1881)

WebMedia '23: Proceedings of the 29th Brazilian Symposium on Multimedia and the Web, October 2023, Pages 137–143
https://doi.org/10.1145/3617023.3617042

Published: 23 October 2023

ABSTRACT

This paper presents an approach to summarizing educational videos using deep learning Transformer models. The approach targets educational content: it summarizes the video captions and then uses the resulting text summary to summarize the video itself. In tests on the EDUVSUM dataset, the approach improved upon the results reported in the original paper, achieving an accuracy of 26.53% in a multi-class problem, with a mean absolute error of 1.49 per video frame and 1.45 per video segment. Transformer techniques for automatic text summarization have proven effective in creating multimedia learning objects. The results suggest that these techniques can produce high-quality digital educational resources more efficiently, reducing the time and effort required for their creation.
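The two kinds of metric named in the abstract, multi-class accuracy and mean absolute error at frame and segment level, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy scores, and the fixed-length segment averaging are all assumptions made for illustration, and the abstract does not say how segments are formed or scored.

```python
def accuracy(true_classes, predicted_classes):
    """Fraction of items whose predicted class equals the annotated class."""
    if len(true_classes) != len(predicted_classes) or not true_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_classes, predicted_classes))
    return hits / len(true_classes)


def mean_absolute_error(true_scores, predicted_scores):
    """Average absolute difference between annotated and predicted scores."""
    if len(true_scores) != len(predicted_scores) or not true_scores:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_scores, predicted_scores))
    return total / len(true_scores)


def segment_means(frame_scores, frames_per_segment):
    """Average per-frame scores over consecutive fixed-length segments.

    Assumption: segments are fixed-length windows of frames. The last
    segment may be shorter if the frame count is not a multiple.
    """
    if frames_per_segment <= 0:
        raise ValueError("frames_per_segment must be positive")
    means = []
    for start in range(0, len(frame_scores), frames_per_segment):
        chunk = frame_scores[start:start + frames_per_segment]
        means.append(sum(chunk) / len(chunk))
    return means


# Toy per-frame importance classes (hypothetical values, not EDUVSUM data).
annotated = [1, 1, 2, 2, 3, 3, 5, 5]
predicted = [1, 2, 2, 2, 4, 3, 3, 5]

frame_accuracy = accuracy(annotated, predicted)
frame_mae = mean_absolute_error(annotated, predicted)
segment_mae = mean_absolute_error(
    segment_means(annotated, 4), segment_means(predicted, 4)
)

print(frame_accuracy)  # 0.625
print(frame_mae)       # 0.5
print(segment_mae)     # 0.25
```

In this toy case the segment-level error is lower than the frame-level error because averaging within a segment cancels some per-frame deviations. The abstract likewise reports a slightly lower error per segment (1.45) than per frame (1.49), though it does not state that this is the cause.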


Index Terms

1. Applied computing
  1. Education
    1. E-learning
    2. Learning management systems
2. Computing methodologies
  1. Machine learning

Published in

WebMedia '23: Proceedings of the 29th Brazilian Symposium on Multimedia and the Web
October 2023
285 pages
ISBN: 9798400709081
DOI: 10.1145/3617023

Copyright © 2023 ACM
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Machine learning
e-learning
transformers
video summarization
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 270 of 873 submissions, 31%