Real Time Captioning and Notes Making of Online Classes

  • Conference paper
  • First Online:
Computational Intelligence in Data Science (ICCIDS 2022)

Abstract

Due to the COVID-19 pandemic, most activities have moved online, and people who are hard of hearing face great difficulty in continuing their education. The presented system supports them in attending online classes by providing real-time captions. Additionally, it generates summarized notes that all students can review before the next class. The Google Speech-to-Text API is used to convert speech to text for the real-time captions. Three text summarization models were explored: BART, a Seq2Seq model, and the TextRank algorithm. BART and the Seq2Seq model require a labelled dataset for training, whereas TextRank is an unsupervised algorithm; for BART, the dataset was built using semi-supervised methods. We evaluated all three models with ROUGE metrics, and BART proved best on our dataset, with scores of 0.47, 0.30 and 0.48 for ROUGE-1, ROUGE-2 and ROUGE-L respectively.
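
The page carries no code, so the two sketches below are illustrative only. First, a minimal sketch of the captioning stage, assuming the google-cloud-speech Python client and a short 16 kHz WAV chunk; the file name and settings are placeholders, and a real-time deployment would feed a microphone stream to the same API's streaming variant rather than a file.

    # Minimal sketch (not the authors' code): transcribe one audio chunk with
    # the Google Cloud Speech-to-Text API. Live captions would instead use
    # client.streaming_recognize with an equivalent configuration.
    from google.cloud import speech

    client = speech.SpeechClient()

    with open("lecture_chunk.wav", "rb") as f:   # placeholder file name
        audio = speech.RecognitionAudio(content=f.read())

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,                 # assumed sample rate
        language_code="en-US",
    )

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)  # caption text

Second, a minimal sketch of the note-making stage, assuming the Hugging Face transformers and rouge-score packages: summarize the accumulated transcript with a pretrained BART checkpoint and score it with ROUGE-1/2/L, the metrics reported above. The checkpoint name, generation settings and texts are placeholders; the paper fine-tunes BART on its own semi-supervised lecture dataset rather than using an off-the-shelf checkpoint.

    # Minimal sketch (not the authors' code): BART summarization plus
    # ROUGE-1/2/L scoring of the generated notes against a reference.
    from transformers import BartForConditionalGeneration, BartTokenizer
    from rouge_score import rouge_scorer

    model_name = "facebook/bart-large-cnn"       # placeholder checkpoint
    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name)

    transcript = "..."   # transcript produced by the captioning stage
    reference = "..."    # human-written reference notes for evaluation

    inputs = tokenizer(transcript, max_length=1024, truncation=True,
                       return_tensors="pt")
    summary_ids = model.generate(inputs["input_ids"], num_beams=4,
                                 max_length=150, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                      use_stemmer=True)
    print(scorer.score(reference, summary))      # ROUGE-1/2/L F-scores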

Author information

Corresponding author

Correspondence to Thenmozhi Durairaj.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Vasantha Raman, A., Sanjay Thiruvengadam, V., Santhosh, J., Durairaj, T. (2022). Real Time Captioning and Notes Making of Online Classes. In: Kalinathan, L., R., P., Kanmani, M., S., M. (eds) Computational Intelligence in Data Science. ICCIDS 2022. IFIP Advances in Information and Communication Technology, vol 654. Springer, Cham. https://doi.org/10.1007/978-3-031-16364-7_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-16364-7_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16363-0

  • Online ISBN: 978-3-031-16364-7

  • eBook Packages: Computer Science, Computer Science (R0)
