Abstract
Textbooks play a vital role in any educational system; despite their clarity and rich information, students often turn to community question answering (cQA) forums to acquire additional knowledge. Because of the high data volume, the quality of question-answer (QA) pairs on cQA forums varies greatly, so considerable extra effort is needed to sift through all candidate QA pairs for useful insight. This paper proposes a “sentence-level text enrichment system” in which a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) summarizer understands the given text, picks out the important sentences, and then rearranges them to give an overall summary of the text document. For each important sentence, we recommend the relevant QA pairs from cQA to make learning more effective. In this work, we fine-tune the pre-trained BERT model to extract the QA sets that are most relevant for enriching important sentences of the textbook. We observe that fine-tuning the BERT model significantly improves performance on QA selection and that it outperforms existing RNN-based models for such tasks. We also investigate the effectiveness of our fine-tuned BERT\(_\mathrm{Large}\) model on three cQA datasets for the QA selection task and observe a maximum improvement of 19.72% over previous models. Experiments were carried out on NCERT (Grade IX and X) textbooks from India and the “Pattern Recognition and Machine Learning” textbook. Extensive evaluation demonstrates that the proposed model offers more precise and relevant recommendations than state-of-the-art methods.
References
Agrawal, R., Gollapudi, S., Kenthapadi, K., Srivastava, N., Velu, R.: Enriching textbooks through data mining. In: Proceedings of the First ACM Symposium on Computing for Development, ACM DEV 2010, pp. 19:1–19:9 (2010)
Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, Heidelberg (2006)
Chen, Q., Hu, Q., Huang, J.X., He, L.: CAN: enhancing sentence similarity modeling with collaborative and adversarial network. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, pp. 815–824. New York, NY, USA (2018)
Chen, Q., Hu, Q., Huang, X., He, L.: CA-RNN: using context-aligned recurrent neural networks for modeling sentence similarity. In: AAAI (2018)
Deng, Y., et al.: Multi-task learning with multi-view attention for answer selection and knowledge base question answering. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33(01), pp. 6318–6325, July 2019
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), vol. 1, pp. 4171–4186, Minneapolis, Minnesota, June 2019
Herrera, J., Poblete, B., Parra, D.: Learning to leverage microblog information for QA retrieval. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds.) ECIR 2018. LNCS, vol. 10772, pp. 507–520. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76941-7_38
Kumar, S., Chauhan, A.: Enriching textbooks by question-answers using CQA. In: IEEE Region 10 Conference (TENCON), pp. 707–714 (2019)
Kumar, S., Chauhan, A.: Making kids learning joyful using artistic style transferred YouTube VCs. In: IEEE Region 10 Conference (TENCON), pp. 1106–1111 (2020)
Kumar, S., Chauhan, A.: Recommending question-answers for enriching textbooks. In: Bellatreche, L., Goyal, V., Fujita, H., Mondal, A., Reddy, P.K. (eds.) BDA 2020. LNCS, vol. 12581, pp. 308–328. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66665-1_20
Laskar, Md.T.R., Hoque, E., Huang, J.X.: Utilizing bidirectional encoder representations from transformers for answer selection (2020)
Laskar, Md.T.R., Huang, J.X., Hoque, E.: Contextualized embeddings based transformer encoder for sentence similarity modeling in answer selection task. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 5505–5514, May 2020
Macina, J., Srba, I., Williams, J.J., Bielikova, M.: Educational question routing in online student communities. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, RecSys ’17, pp. 47–55 (2017)
Mikolov, T., Chen, K., Corrado, G.S., Dean, J.: Efficient estimation of word representations in vector space (2013)
Nakov, P., et al.: SemEval-2017 task 3: community question answering. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 27–48, Vancouver, Canada, August 2017
Patro, B.N., Kurmi, V.K., Kumar, S., Namboodiri, V.P.: Learning semantic sentence embeddings using pair-wise discriminator. CoRR, abs/1806.00807 (2018)
Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on EMNLP, pp. 1532–1543, October 2014
Peters, M., et al.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237, New Orleans, Louisiana, June 2018
Rao, J., Liu, L., Tay, Y., Yang, W., Shi, P., Lin, J.: Bridging the gap between relevance matching and semantic matching for short text similarity modeling. In: Proceedings of EMNLP-IJCNLP, pp. 5370–5381, November 2019
Sha, L., Zhang, X., Qian, F., Chang, B., Sui, Z.: A multi-view fusion neural network for answer selection. In: AAAI (2018)
Tay, Y., Luu, A.T., Hui, S.: Hyperbolic representation learning for fast and efficient neural question answering. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (2018)
Tay, Y., Phan, M.C., Tuan, L.A., Hui, S.C.: Learning to rank question answer pairs with holographic dual LSTM architecture. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017, pp. 695–704 (2017)
Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 5998–6008. Curran Associates Inc. (2017)
Weissenborn, D., Wiese, G., Seiffe, L.: FastQA: a simple and efficient neural architecture for question answering. ArXiv, abs/1703.04816 (2017)
Wolf, T., et al.: Huggingface’s transformers: state-of-the-art natural language processing. CoRR, abs/1910.03771 (2019)
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Kumar, S., Chauhan, A. (2021). A Finetuned Language Model for Recommending cQA-QAs for Enriching Textbooks. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science(), vol 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_34
DOI: https://doi.org/10.1007/978-3-030-75765-6_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75764-9
Online ISBN: 978-3-030-75765-6