
A Finetuned Language Model for Recommending cQA-QAs for Enriching Textbooks

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Abstract

Textbooks play a vital role in any educational system; yet despite their clarity and coverage, students often turn to community question-answering (cQA) forums to acquire more knowledge. Owing to the high data volume, the quality of question-answer (QA) pairs in cQA forums varies greatly, so considerable effort is needed to sift through all candidate QA pairs for better insight. This paper proposes a “sentence-level text enrichment system” in which a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) summarizer understands the given text, picks out the important sentences, and rearranges them to produce an overall summary of the document. For each important sentence, we recommend the relevant QA pairs from cQA to make learning more effective. In this work, we fine-tune the pre-trained BERT model to extract the QA sets that are most relevant for enriching important sentences of the textbook. We observe that fine-tuning the BERT model significantly improves performance on QA selection, and find that it outperforms existing RNN-based models on such tasks. We also investigate the effectiveness of our fine-tuned BERT\(_\mathrm{Large}\) model on three cQA datasets for the QA selection task and observe a maximum improvement of 19.72% over previous models. Experiments were carried out on NCERT (Grade IX and X) textbooks from India and the “Pattern Recognition and Machine Learning” textbook. Extensive evaluation demonstrates that the proposed model offers more precise and relevant recommendations than state-of-the-art methods.
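The recommendation step the abstract describes — matching each important textbook sentence against a pool of cQA question-answer pairs and returning the most relevant ones — can be sketched as a similarity ranking. The snippet below is a minimal illustration, not the paper's method: a toy bag-of-words embedding stands in where the paper fine-tunes BERT, and the sentence and QA pool are made-up examples.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector. In the paper this role is played by a
    fine-tuned BERT encoder; word counts stand in here so the sketch
    runs without any model weights."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(sentence, qa_pairs, top_k=2):
    """Rank cQA (question, answer) pairs by similarity to one important
    textbook sentence and return the top_k most relevant pairs."""
    sent_vec = embed(sentence)
    ranked = sorted(qa_pairs,
                    key=lambda qa: cosine(sent_vec, embed(qa[0] + " " + qa[1])),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical textbook sentence and cQA pool, for illustration only.
sentence = "Gradient descent minimises the loss function iteratively."
qa_pairs = [
    ("What is gradient descent?",
     "An iterative optimisation method that follows the negative gradient."),
    ("Who wrote Hamlet?", "William Shakespeare."),
    ("Why does the loss decrease?",
     "Each gradient descent step moves the parameters downhill on the loss."),
]
for question, _ in recommend(sentence, qa_pairs):
    print(question)
```

With a contextual encoder in place of the word-count vectors, the same ranking loop would capture paraphrases that share no surface vocabulary with the sentence, which is the gap the fine-tuned BERT model is meant to close.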




Copyright information

© 2021 Springer Nature Switzerland AG


Cite this paper

Kumar, S., Chauhan, A. (2021). A Finetuned Language Model for Recommending cQA-QAs for Enriching Textbooks. In: Karlapalem, K., et al. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol. 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_34


  • DOI: https://doi.org/10.1007/978-3-030-75765-6_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75764-9

  • Online ISBN: 978-3-030-75765-6

  • eBook Packages: Computer Science, Computer Science (R0)
