
LTWNN: A Novel Approach Using Sentence Embeddings for Extracting Diverse Concepts in MOOCs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13151)

Abstract

As a global online education platform, Massive Open Online Courses (MOOCs) provide high-quality learning content. Identifying the key concepts of a course for students with different backgrounds, however, remains challenging. Although much work has been done on course concept extraction in MOOCs, existing approaches use external knowledge only to measure the relatedness between pairs of candidate concepts. Moreover, they require multi-document input and rely heavily on seed sets, so they perform poorly when the input is a single document. To address these drawbacks, we tackle concept extraction from a single document with LTWNN, a novel Learning-to-Weight Neural Network for course concept extraction in MOOCs. LTWNN makes full use of external knowledge by relating each candidate concept to the document through an embedding-based maximal marginal relevance (MMR) score, which explicitly increases diversity among the selected concepts. In addition, it combines internal statistical information with external knowledge, and the neural network automatically learns to weight the two. Experiments on different course corpora show that our method outperforms alternative methods.
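The embedding-based MMR step described in the abstract can be sketched as a greedy selection that trades off each candidate concept's relevance to the document against its redundancy with already-selected concepts. The sketch below is illustrative only (function names, the trade-off parameter `lam`, and the toy vectors are assumptions, not the authors' implementation, which additionally learns weights with a neural network):

```python
import math


def cosine(u, v):
    # Cosine similarity between two dense embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def mmr_select(doc_vec, cand_vecs, k, lam=0.5):
    """Greedy embedding-based MMR: pick k candidate indices that are
    relevant to the document while penalizing redundancy with concepts
    already selected (higher lam favors relevance over diversity)."""
    selected = []
    remaining = list(range(len(cand_vecs)))
    while remaining and len(selected) < k:
        best, best_score = None, -float("inf")
        for i in remaining:
            relevance = cosine(cand_vecs[i], doc_vec)
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example: the second candidate duplicates the first, so with a
# diversity-leaning lam the selector skips it for the distinct one.
doc = [1.0, 0.0]
cands = [[1.0, 0.0], [1.0, 0.0], [0.8, 0.6]]
print(mmr_select(doc, cands, k=2, lam=0.3))  # → [0, 2]
```

With sentence embeddings for the document and candidate phrases in place of the toy vectors, this greedy loop yields a concept set that covers the document while avoiding near-duplicate concepts.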

Supported by organization x.


Notes

  1. The source dataset is released at http://moocdata.cn/data/concept-extraction.
  2. https://github.com/boudinfl/pke.
  3. https://github.com/thukg/concept-expansion-snippet.


Acknowledgment

This work was supported by the National Natural Science Foundation of China (No. 62077015).

Author information


Corresponding author

Correspondence to Jia Zhu.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, Z., Zhu, J., Xu, S., Yan, Z., Liang, W. (2022). LTWNN: A Novel Approach Using Sentence Embeddings for Extracting Diverse Concepts in MOOCs. In: Long, G., Yu, X., Wang, S. (eds) AI 2021: Advances in Artificial Intelligence. AI 2022. Lecture Notes in Computer Science (LNAI), vol 13151. Springer, Cham. https://doi.org/10.1007/978-3-030-97546-3_62


  • DOI: https://doi.org/10.1007/978-3-030-97546-3_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-97545-6

  • Online ISBN: 978-3-030-97546-3

  • eBook Packages: Computer Science (R0)
