Methods for Assessing Theme Adherence in Student Thesis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11697)

Abstract

In this paper we study approaches to assessing the quality of student theses in pedagogy. We consider a specific subtask of thesis scoring: estimating how well a thesis adheres to its theme. For each thesis we form a special document, the theme header, comprising the theme, aim, object, and tasks of the thesis. Theme adherence is calculated as the similarity between the theme header and segments of the thesis. For evaluation, we order the theses by increasing calculated theme adherence and compare the ordering with expert grades using the NDCG measure. We explore different methods, including probabilistic topic modeling, word embeddings, and ontologies. The best configuration for thesis ranking combines a weighted average of word embeddings (word2vec) with keywords extracted from the theme header.
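
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, the per-token weights, the use of the mean over segments to aggregate similarities, and the linear-gain form of NDCG are all assumptions made for illustration. In practice the vectors would come from a trained word2vec model, and the mapping from expert grades to NDCG gains would follow the paper's own evaluation setup.

```python
import math


def weighted_average(tokens, vectors, weights):
    """Weighted average of the vectors of known tokens.

    `weights` maps a token to its weight (default 1.0); giving header
    keywords a larger weight is a hypothetical choice for this sketch.
    Returns None when no token has a vector.
    """
    total = None
    weight_sum = 0.0
    for tok in tokens:
        vec = vectors.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary tokens
        w = weights.get(tok, 1.0)
        if total is None:
            total = [0.0] * len(vec)
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if total is None or weight_sum == 0.0:
        return None
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def theme_adherence(header_tokens, segments, vectors, weights):
    """Mean cosine similarity between the theme header and each segment.

    Averaging over segments is an assumption; other aggregations
    (minimum, median) would fit the same description.
    """
    header_vec = weighted_average(header_tokens, vectors, weights)
    if header_vec is None:
        return 0.0
    sims = []
    for segment in segments:
        seg_vec = weighted_average(segment, vectors, weights)
        if seg_vec is not None:
            sims.append(cosine(header_vec, seg_vec))
    return sum(sims) / len(sims) if sims else 0.0


def ndcg(gains_in_ranked_order):
    """NDCG with linear gain and log2 position discount.

    The caller supplies gains already ordered by the system ranking;
    how expert grades are turned into gains is left to the caller.
    """
    def dcg(gains):
        return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))

    ideal = dcg(sorted(gains_in_ranked_order, reverse=True))
    return dcg(gains_in_ranked_order) / ideal if ideal > 0 else 0.0
```

With these helpers, each thesis receives one adherence score, the theses are sorted by that score, and the expert-derived gains are passed to `ndcg` in the resulting order; a value of 1.0 means the ordering matches the ideal ordering of the gains.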


Notes

  1. http://window.edu.ru/catalog/pdf2txt/200/58200/28147.

  2. https://radimrehurek.com/gensim/index.html.

Acknowledgements

This article contains the results of the project “Developing new methods for analyzing large text data using linguistic and ontological resources, machine learning methods and neural networks”, carried out as part of the Competence Center program of the National Technology Initiative “Center for Big Data Storage and Analysis”, supported by the Ministry of Science and Higher Education of the Russian Federation under the Contract of Moscow State University with the Fund for Support of Projects of the National Technology Initiative No. 13/1251/2018 dated December 11, 2018.

Author information

Corresponding author

Correspondence to Mikhail Tikhomirov.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tikhomirov, M., Loukachevitch, N., Dobrov, B. (2019). Methods for Assessing Theme Adherence in Student Thesis. In: Ekštein, K. (ed.) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_6

  • DOI: https://doi.org/10.1007/978-3-030-27947-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27946-2

  • Online ISBN: 978-3-030-27947-9

  • eBook Packages: Computer Science, Computer Science (R0)
