Abstract
In this paper we study approaches to assessing the quality of student theses in pedagogics. We consider one subtask of thesis scoring: estimating how closely a thesis adheres to its theme. A special document, the theme header, is formed from the theme, aim, object, and tasks of the thesis. Theme adherence is calculated as the similarity between the theme header and segments of the thesis. For evaluation, we order theses by increasing value of the calculated theme adherence and compare the ordering with expert grades using the NDCG measure. We explore different methods, including probabilistic topic modeling, word embeddings, and ontologies. The best configuration for ranking theses is based on a weighted average of word embeddings (word2vec) and keywords extracted from the theme header.
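As a rough illustration of the pipeline summarized above, the following minimal Python sketch scores a thesis by the cosine similarity between a weighted average of word vectors for the theme header and for each thesis segment, and then scores a ranking with NDCG. It is not the authors' implementation. Several details are assumptions made only for this example: the toy three-dimensional vectors stand in for a trained word2vec model, header keywords are emphasized by giving them larger weights, per-segment similarities are aggregated by their mean, the DCG discount is the common log2(rank + 1) form, and theses are ranked with the highest score first. All function names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = List[float]


def weighted_average_embedding(
    tokens: Sequence[str],
    embeddings: Dict[str, Vector],
    weights: Optional[Dict[str, float]] = None,
) -> Vector:
    """Weighted mean of the vectors of in-vocabulary tokens (default weight 1.0)."""
    weights = weights or {}
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        vector = embeddings.get(token)
        if vector is None:
            continue  # out-of-vocabulary tokens are skipped
        w = weights.get(token, 1.0)
        weight_sum += w
        for k in range(dim):
            total[k] += w * vector[k]
    if weight_sum == 0.0:
        return total  # no known tokens: zero vector
    return [x / weight_sum for x in total]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def theme_adherence(
    header_tokens: Sequence[str],
    segments: Sequence[Sequence[str]],
    embeddings: Dict[str, Vector],
    keyword_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Mean header-to-segment cosine similarity (the mean is an assumption)."""
    if not segments:
        return 0.0
    header_vec = weighted_average_embedding(header_tokens, embeddings, keyword_weights)
    sims = [
        cosine(header_vec, weighted_average_embedding(segment, embeddings))
        for segment in segments
    ]
    return sum(sims) / len(sims)


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount, rank from 1."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains_in_ranked_order: Sequence[float]) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(gains_in_ranked_order, reverse=True))
    return dcg(gains_in_ranked_order) / ideal if ideal > 0.0 else 0.0


# Toy data, invented purely for illustration.
toy_vectors = {
    "teaching": [0.9, 0.1, 0.0],
    "methods": [0.8, 0.2, 0.1],
    "school": [0.7, 0.3, 0.0],
    "football": [0.0, 0.1, 0.9],
}
header = ["teaching", "methods"]
keyword_weights = {"teaching": 2.0}  # hypothetical keyword emphasis
theses = {
    "thesis_a": [["teaching", "school"], ["methods"]],
    "thesis_b": [["football"], ["football", "school"]],
}
expert_grades = {"thesis_a": 5.0, "thesis_b": 3.0}

scores = {
    name: theme_adherence(header, segments, toy_vectors, keyword_weights)
    for name, segments in theses.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)  # assumption: best first
print(ranking, ndcg([expert_grades[name] for name in ranking]))
```

With a real model, the `toy_vectors` dictionary would be replaced by lookups into trained word2vec vectors, and the gain passed to `ndcg` would have to match whichever ranking direction and grade scale the evaluation actually uses.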
Acknowledgements
This article contains the results of the project “Developing new methods for analyzing large text data using linguistic and ontological resources, machine learning methods and neural networks”, carried out as part of the Competence Center program of the National Technology Initiative “Center for Big Data Storage and Analysis”, supported by the Ministry of Science and Higher Education of the Russian Federation under the Contract of Moscow State University with the Fund for Support of Projects of the National Technology Initiative No. 13/1251/2018 dated December 11, 2018.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tikhomirov, M., Loukachevitch, N., Dobrov, B. (2019). Methods for Assessing Theme Adherence in Student Thesis. In: Ekštein, K. (ed.) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_6
DOI: https://doi.org/10.1007/978-3-030-27947-9_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27946-2
Online ISBN: 978-3-030-27947-9
eBook Packages: Computer Science, Computer Science (R0)