Methods for Assessing Theme Adherence in Student Thesis

Tikhomirov, Mikhail; Loukachevitch, Natalia; Dobrov, Boris

doi:10.1007/978-3-030-27947-9_6

Conference paper
First Online: 06 August 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11697)

Abstract

In this paper we study approaches to assessing the quality of student theses in pedagogics. We consider one subtask of thesis scoring: estimating how closely a thesis adheres to its theme. To do so, we form a special document, the theme header, comprising the theme, aim, object, and tasks of the thesis. Theme adherence is calculated as the similarity between the theme header and thesis segments. For evaluation, we order the theses by increasing value of the calculated theme adherence and compare the ordering with expert grades using the NDCG measure. We explore different methods, including probabilistic topic modeling, word embeddings, and ontologies. The best configuration for ranking theses is based on a weighted average of word embeddings (word2vec) and keywords extracted from the theme header.
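
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy two-dimensional vectors, the keyword weights, the use of cosine similarity, the averaging of segment similarities into one score, and the NDCG variant (linear gain, log2 discount) are all assumptions made for illustration. The function names `weighted_average`, `theme_adherence`, and `ndcg` are hypothetical.

```python
import math


def weighted_average(tokens, vectors, weights):
    """Weighted average of the vectors of known tokens (None if none are known)."""
    total, weight_sum = None, 0.0
    for tok in tokens:
        if tok not in vectors:
            continue  # skip out-of-vocabulary tokens
        w = weights.get(tok, 1.0)  # assumption: non-keywords get weight 1.0
        vec = vectors[tok]
        if total is None:
            total = [0.0] * len(vec)
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if total is None or weight_sum == 0.0:
        return None
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def theme_adherence(header_tokens, segments, vectors, weights):
    """Mean similarity between the theme header and each thesis segment.

    Averaging over segments is an assumption; the abstract does not say
    how segment similarities are aggregated.
    """
    header_vec = weighted_average(header_tokens, vectors, weights)
    if header_vec is None:
        return 0.0
    sims = []
    for seg in segments:
        seg_vec = weighted_average(seg, vectors, weights)
        if seg_vec is not None:
            sims.append(cosine(header_vec, seg_vec))
    return sum(sims) / len(sims) if sims else 0.0


def ndcg(grades_in_ranked_order):
    """NDCG of expert grades listed in the order produced by the ranking."""
    def dcg(grades):
        return sum(g / math.log2(i + 2) for i, g in enumerate(grades))

    ideal = dcg(sorted(grades_in_ranked_order, reverse=True))
    return dcg(grades_in_ranked_order) / ideal if ideal > 0 else 0.0


# Toy data (hypothetical): 2-d "embeddings" and one weighted keyword.
vectors = {"teaching": [1.0, 0.0], "pupils": [0.8, 0.6], "football": [0.0, 1.0]}
weights = {"teaching": 2.0}
header = ["teaching", "pupils"]
on_topic = [["teaching", "pupils"], ["pupils"]]
off_topic = [["football"], ["football", "pupils"]]

score_on = theme_adherence(header, on_topic, vectors, weights)
score_off = theme_adherence(header, off_topic, vectors, weights)
```

In this sketch a real word2vec model would replace the `vectors` dictionary, and the list passed to `ndcg` would hold the expert grades of the theses in whatever order the adherence scores induce. The ordering direction and the gain function used in the paper may differ from the ones assumed here.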

Acknowledgements

This article contains the results of the project “Developing new methods for analyzing large text data using linguistic and ontological resources, machine learning methods and neural networks”, carried out as part of the Competence Center program of the National Technology Initiative “Center for Big Data Storage and Analysis”, supported by the Ministry of Science and Higher Education of the Russian Federation under the Contract of Moscow State University with the Fund for Support of Projects of the National Technology Initiative No. 13/1251/2018 dated December 11, 2018.

Author information

Authors and Affiliations

Lomonosov Moscow State University, Moscow, Russia
Mikhail Tikhomirov, Natalia Loukachevitch & Boris Dobrov

Corresponding author

Correspondence to Mikhail Tikhomirov.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Kamil Ekštein

About this paper

Cite this paper

Tikhomirov, M., Loukachevitch, N., Dobrov, B. (2019). Methods for Assessing Theme Adherence in Student Thesis. In: Ekštein, K. (ed.) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_6

DOI: https://doi.org/10.1007/978-3-030-27947-9_6
Published: 06 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27946-2
Online ISBN: 978-3-030-27947-9
eBook Packages: Computer Science, Computer Science (R0)