Auto Code Comment Assessment for Online Judge using Word Embedding and Word Mover's Distance

Authors:

Rosa Ariani Sukamto,

Muhammad Nabillah Fihira Rischa,

Rani Megasari

IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications

Pages 345 - 349

https://doi.org/10.1145/3575882.3575949

Published: 27 February 2023

Abstract

Comments in source code are a form of inline documentation that programmers write to help others understand what a program does. Students in basic programming courses need to learn how to write better code comments, but assessing those comments can be difficult for lecturers. Therefore, the authors propose an automatic source code comment assessment method for an online judge system, based on a corpus-based text similarity approach. Word2vec, GloVe, and fastText models are used to train word vectors on the Indonesian Wikipedia dump. Similarity is measured using Word Mover's Distance (WMD). Experiments were carried out with varying numbers of training epochs. The models are compared by Spearman's rho correlation coefficient, mean absolute error (MAE), and performance measurements. The proposed word embedding approach does not yet produce good results.
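The pipeline outlined in the abstract (word vectors, a WMD-style distance between two comments, then Spearman's rho and MAE against human scores) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that are not in the paper: the two-dimensional `TOY_VECTORS` are hand-written placeholders rather than trained Word2vec, GloVe, or fastText embeddings; all function names are hypothetical; and the distance computed is the relaxed WMD lower bound (each word moves all of its weight to its nearest counterpart) instead of the exact transport problem, so that the example runs with the standard library only.

```python
import math

# Hand-written 2-d vectors, purely illustrative (NOT trained embeddings).
TOY_VECTORS = {
    "loop": (1.0, 0.0),
    "iterate": (0.9, 0.1),
    "array": (0.0, 1.0),
    "list": (0.1, 0.9),
}

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nbow(tokens):
    """Normalized bag-of-words weights that sum to 1."""
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return {t: c / len(tokens) for t, c in counts.items()}

def one_sided_cost(src, dst, vectors):
    """Cost when every source word sends all its weight to its nearest target word."""
    return sum(
        w * min(euclidean(vectors[s], vectors[d]) for d in dst)
        for s, w in src.items()
    )

def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Relaxed WMD: a lower bound on the exact Word Mover's Distance."""
    a = nbow([t for t in tokens_a if t in vectors])  # skip out-of-vocabulary words
    b = nbow([t for t in tokens_b if t in vectors])
    if not a or not b:
        return float("inf")
    return max(one_sided_cost(a, b, vectors), one_sided_cost(b, a, vectors))

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_rho(x, y):
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def mean_absolute_error(x, y):
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

if __name__ == "__main__":
    reference = ["iterate", "list"]   # hypothetical reference comment tokens
    student = ["loop", "array"]       # hypothetical student comment tokens
    print(relaxed_wmd(student, reference, TOY_VECTORS))
```

In a fuller setup, the distances for a set of student comments would be mapped to scores and passed, together with the lecturer's scores, to `spearman_rho` and `mean_absolute_error`. How the paper maps distances to scores is not stated in the abstract, so that step is left out here.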

Information

Published In

IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications

November 2022

415 pages

ISBN: 9781450397902

DOI: 10.1145/3575882

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

IC3INA 2022

IC3INA 2022: The 2022 International Conference on Computer, Control, Informatics and Its Applications

November 22 - 23, 2022

Virtual Event, Indonesia
