Abstract
Plagiarism is a growing problem because so many resources are easily accessible, and many papers address this topic. New algorithms are constantly being created, but there are currently not many systems that can be used for plagiarism detection. Our aim is to explore plagiarism on a large scale.
This paper focuses on selecting an appropriate representation of source code, which is very important when searching for plagiarism. We give an overview of the current representation options and focus on representing source code using an abstract syntax tree (AST). Comparing tree structures is a time-consuming operation, so we look for a way to represent the AST effectively in order to facilitate comparison. We consider two ways to represent an AST: by hashing or by characteristic vectors. We present an experiment and the results on which we base our choice of the appropriate form of representation.
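To make the two representations concrete, the following is a minimal sketch in Python, assuming the standard-library `ast` module as the parser. It is an illustration only and not the method evaluated in the paper: the function names `characteristic_vector`, `subtree_hash`, and `cosine_similarity` are hypothetical, the choice of node-type counts as vector features is an assumption, and SHA-1 is used merely as a convenient fingerprint.

```python
import ast
import hashlib
from collections import Counter


def characteristic_vector(source: str) -> Counter:
    """Count how often each AST node type occurs in the source.

    Assumption: node-type counts serve as the vector features.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def subtree_hash(node: ast.AST) -> str:
    """Hash a subtree from its node type and the hashes of its children.

    Identifier names and literal values are not part of the hash,
    so renaming a variable leaves the result unchanged.
    """
    child_hashes = "".join(
        subtree_hash(child) for child in ast.iter_child_nodes(node)
    )
    payload = f"{type(node).__name__}({child_hashes})"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sum(value * value for value in a.values()) ** 0.5
    norm_b = sum(value * value for value in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In this sketch, two fragments that differ only in identifier names produce the same root hash, so an exact-match lookup would find them, while the vector form supports approximate matching through a similarity score such as `cosine_similarity(characteristic_vector(a), characteristic_vector(b))`.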
Acknowledgement
This contribution/publication is the result of the project implementation:
Centre of excellence for systems and services of intelligent transport II, ITMS 26220120050 supported by the Research & Development Operational Programme funded by the ERDF.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Ďuračík, M., Kršák, E., Hrkút, P. (2018). Source Code Representations for Plagiarism Detection. In: Uden, L., Liberona, D., Ristvej, J. (eds) Learning Technology for Education Challenges. LTEC 2018. Communications in Computer and Information Science, vol 870. Springer, Cham. https://doi.org/10.1007/978-3-319-95522-3_6
DOI: https://doi.org/10.1007/978-3-319-95522-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95521-6
Online ISBN: 978-3-319-95522-3
eBook Packages: Computer Science, Computer Science (R0)