Source Code Representations for Plagiarism Detection

  • Conference paper
Learning Technology for Education Challenges (LTEC 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 870)

Abstract

Plagiarism is a growing problem, driven by the abundance of easily accessible resources, and many papers address the topic. New algorithms are constantly being developed, but few systems are currently available for practical plagiarism detection. Our aim is to explore plagiarism on a large scale.

This paper focuses on selecting an appropriate representation of source code, which is very important when searching for plagiarism. We give an overview of the current representation options and focus on representing source code with an abstract syntax tree (AST). Comparing tree structures is a time-consuming operation, so we look for a way to represent the AST that makes comparison easier. We consider two ways to represent an AST: by hashing or by characteristic vectors. We present an experiment and its results, on the basis of which we choose the appropriate form of representation.
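
To make the two AST representations concrete, the following is a minimal sketch and not the authors' implementation. It assumes Python's standard `ast` module as the parser, hashes only node types (ignoring identifiers and literal values), and builds a characteristic vector as a plain count of node types. The function names and the sample snippets are hypothetical and chosen only for illustration.

```python
import ast
import hashlib
import math
from collections import Counter


def subtree_hashes(node):
    """Return (hash of the subtree rooted at node, hashes of all its subtrees)."""
    child_hashes, collected = [], []
    for child in ast.iter_child_nodes(node):
        child_hash, nested = subtree_hashes(child)
        child_hashes.append(child_hash)
        collected.extend(nested)
    # Only node types and child order are hashed; identifiers and literal
    # values are ignored (an assumption made for this sketch).
    label = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
    digest = hashlib.sha1(label.encode("utf-8")).hexdigest()[:12]
    collected.append(digest)
    return digest, collected


def characteristic_vector(node):
    """Count how often each AST node type occurs in the tree."""
    return Counter(type(n).__name__ for n in ast.walk(node))


def jaccard(hashes_a, hashes_b):
    """Share of distinct subtree hashes that two trees have in common."""
    a, b = set(hashes_a), set(hashes_b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def cosine(vec_a, vec_b):
    """Cosine similarity of two node-type count vectors."""
    dot = sum(vec_a[k] * vec_b[k] for k in vec_a)
    norm = math.sqrt(sum(v * v for v in vec_a.values())) * math.sqrt(
        sum(v * v for v in vec_b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical inputs: a function, a renamed copy, and unrelated code.
ORIGINAL = "def add(a, b):\n    return a + b\n"
RENAMED = "def plus(x, y):\n    return x + y\n"
DIFFERENT = "def show(n):\n    for i in range(n):\n        print(i)\n"

tree_a, tree_b, tree_c = (ast.parse(s) for s in (ORIGINAL, RENAMED, DIFFERENT))

print(jaccard(subtree_hashes(tree_a)[1], subtree_hashes(tree_b)[1]))
print(cosine(characteristic_vector(tree_a), characteristic_vector(tree_c)))
```

Because identifiers are ignored in this sketch, the renamed copy yields the same subtree hashes and the same characteristic vector as the original. Both representations reduce tree comparison to comparing sets of short strings or numeric vectors, which is the kind of simplification the abstract refers to.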

Acknowledgement

This contribution/publication is the result of the project implementation:

Centre of excellence for systems and services of intelligent transport II, ITMS 26220120050 supported by the Research & Development Operational Programme funded by the ERDF.

Author information

Corresponding author

Correspondence to Michal Ďuračík.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Ďuračík, M., Kršák, E., Hrkút, P. (2018). Source Code Representations for Plagiarism Detection. In: Uden, L., Liberona, D., Ristvej, J. (eds) Learning Technology for Education Challenges. LTEC 2018. Communications in Computer and Information Science, vol 870. Springer, Cham. https://doi.org/10.1007/978-3-319-95522-3_6

  • DOI: https://doi.org/10.1007/978-3-319-95522-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-95521-6

  • Online ISBN: 978-3-319-95522-3

  • eBook Packages: Computer Science, Computer Science (R0)
