
Sentence-based natural language plagiarism detection

Published: 01 December 2004

Abstract

As access to higher education in the United Kingdom widens, larger class sizes make it unrealistic to expect tutors to identify peer-to-peer plagiarism by eye, so automated solutions are required. This document details a novel algorithm for comparing suspect documents at the sentence level. The algorithm has been implemented as a component of plagiarism detection software that detects similarities both in natural language documents and in comments within program source code. It is capable of detecting sophisticated obfuscation (such as paraphrasing, reordering, merging, and splitting sentences) as well as direct copying. The implemented algorithm has also been used to successfully detect plagiarism in real assignments at the university. The software has been evaluated by comparison with other plagiarism detection tools.
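
The abstract does not describe the algorithm itself, so the following is only a generic illustration of what comparing two documents at the sentence level can look like. It is not the paper's method. It assumes a naive punctuation-based sentence splitter and Jaccard word overlap as the similarity score; the function names and the threshold value are hypothetical choices made for this sketch.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_set(sentence):
    """Lowercased set of word tokens, so word order within a sentence is ignored."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_sentence_pairs(doc_a, doc_b, threshold=0.6):
    """Return (i, j, score) for every sentence pair scoring at or above threshold.

    i and j are sentence indices in doc_a and doc_b. Every sentence of one
    document is compared with every sentence of the other, so reordering
    sentences does not hide a match.
    """
    sents_a = [word_set(s) for s in split_sentences(doc_a)]
    sents_b = [word_set(s) for s in split_sentences(doc_b)]
    pairs = []
    for i, a in enumerate(sents_a):
        for j, b in enumerate(sents_b):
            score = jaccard(a, b)
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    original = "The cat sat on the mat. Dogs bark loudly."
    suspect = "Dogs bark loudly at night. On the mat the cat sat."
    for i, j, score in similar_sentence_pairs(original, suspect, threshold=0.5):
        print(i, j, round(score, 2))
```

A word-overlap score of this kind tolerates reordered sentences and reordered words, but it would not by itself catch paraphrasing with synonyms, or sentences that have been merged or split, which the abstract says the paper's algorithm handles.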

Published in

Journal on Educational Resources in Computing, Volume 4, Issue 4
December 2004
55 pages
ISSN: 1531-4278
EISSN: 1531-4278
DOI: 10.1145/1086339

Copyright © 2004 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
