Abstract
As access to higher education in the United Kingdom widens, class sizes have grown to the point where tutors cannot realistically be expected to identify peer-to-peer plagiarism by eye, so automated solutions are required. This paper details a novel algorithm for comparing suspect documents at the sentence level. The algorithm has been implemented as a component of plagiarism detection software that detects similarities in both natural language documents and comments within program source code. It is capable of detecting sophisticated obfuscation (such as paraphrasing, reordering, merging, and splitting sentences) as well as direct copying. The implementation has been used to successfully detect plagiarism in real assignments at the university, and the software has been evaluated by comparison with other plagiarism detection tools.
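To make the idea of sentence-level comparison concrete, the sketch below shows one simple way two documents might be compared sentence by sentence. It is an illustration only and is not the algorithm described in the paper: the stop-word list, the overlap-coefficient score, the 0.6 threshold, and all function names are assumptions chosen for this example. Ignoring word order gives some tolerance to reordering, and dividing by the smaller word set gives some tolerance to merged or split sentences; paraphrasing with different vocabulary would need more than this.

```python
import re

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({"a", "an", "the", "of", "to", "and", "in", "is", "it", "for", "on", "as", "by"})


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(sentence):
    """Lower-cased words with stop words removed, as a set (word order is ignored)."""
    return {w for w in re.findall(r"[a-z0-9']+", sentence.lower()) if w not in STOP_WORDS}


def similarity(a, b):
    """Overlap coefficient: shared words divided by the size of the smaller word set."""
    wa, wb = content_words(a), content_words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / min(len(wa), len(wb))


def suspicious_pairs(doc_a, doc_b, threshold=0.6):
    """Return (index_in_a, index_in_b, score) for sentence pairs scoring at or above threshold."""
    pairs = []
    for i, sa in enumerate(split_sentences(doc_a)):
        for j, sb in enumerate(split_sentences(doc_b)):
            score = similarity(sa, sb)
            if score >= threshold:
                pairs.append((i, j, round(score, 2)))
    return pairs


if __name__ == "__main__":
    original = "The cat sat on the mat. Dogs bark loudly at night."
    suspect = "At night dogs bark loudly. Fish swim."
    print(suspicious_pairs(original, suspect))
```

In this example the second sentence of the first document and the first sentence of the second contain the same words in a different order, so that pair is reported. Every sentence is compared with every other, which is quadratic in the number of sentences and only practical for small inputs.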