Abstract
Plagiarism is increasing day by day, and detecting it is a complex but essential task. This paper addresses word-level plagiarism detection for Marathi text using an N-gram language model and a Marathi corpus. Although this is the simplest form of detection, it provides good depth for understanding and emphasizing copy-paste and paraphrased plagiarism detection. It forms the basis for sentence-level as well as paragraph-level processing.
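The abstract does not give the authors' implementation, so the following is only a minimal sketch of how word-level N-gram comparison can work. It assumes whitespace tokenization and a simple containment score (the share of the suspicious text's word N-grams that also occur in the source text). The function names `word_ngrams` and `containment`, the sample sentences, and the choice of n = 2 are illustrative assumptions, not taken from the paper.

```python
from typing import Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of word-level n-grams in text.

    Assumption: tokens are separated by whitespace. Real Marathi text
    would also need punctuation handling and normalization.
    """
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(suspicious: str, source: str, n: int = 2) -> float:
    """Fraction of the suspicious text's n-grams that also appear in the source."""
    suspicious_grams = word_ngrams(suspicious, n)
    if not suspicious_grams:
        # Text shorter than n words has no n-grams to compare.
        return 0.0
    source_grams = word_ngrams(source, n)
    return len(suspicious_grams & source_grams) / len(suspicious_grams)


# Hypothetical sample sentences that differ in one word.
source = "मी रोज सकाळी शाळेत जातो"
suspicious = "मी रोज सकाळी बागेत जातो"

score = containment(suspicious, source, n=2)
print(f"Bigram containment: {score:.2f}")
```

A score near 1 suggests copy-paste reuse, while lower scores with partial overlap may point to paraphrasing. Any threshold for flagging a text would have to be tuned on a corpus.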
Acknowledgement
The authors would like to acknowledge and thank the CSRI DST Major Project, sanction No. SR/CSRI/71/2015(G), and the Computational and Psycholinguistic Research Lab Facility for supporting this work, as well as the Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Naik, R.R., Landge, M.B., Mahender, C.N. (2019). Word Level Plagiarism Detection of Marathi Text Using N-Gram Approach. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1037. Springer, Singapore. https://doi.org/10.1007/978-981-13-9187-3_2
DOI: https://doi.org/10.1007/978-981-13-9187-3_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9186-6
Online ISBN: 978-981-13-9187-3
eBook Packages: Computer Science, Computer Science (R0)