Abstract
Plagiarism detection is a concern for both individuals and organizations. It can be used to detect copying in documents such as publications, books, and theses. Many approaches have been proposed for plagiarism detection, and they work well for English. However, different countries use different languages, so language-specific processing (e.g., handling diacritics such as the acute and circumflex accents) as well as the semantics and order of words remain challenging. This work proposes an approach for plagiarism detection aimed at Vietnamese documents in learning and research resources. The input data are pre-processed, and features are extracted, vectorized, and represented as TF-IDF weights. Then the cosine similarity and the word-order similarity of the documents are computed. Finally, these similarities are combined into an ensemble score. Experimental results on a Vietnamese journal dataset show that the proposed approach is feasible.
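The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so the smoothed IDF, the word-order measure (one minus the normalized distance between first-position vectors over the joint vocabulary), the equal weighting `alpha=0.5`, and every function name here are hypothetical choices. Tokenization is plain whitespace splitting; real Vietnamese text would likely need a proper word segmenter.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real pre-processing.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF so terms present in every document keep a weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def word_order_similarity(doc_a, doc_b):
    """Assumed measure: 1 - ||r_a - r_b|| / ||r_a + r_b||, where r_x holds the
    1-based first position of each joint-vocabulary word in x (0 if absent)."""
    toks_a, toks_b = tokenize(doc_a), tokenize(doc_b)
    vocab = list(dict.fromkeys(toks_a + toks_b))  # joint word list, first-seen order

    def order_vector(tokens):
        first_pos = {}
        for i, tok in enumerate(tokens, start=1):
            first_pos.setdefault(tok, i)
        return [first_pos.get(w, 0) for w in vocab]

    r_a, r_b = order_vector(toks_a), order_vector(toks_b)
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r_a, r_b)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r_a, r_b)))
    return 1.0 - diff / total if total else 0.0


def combined_similarity(doc_a, doc_b, alpha=0.5):
    """Weighted combination of the two scores; alpha is a hypothetical weight."""
    vec_a, vec_b = tfidf_vectors([doc_a, doc_b])
    return alpha * cosine_similarity(vec_a, vec_b) + (1 - alpha) * word_order_similarity(doc_a, doc_b)


if __name__ == "__main__":
    original = "sinh viên nộp bài báo khoa học"
    reordered = "bài báo khoa học sinh viên nộp"
    print(combined_similarity(original, original))
    print(combined_similarity(original, reordered))
```

In this sketch the reordered sentence keeps the same bag of words, so its cosine score stays at the maximum while the word-order score drops. That is the motivation for combining the two measures: cosine similarity alone cannot tell a verbatim copy from a reshuffled one.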
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dien, T.T., Han, H.N., Thai-Nghe, N. (2019). An Approach for Plagiarism Detection in Learning Resources. In: Dang, T., Küng, J., Takizawa, M., Bui, S. (eds) Future Data and Security Engineering. FDSE 2019. Lecture Notes in Computer Science, vol 11814. Springer, Cham. https://doi.org/10.1007/978-3-030-35653-8_52
DOI: https://doi.org/10.1007/978-3-030-35653-8_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35652-1
Online ISBN: 978-3-030-35653-8
eBook Packages: Computer Science, Computer Science (R0)