Abstract
The present paper introduces the first corpus for the evaluation of Arabic intrinsic plagiarism detection. The corpus consists of 1024 artificial suspicious documents in which 2833 plagiarism cases have been inserted automatically from source documents.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Springer Policy on Publishing Integrity. Guidelines for Journal Editors
Potthast, M., Stein, B., Eiselt, A., Barrón-Cedeño, A., Rosso, P.: Overview of the 1st International Competition on Plagiarism Detection. In: Stein, B., Rosso, P., Stamatatos, E., Koppel, M., Agirre, E. (eds.) SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2009), pp. 1–9 (2009)
Potthast, M., Stein, B., Barrón-Cedeño, A., Rosso, P.: An Evaluation Framework for Plagiarism Detection. In: Huang, C.-R., Jurafsky, D. (eds.) Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), pp. 997–1005. ACL (2010)
Potthast, M., Barrón-cedeño, A., Eiselt, A., Stein, B., Rosso, P.: Overview of the 2nd International Competition on Plagiarism Detection. In: Braschler, M., Harman, D. (eds.) Notebook Papers of CLEF 2010 LABs and Workshops (2010)
Potthast, M., Eiselt, A., Barrón-Cedeño, A., Stein, B., Rosso, P.: Overview of the 3rd International Competition on Plagiarism Detection. In: Petras, V., Forner, P., Clough, P. (eds.) Notebook Papers of CLEF 2011 LABs and Workshops (2011)
Potthast, M., Gollub, T., Hagen, M., Graßegger, J., Kiesel, J., Michel, M., Oberländer, A., Tippmann, M., Barrón-Cedeño, A., Gupta, P., Rosso, P., Stein, B.: Overview of the 4th International Competition on Plagiarism Detection. In: Forner, P., Karlgren, J., Womser-Hacker, C. (eds.) CLEF 2012 Evaluation Labs and Workshop –Working Notes Papers (2012)
Juola, P.: An Overview of the Traditional Authorship Attribution Subtask Notebook for PAN at CLEF 2012. In: Forner, P., Karlgren, J., and Womser-Hacker, C. (eds.) CLEF 2012 Evaluation Labs and Workshop –Working Notes Papers (2012)
Yakout, M.M.: Examples of Plagiarism in Scientific and Cultural Communities (in Arabic), http://www.yaqout.net/ba7s_4.html
Abbasi, A., Chen, H.: Applying Authorship Analysis to Arabic Web Content. In: Kantor, P., Muresan, G., Roberts, F., Zeng, D.D., Wang, F.-Y., Chen, H., Merkle, R.C. (eds.) ISI 2005. LNCS, vol. 3495, pp. 183–197. Springer, Heidelberg (2005)
Shaker, K., Corne, D.: Authorship Attribution in Arabic using a hybrid of evolutionary search and linear discriminant analysis. In: 2010 UK Workshop on Computational Intelligence (UKCI), pp. 1–6. IEEE (2010)
Ouamour, S., Sayoud, H.: Authorship attribution of ancient texts written by ten arabic travelers using a SMO-SVM classifier. In: 2012 International Conference on Communications and Information Technology (ICCIT), pp. 44–47. IEEE (2012)
Bensalem, I., Rosso, P., Chikhi, S.: Intrinsic Plagiarism Detection in Arabic Text: Preliminary Experiments. In: Berlanga, R., Rosso, P. (eds.) 2nd Spanish Conference on Information Retrieval (CERI 2012), Valencia (2012)
Jadalla, A., Elnagar, A.: A Plagiarism Detection System for Arabic Text-Based Documents. In: Chau, M., Wang, G.A., Yue, W.T., Chen, H. (eds.) PAISI 2012. LNCS, vol. 7299, pp. 145–153. Springer, Heidelberg (2012)
Alzahrani, S., Salim, N.: Statement-Based Fuzzy-Set Information Retrieval versus Fingerprints Matching for Plagiarism Detection in Arabic Documents. In: 5th Postgraduate Annual Research Seminar (PARS 2009), Johor Bahru, Malaysia, pp. 267–268 (2009)
Menai, M.E.B.: Detection of Plagiarism in Arabic Documents. International Journal of Information Technology and Computer Science 10, 80–89 (2012)
Jaoua, M., Jaoua, F.K., Hadrich Belguith, L., Ben Hamadou, A.: Automatic Detection of Plagiarism in Arabic Documents Based on Lexical Chains. Arab Computer Society Journal 4, 1–11 (2011) (in Arabic)
Potthast, M., Hagen, M., Völske, M., Stein, B.: Crowdsourcing Interaction Logs to Understand Text Reuse from the Web. In: 51st Annual Meeting of the Association of Computational Linguistics (ACL 2013). ACM (to appear, 2013)
Stein, B., Lipka, N., Prettenhofer, P.: Intrinsic plagiarism analysis. Language Resources and Evaluation 45, 63–82 (2010)
Bensalem, I., Rosso, P., Chikhi, S.: Building Arabic Corpora from Wikisource. In: 10th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2013). IEEE (2013)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bensalem, I., Rosso, P., Chikhi, S. (2013). A New Corpus for the Evaluation of Arabic Intrinsic Plagiarism Detection. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visualization. CLEF 2013. Lecture Notes in Computer Science, vol 8138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40802-1_6
Download citation
DOI: https://doi.org/10.1007/978-3-642-40802-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40801-4
Online ISBN: 978-3-642-40802-1
eBook Packages: Computer ScienceComputer Science (R0)