A New Model for Detecting Similarity in Arabic Documents

  • Conference paper
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2017 (AISI 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 639)

Abstract

With the huge volume of information available on the WWW and in digital libraries, plagiarism has become one of the most important issues for universities, schools and research fields. While there are many systems for detecting plagiarism in Arabic-language documents, the complexity of written Arabic makes such schemes a big challenge. On the other hand, although search engines such as Google can be utilized, copying sentences and pasting them into the search engine to find similar resources is tedious. For that reason, an Arabic plagiarism detection tool accelerates the process, since plagiarism can be detected and highlighted automatically and one only needs to submit the document to the system. This paper presents an effective web-enabled system for Arabic plagiarism detection called APDS, which can be integrated with e-learning systems to judge students’ assignments, papers and dissertations. Experimental results are provided to evaluate APDS in terms of precision and recall. The results show an average precision of 82% and an average recall of 92.5%.
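
The abstract does not spell out how the two ratios are computed. As a minimal sketch, assuming the standard information-retrieval definitions (precision is the share of flagged passages that are truly plagiarised, recall is the share of truly plagiarised passages that were flagged), they could be calculated as follows. The function name and the sentence IDs are hypothetical and are not taken from APDS.

```python
def precision_recall(detected, actual):
    """Return (precision, recall) for two sets of passage identifiers.

    detected: passages flagged by a detector (hypothetical input)
    actual:   passages known to be plagiarised (hypothetical ground truth)
    """
    true_positives = len(detected & actual)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Illustrative data only: five flagged sentences, four of them truly plagiarised.
flagged = {"s1", "s2", "s3", "s4", "s5"}
plagiarised = {"s1", "s2", "s3", "s4"}
p, r = precision_recall(flagged, plagiarised)
print(f"precision={p:.2f} recall={r:.2f}")
```

Averaging these per-document values over a test collection would give figures comparable in form to the reported averages, although the paper's exact evaluation procedure may differ.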

Corresponding author

Correspondence to Mohamed Elhoseny.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Zaher, M., Shehab, A., Elhoseny, M., Osman, L. (2018). A New Model for Detecting Similarity in Arabic Documents. In: Hassanien, A., Shaalan, K., Gaber, T., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2017. AISI 2017. Advances in Intelligent Systems and Computing, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-64861-3_46

  • DOI: https://doi.org/10.1007/978-3-319-64861-3_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64860-6

  • Online ISBN: 978-3-319-64861-3

  • eBook Packages: Engineering, Engineering (R0)
