Analyzing Mathematical Content for Plagiarism and Recommendations

  • Conference paper
Advances in Information Retrieval (ECIR 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14612)

Abstract

Plagiarism, defined as “the use of ideas, concepts, words, or structures without appropriately acknowledging the source to benefit in a setting where originality is expected” [6], is a serious concern given the rapidly increasing number of scientific publications.

References

  1. Alzahrani, S.M., Salim, N., Abraham, A.: Understanding plagiarism linguistic patterns, textual features, and detection methods. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 42(2), 133–149 (2012). https://doi.org/10.1109/TSMCC.2011.2134847

  2. Arabi, H., Akbari, M.: Improving plagiarism detection in text document using hybrid weighted similarity. Expert Syst. Appl. 207, 118034 (2022)

  3. Dadure, P., Pakray, P., Bandyopadhyay, S.: BERT-based embedding model for formula retrieval. In: CLEF (Working Notes), pp. 36–46 (2021)

  4. Diaz, Y., Nishizawa, G., Mansouri, B., Davila, K., Zanibbi, R.: The MathDeck formula editor: interactive formula entry combining LaTeX, structure editing, and search. In: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–5 (2021)

  5. El-Rashidy, M.A., Mohamed, R.G., El-Fishawy, N.A., Shouman, M.A.: Reliable plagiarism detection system based on deep learning approaches. Neural Comput. Appl. 34(21), 18837–18858 (2022)

  6. Fishman, T.: We know it when we see it is not good enough: toward a standard definition of plagiarism that transcends theft, fraud, and copyright (2009)

  7. Foltýnek, T., Meuschke, N., Gipp, B.: Academic plagiarism detection: a systematic literature review. ACM Comput. Surv. 52(6), 1–42 (2020). https://doi.org/10.1145/3345317

  8. Gienapp, L., Kircheis, W., Sievers, B., Stein, B., Potthast, M.: A large dataset of scientific text reuse in open-access publications. Sci. Data 10(1), 58 (2023). https://doi.org/10.1038/s41597-022-01908-z

  9. Greiner-Petter, A., Schubotz, M., Breitinger, C., Scharpf, P., Aizawa, A., Gipp, B.: Do the math: making mathematics in Wikipedia computable. IEEE Trans. Pattern Anal. Mach. Intell. 45(4), 4384–4395 (2022). https://doi.org/10.1109/TPAMI.2022.3195261

  10. Lovepreet, Gupta, V., Kumar, R.: Survey on plagiarism detection systems and their comparison. In: Behera, H.S., Nayak, J., Naik, B., Pelusi, D. (eds.) Computational Intelligence in Data Mining: Proceedings of the International Conference on ICCIDM 2018, pp. 27–39. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-8676-3_3

  11. Mansouri, B., Agarwal, A., Oard, D.W., Zanibbi, R.: Advancing math-aware search: the ARQMath-3 lab at CLEF 2022. In: Hagen, M., et al. (eds.) ECIR 2022. LNCS, vol. 13186, pp. 408–415. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-99739-7_51

  12. McCabe, D.L.: Cheating among college and university students: a North American perspective. Int. J. Educ. Integrity 1(1) (2005). https://doi.org/10.21913/IJEI.v1i1.14

  13. Meuschke, N., Schubotz, M., Hamborg, F., Skopal, T., Gipp, B.: Analyzing mathematical content to detect academic plagiarism. In: Proceedings of the International Conference on Information and Knowledge Management (CIKM) (2017). https://doi.org/10.1145/3132847.3133144

  14. Meuschke, N., Stange, V., Schubotz, M., Kramer, M., Gipp, B.: Improving academic plagiarism detection for STEM documents by analyzing mathematical content and citations. In: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL) (Jun 2019). https://doi.org/10.1109/JCDL.2019.00026

  15. Potthast, M., Stein, B., Eiselt, A., Barrón-Cedeño, A., Rosso, P.: PAN plagiarism corpus 2011 (PAN-PC-11) (Jun 2011). https://doi.org/10.5281/zenodo.3250095

  16. Scharpf, P., Mackerracher, I., Schubotz, M., Beel, J., Breitinger, C., Gipp, B.: AnnoMathTeX - a formula identifier annotation recommender system for STEM documents. In: Proceedings of the 13th ACM Conference on Recommender Systems (RecSys 2019). ACM, Copenhagen, Denmark (Sept 2019). https://doi.org/10.1145/3298689.3347042

  17. Schubotz, M., Teschke, O., Stange, V., Meuschke, N., Gipp, B.: Forms of plagiarism in digital mathematical libraries. In: Intelligent Computer Mathematics - 12th International Conference, CICM 2019, Prague, Czech Republic, July 8–12, 2019, Proceedings (2019). https://doi.org/10.1007/978-3-030-23250-4_18

  18. Stein, B., Koppel, M., Stamatatos, E.: Plagiarism analysis, authorship identification, and near-duplicate detection PAN’07. ACM SIGIR Forum 41(2), 68–71 (2007). https://doi.org/10.1145/1328964.1328976

  19. Weber-Wulff, D.: Talking to a wall: the response of German universities to documentations of plagiarism in doctoral theses. In: Bjelobaba, S., Foltýnek, T., Glendinning, I., Krásničan, V., Dlabolová, D.H. (eds.) Academic Integrity: Broadening Practices, Technologies, and the Role of Students: Proceedings from the European Conference on Academic Integrity and Plagiarism 2021, pp. 363–371. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16976-2_20

  20. Yu, W., Pang, L., Xu, J., Su, B., Dong, Z., Wen, J.R.: Optimal partial transport based sentence selection for long-form document matching. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 2363–2373 (2022)

  21. Zanibbi, R., Aizawa, A., Kohlhase, M., Ounis, I., Topic, G., Davila, K.: NTCIR-12 MathIR task overview. In: NTCIR (2016)

  22. Zhong, W., Xie, Y., Lin, J.: Applying structural and dense semantic matching for the ARQMath lab 2022, CLEF. Proc. Working Notes CLEF 2022, 5–8 (2022)

  23. Zhong, W., Yang, J.H., Lin, J.: Evaluating token-level and passage-level dense retrieval models for math information retrieval. arXiv preprint arXiv:2203.11163 (2022)

  24. Zhong, W., Zhang, X., Xin, J., Zanibbi, R., Lin, J.: Approach Zero and Anserini at the CLEF-2021 ARQMath track: applying substructure search and BM25 on operator tree path tokens. In: Proceedings CLEF 2021 (CEUR Working Notes) (2021)

Acknowledgement

This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 437179652 and the Deutscher Akademischer Austauschdienst (DAAD, German Academic Exchange Service) - 57515245.

Author information

Correspondence to Ankit Satpute.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Satpute, A. (2024). Analyzing Mathematical Content for Plagiarism and Recommendations. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14612. Springer, Cham. https://doi.org/10.1007/978-3-031-56069-9_42

  • DOI: https://doi.org/10.1007/978-3-031-56069-9_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56068-2

  • Online ISBN: 978-3-031-56069-9

  • eBook Packages: Computer Science (R0)
