Abstract
Defined as “the use of ideas, concepts, words, or structures without appropriately acknowledging the source to benefit in a setting where originality is expected” [6], plagiarism is a serious concern amid the rapidly growing number of scientific publications.
References
Alzahrani, S.M., Salim, N., Abraham, A.: Understanding plagiarism linguistic patterns, textual features, and detection methods. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 42(2), 133–149 (2012). https://doi.org/10.1109/TSMCC.2011.2134847
Arabi, H., Akbari, M.: Improving plagiarism detection in text document using hybrid weighted similarity. Expert Syst. Appl. 207, 118034 (2022)
Dadure, P., Pakray, P., Bandyopadhyay, S.: BERT-based embedding model for formula retrieval. In: CLEF (Working Notes), pp. 36–46 (2021)
Diaz, Y., Nishizawa, G., Mansouri, B., Davila, K., Zanibbi, R.: The MathDeck formula editor: interactive formula entry combining LaTeX, structure editing, and search. In: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–5 (2021)
El-Rashidy, M.A., Mohamed, R.G., El-Fishawy, N.A., Shouman, M.A.: Reliable plagiarism detection system based on deep learning approaches. Neural Comput. Appl. 34(21), 18837–18858 (2022)
Fishman, T.: We know it when we see it is not good enough: toward a standard definition of plagiarism that transcends theft, fraud, and copyright (2009)
Foltýnek, T., Meuschke, N., Gipp, B.: Academic plagiarism detection: a systematic literature review. ACM Comput. Surv. 52(6), 1–42 (2020). https://doi.org/10.1145/3345317
Gienapp, L., Kircheis, W., Sievers, B., Stein, B., Potthast, M.: A large dataset of scientific text reuse in open-access publications. Sci. Data 10(1), 58 (2023). https://doi.org/10.1038/s41597-022-01908-z
Greiner-Petter, A., Schubotz, M., Breitinger, C., Scharpf, P., Aizawa, A., Gipp, B.: Do the math: making mathematics in Wikipedia computable. IEEE Trans. Pattern Anal. Mach. Intell. 45(4), 4384–4395 (2022). https://doi.org/10.1109/TPAMI.2022.3195261
Lovepreet, Gupta, V., Kumar, R.: Survey on plagiarism detection systems and their comparison. In: Behera, H.S., Nayak, J., Naik, B., Pelusi, D. (eds.) Computational Intelligence in Data Mining: Proceedings of the International Conference on ICCIDM 2018, pp. 27–39. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-8676-3_3
Mansouri, B., Agarwal, A., Oard, D.W., Zanibbi, R.: Advancing math-aware search: the ARQMath-3 lab at CLEF 2022. In: Hagen, M., et al. (eds.) ECIR 2022. LNCS, vol. 13186, pp. 408–415. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-99739-7_51
McCabe, D.L.: Cheating among college and university students: a North American perspective. Int. J. Educ. Integrity 1(1) (2005). https://doi.org/10.21913/IJEI.v1i1.14
Meuschke, N., Schubotz, M., Hamborg, F., Skopal, T., Gipp, B.: Analyzing mathematical content to detect academic plagiarism. In: Proceedings of the International Conference on Information and Knowledge Management (CIKM) (2017). https://doi.org/10.1145/3132847.3133144
Meuschke, N., Stange, V., Schubotz, M., Kramer, M., Gipp, B.: Improving academic plagiarism detection for STEM documents by analyzing mathematical content and citations. In: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL) (Jun 2019). https://doi.org/10.1109/JCDL.2019.00026
Potthast, M., Stein, B., Eiselt, A., Barrón-Cedeño, A., Rosso, P.: PAN plagiarism corpus 2011 (PAN-PC-11) (Jun 2011). https://doi.org/10.5281/zenodo.3250095
Scharpf, P., Mackerracher, I., Schubotz, M., Beel, J., Breitinger, C., Gipp, B.: AnnoMathTeX - a formula identifier annotation recommender system for STEM documents. In: Proceedings of the 13th ACM Conference on Recommender Systems (RecSys 2019). ACM, Copenhagen, Denmark (Sept 2019). https://doi.org/10.1145/3298689.3347042
Schubotz, M., Teschke, O., Stange, V., Meuschke, N., Gipp, B.: Forms of plagiarism in digital mathematical libraries. In: Intelligent Computer Mathematics - 12th International Conference, CICM 2019, Prague, Czech Republic, July 8–12, 2019, Proceedings (2019). https://doi.org/10.1007/978-3-030-23250-4_18
Stein, B., Koppel, M., Stamatatos, E.: Plagiarism analysis, authorship identification, and near-duplicate detection PAN’07. ACM SIGIR Forum 41(2), 68–71 (2007). https://doi.org/10.1145/1328964.1328976
Weber-Wulff, D.: Talking to a wall: the response of German universities to documentations of plagiarism in doctoral theses. In: Bjelobaba, S., Foltýnek, T., Glendinning, I., Krásničan, V., Dlabolová, D.H. (eds.) Academic Integrity: Broadening Practices, Technologies, and the Role of Students: Proceedings from the European Conference on Academic Integrity and Plagiarism 2021, pp. 363–371. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16976-2_20
Yu, W., Pang, L., Xu, J., Su, B., Dong, Z., Wen, J.R.: Optimal partial transport based sentence selection for long-form document matching. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 2363–2373 (2022)
Zanibbi, R., Aizawa, A., Kohlhase, M., Ounis, I., Topic, G., Davila, K.: NTCIR-12 MathIR task overview. In: NTCIR (2016)
Zhong, W., Xie, Y., Lin, J.: Applying structural and dense semantic matching for the ARQMath lab 2022, CLEF. Proc. Working Notes CLEF 2022, 5–8 (2022)
Zhong, W., Yang, J.H., Lin, J.: Evaluating token-level and passage-level dense retrieval models for math information retrieval. arXiv preprint arXiv:2203.11163 (2022)
Zhong, W., Zhang, X., Xin, J., Zanibbi, R., Lin, J.: Approach zero and Anserini at the CLEF-2021 ARQMath track: Applying substructure search and BM25 on operator tree path tokens. In: Proceedings CLEF 2021 (CEUR Working Notes) (2021)
Acknowledgement
This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 437179652 and the Deutscher Akademischer Austauschdienst (DAAD, German Academic Exchange Service) - 57515245.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Satpute, A. (2024). Analyzing Mathematical Content for Plagiarism and Recommendations. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14612. Springer, Cham. https://doi.org/10.1007/978-3-031-56069-9_42
DOI: https://doi.org/10.1007/978-3-031-56069-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56068-2
Online ISBN: 978-3-031-56069-9
eBook Packages: Computer Science, Computer Science (R0)