RoMaPla: Using t-Test for Evaluating Robustness of Marathi Plagiarism

  • Conference paper
Evolution in Computational Intelligence

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 267)

Abstract

Detecting plagiarism in documents is an essential task in the academic domain. Online tools are generally used to check for it. These tools calculate the similarity between documents from the sequence of tokens/words present in the documents being compared. Semantic relationships between words are ignored: for example, a word and its synonym are treated as different terms when the similarity is calculated. A few tools may be available for checking the similarity of English documents, but plagiarism checking for Marathi documents is a comparatively unexplored field, even though the amount of information available in Marathi is growing due to multilingual processing. The existing MaPla (Marathi Plagiarism checker) showed that similarity results obtained with the Document Synset Matrix for Marathi (DSMM) are close to readings made using human cognitive ability, but that evaluation was performed on only 4 documents. To further confirm the robustness of MaPla, we experimented with 24 documents and calculated the similarity between all pairs of documents using the cosine measure. This yields two 24 × 24 matrices, one from DSMM and one from manual readings. A paired t-test, which was not carried out in MaPla, shows no significant difference between the two matrices and hence supports the robustness of the proposed technique.
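The evaluation described in the abstract can be illustrated with a minimal Python sketch: pairwise cosine similarity over document vectors, followed by a paired t statistic comparing the automatic scores with manual readings. This is not the authors' implementation. The function names, the toy vectors, and the manual scores are all hypothetical, and pairing only the upper-triangle entries of the two matrices (each document pair counted once, diagonal excluded) is an assumption; the abstract does not say which entries were paired.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """n x n matrix of cosine similarities between all document pairs."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def upper_triangle(matrix):
    """Entries above the diagonal, so each document pair appears once."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def paired_t_statistic(xs, ys):
    """Return (t, degrees of freedom) for a paired t-test on xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all paired differences are identical")
    return mean / math.sqrt(variance / n), n - 1


# Hypothetical synset-count vectors for four documents (illustrative only).
doc_vectors = [
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
    [0, 1, 1, 1],
]

# Hypothetical manual similarity readings for the same four documents.
manual = [
    [1.0, 0.9, 0.0, 0.3],
    [0.9, 1.0, 0.1, 0.4],
    [0.0, 0.1, 1.0, 0.7],
    [0.3, 0.4, 0.7, 1.0],
]

automatic = similarity_matrix(doc_vectors)
t_value, dof = paired_t_statistic(upper_triangle(automatic),
                                  upper_triangle(manual))
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

With 24 documents, the upper triangle holds 276 pairs, so under this pairing assumption the test would have 275 degrees of freedom. The null hypothesis of no difference is retained when the absolute value of t stays below the critical value for the chosen significance level. If SciPy is available, `scipy.stats.ttest_rel` computes the same statistic and also returns a p-value.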

Author information

Corresponding author

Correspondence to Prafulla B. Bafna.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Saini, J.R., Bafna, P.B. (2022). RoMaPla: Using t-Test for Evaluating Robustness of Marathi Plagiarism. In: Bhateja, V., Tang, J., Satapathy, S.C., Peer, P., Das, R. (eds) Evolution in Computational Intelligence. Smart Innovation, Systems and Technologies, vol 267. Springer, Singapore. https://doi.org/10.1007/978-981-16-6616-2_5
