Urdu Spell Checker: A Scarce Resource Language

Aziz, Romila; Anwar, Muhammad Waqas

doi:10.1007/978-981-15-5232-8_40

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1198)

Included in the following conference series:

International Conference on Intelligent Technologies and Applications

Abstract

Many software applications have been developed to check the spelling of words. English is far ahead in spell-checking technology, whereas other languages, Urdu in particular, lag behind. We develop an “Urdu Spell Checker” that detects misspelled words and offers a list of candidate corrections. The spell checker relies on the correct spellings stored in a predefined lexicon or corpus to decide whether an entered word is correct: if the input word matches a word in the corpus it is considered correct; otherwise it is considered misspelled. Several techniques are applied both individually and in combination to determine which set of methods gives the best output. Among the error-correction techniques tested, Jaro distance combined with Soundex, Shapex and n-grams gives the best results: 80.0% precision, 44.87% recall and 57.37% F-measure.
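The detection and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny lexicon, the function names `jaro_similarity` and `check_word`, and the `top_n` parameter are all assumptions made for illustration, and only the Jaro component is shown (the Soundex, Shapex and n-gram filters are not specified in the abstract and are omitted).

```python
def jaro_similarity(s: str, t: str) -> float:
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_matched = [False] * len(s)
    t_matched = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo = max(0, i - window)
        hi = min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_matched[j] and t[j] == ch:
                s_matched[i] = t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order, halved.
    s_seq = [c for c, m in zip(s, s_matched) if m]
    t_seq = [c for c, m in zip(t, t_matched) if m]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


# Hypothetical toy lexicon; a real system would load a large Urdu word list.
LEXICON = {"کتاب", "کتب", "قلم", "پانی"}


def check_word(word: str, lexicon: set, top_n: int = 3):
    """Return (is_correct, suggestions) using exact lookup, then Jaro ranking."""
    if word in lexicon:
        return True, []
    # Sort by descending similarity; break ties alphabetically for determinism.
    ranked = sorted(lexicon, key=lambda w: (-jaro_similarity(word, w), w))
    return False, ranked[:top_n]
```

With this toy lexicon, `check_word("کتاب", LEXICON)` reports the word as correct, while the misspelling `"کتاپ"` is flagged and `"کتاب"` is ranked as the first suggestion. Ranking the whole lexicon is linear in its size; a practical system would presumably prune candidates first, which is one plausible role for the phonetic and n-gram techniques the abstract mentions.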

Author information

Authors and Affiliations

Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
Romila Aziz & Muhammad Waqas Anwar

Corresponding author

Correspondence to Romila Aziz.

Editor information

Editors and Affiliations

Islamia University of Bahawalpur, Baghdad, Pakistan
Imran Sarwar Bajwa
Metropolitan University, Belgrade, Serbia
Tatjana Sibalija
University of Technology Malaysia, Johor Bahru, Malaysia
Dayang Norhayati Abang Jawawi

About this paper

Cite this paper

Aziz, R., Anwar, M.W. (2020). Urdu Spell Checker: A Scarce Resource Language. In: Bajwa, I., Sibalija, T., Jawawi, D. (eds) Intelligent Technologies and Applications. INTAP 2019. Communications in Computer and Information Science, vol 1198. Springer, Singapore. https://doi.org/10.1007/978-981-15-5232-8_40

DOI: https://doi.org/10.1007/978-981-15-5232-8_40
Published: 09 May 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5231-1
Online ISBN: 978-981-15-5232-8
eBook Packages: Computer Science, Computer Science (R0)
