Spelling Correction for Dialectal Arabic Dictionary Lookup

Published: 01 March 2011

Abstract

The “Did You Mean...?” system, described in this article, is a spelling corrector for Arabic that is designed specifically for L2 learners of dialectal Arabic in the context of dictionary lookup. The authors use an orthographic density metric to motivate the need for a finer-grained ranking method for candidate words than unweighted Levenshtein edit distance. The Did You Mean...? architecture is described, and the authors show that mean reciprocal rank can be improved by tuning operation weights according to sound confusions, and by anticipating likely spelling variants.
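
The contrast the abstract draws between unweighted and weighted edit distance can be illustrated with a minimal sketch. This is not the authors' implementation: the author tags point to weighted finite-state transducers, whereas the sketch below uses a plain dynamic-programming edit distance. The function names, the toy Latin-letter lexicon, and the 0.25 cost for an assumed s/z confusion are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

SubCosts = Dict[Tuple[str, str], float]


def weighted_edit_distance(source: str, target: str,
                           sub_costs: SubCosts,
                           indel_cost: float = 1.0) -> float:
    """Edit distance where each substitution may carry its own cost.

    Pairs missing from sub_costs cost 1.0, so an empty table gives
    ordinary unweighted Levenshtein distance.
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            sub = 0.0 if a == b else sub_costs.get((a, b), 1.0)
            dist[i][j] = min(dist[i - 1][j] + indel_cost,   # deletion
                             dist[i][j - 1] + indel_cost,   # insertion
                             dist[i - 1][j - 1] + sub)      # substitution
    return dist[-1][-1]


def rank_candidates(query: str, lexicon: List[str],
                    sub_costs: SubCosts) -> List[str]:
    """Sort dictionary entries by distance to the query (ties alphabetical)."""
    return sorted(lexicon,
                  key=lambda w: (weighted_edit_distance(query, w, sub_costs), w))


def mean_reciprocal_rank(queries: List[Tuple[str, str]], lexicon: List[str],
                         sub_costs: SubCosts) -> float:
    """Average of 1/rank of the intended word; 0 if it is not in the lexicon."""
    total = 0.0
    for typed, intended in queries:
        ranked = rank_candidates(typed, lexicon, sub_costs)
        if intended in ranked:
            total += 1.0 / (ranked.index(intended) + 1)
    return total / len(queries)


# Toy data (hypothetical): the learner typed "zad" but meant "sad".
lexicon = ["bad", "sad", "tad"]
queries = [("zad", "sad")]
unweighted = mean_reciprocal_rank(queries, lexicon, {})
weighted = mean_reciprocal_rank(queries, lexicon, {("z", "s"): 0.25})
```

With no weights, all three candidates are one edit away and the intended word is separated only by the alphabetical tie-break; once the assumed confusion pair is made cheaper, the intended word ranks first and the reciprocal rank rises accordingly.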



Published In

ACM Transactions on Asian Language Information Processing  Volume 10, Issue 1
March 2011
88 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/1929908
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 2011
Accepted: 01 November 2010
Received: 01 August 2010
Published in TALIP Volume 10, Issue 1


Author Tags

  1. Arabic dialects
  2. Iraqi Arabic
  3. Spelling correction
  4. dictionary lookup
  5. error correction for non-native language learners
  6. weighted finite-state transducers

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2021) Systematic Literature Review of Dialectal Arabic: Identification and Detection. IEEE Access 9, 31010-31042. DOI: 10.1109/ACCESS.2021.3059504. Online publication date: 2021
  • (2016) Spelling correction and morphological analysis to aid electronic dictionary look-up. Lexicography 3:1, 63-81. DOI: 10.1007/s40607-016-0027-x. Online publication date: 1-Oct-2016
  • (2015) Spelling error patterns in Brazilian Portuguese. Computational Linguistics 41:1, 175-183. DOI: 10.1162/COLI_a_00216. Online publication date: 1-Mar-2015
  • (2015) Chinese Spelling Checker Based on an Inverted Index List with a Rescoring Mechanism. ACM Transactions on Asian and Low-Resource Language Information Processing 14:4, 1-28. DOI: 10.1145/2826235. Online publication date: 11-Nov-2015
  • (2012) Automatic Stochastic Arabic Spelling Correction With Emphasis on Space Insertions and Deletions. IEEE Transactions on Audio, Speech, and Language Processing 20:7, 2111-2122. DOI: 10.1109/TASL.2012.2197612. Online publication date: 1-Sep-2012
