Abstract
This article describes the “Did You Mean...?” system, a spelling corrector for Arabic designed specifically for L2 learners of dialectal Arabic who are looking up words in a dictionary. The authors use an orthographic density metric to motivate a ranking method for candidate words that is finer-grained than unweighted Levenshtein edit distance. They then describe the Did You Mean...? architecture and show that mean reciprocal rank improves when edit-operation weights are tuned according to sound confusions and when likely spelling variants are anticipated.
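To make the ranking idea concrete, the following is a minimal sketch of a weighted edit distance and of mean reciprocal rank. It is not the authors' implementation. The names `CONFUSION_COST`, `sub_cost`, `weighted_distance`, `rank_candidates`, and `mean_reciprocal_rank` are hypothetical, as are the romanized letter pairs and the cost of 0.3, which stand in for whatever sound confusions and weights a real system would estimate from learner data.

```python
from typing import Dict, List, Tuple

# Hypothetical substitution costs for easily confused sounds, romanized for
# readability. The pairs and the 0.3 value are illustrative assumptions only,
# not weights taken from the article.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("s", "S"): 0.3,
    ("t", "T"): 0.3,
    ("h", "H"): 0.3,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of substituting a with b; confusable pairs cost less than 1."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), CONFUSION_COST.get((b, a), 1.0))


def weighted_distance(source: str, target: str) -> float:
    """Levenshtein distance with weighted substitutions.

    Insertions and deletions keep unit cost in this sketch.
    """
    prev = [float(j) for j in range(len(target) + 1)]
    for i, a in enumerate(source, start=1):
        curr = [float(i)]
        for j, b in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1.0,                 # delete a
                curr[j - 1] + 1.0,             # insert b
                prev[j - 1] + sub_cost(a, b),  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def rank_candidates(query: str, lexicon: List[str]) -> List[str]:
    """Order lexicon entries by weighted distance, breaking ties alphabetically."""
    return sorted(lexicon, key=lambda w: (weighted_distance(query, w), w))


def mean_reciprocal_rank(ranked_lists: List[List[str]], targets: List[str]) -> float:
    """Average of 1/rank of the intended word; a missing target contributes 0."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)
```

With unit costs, the hypothetical query `sabr` is one edit away from both `Sabr` and `kabr`, so the two candidates tie. With the assumed confusion weights, `Sabr` costs 0.3 and `kabr` costs 1.0, so `rank_candidates("sabr", ["kabr", "Sabr"])` places `Sabr` first. Moving the intended word up the list in this way is what raises mean reciprocal rank.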