Abstract
Spell checking is the process of detecting misspelled words and either correcting them or suggesting corrections. In this paper, we present a context-based spell checking system and report experimental results for Vietnamese. The system uses an N-gram model built from a large corpus. The N-grams are compressed to save memory. Furthermore, we use the context on both sides of each syllable to improve the system's performance. Our system achieves high accuracy, with an F-score of approximately 94% on Vietnamese text.
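To make the idea of using context on both sides of a syllable concrete, the following is a minimal sketch, not the system described in the paper. It assumes a plain bigram model with add-one smoothing over whitespace-separated syllables; the function names (`train_bigrams`, `context_score`, `best_candidate`), the toy corpus, and the smoothing choice are all illustrative assumptions, and N-gram compression is not shown.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams over whitespace-separated syllables."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # Sentence boundary markers let the first and last syllables have context too.
        syllables = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(syllables)
        bigrams.update(zip(syllables, syllables[1:]))
    return unigrams, bigrams


def context_score(left, candidate, right, unigrams, bigrams, vocab_size):
    """Score a candidate by P(candidate | left) * P(right | candidate).

    Both factors use add-one smoothing (an assumption of this sketch),
    so unseen bigrams get a small non-zero probability.
    """
    p_left = (bigrams[(left, candidate)] + 1) / (unigrams[left] + vocab_size)
    p_right = (bigrams[(candidate, right)] + 1) / (unigrams[candidate] + vocab_size)
    return p_left * p_right


def best_candidate(left, candidates, right, unigrams, bigrams):
    """Return the candidate syllable that best fits between left and right."""
    vocab_size = len(unigrams)
    return max(
        candidates,
        key=lambda c: context_score(left, c, right, unigrams, bigrams, vocab_size),
    )


# Toy corpus (hypothetical); a real system would train on a large corpus.
corpus = ["tôi đi học", "tôi đi làm", "anh đi học"]
unigrams, bigrams = train_bigrams(corpus)
choice = best_candidate("tôi", ["đi", "di"], "học", unigrams, bigrams)
```

In this sketch, a candidate supported by both its left and right neighbours scores higher than one that is unseen in either position, which is the intuition behind using two-sided context rather than the left context alone.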
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Thi Xuan Huong, N., Dang, TT., Nguyen, TT., Le, AC. (2015). Using Large N-gram for Vietnamese Spell Checking. In: Nguyen, VH., Le, AC., Huynh, VN. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-11680-8_49
DOI: https://doi.org/10.1007/978-3-319-11680-8_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11679-2
Online ISBN: 978-3-319-11680-8
eBook Packages: Engineering, Engineering (R0)