Skip to main content
Log in

A model and a fast algorithm for multiple errors spelling correction

  • Published:
Acta Informatica Aims and scope Submit manuscript

Abstract

A new model for multiple errors spelling correction is proposed. The model handles insert, delete, change, and transpose errors. In the new model, we put constraints on possible editing sequences to reflect the error occurrence phenomenon in spelling, resulting in an error measure different from the traditional editing distance error measure. Properties of the “error distance matrix” between two character strings are studied under the assumptions of the new model. A cut-off criterion has been discovered, which can detect whether the error distance between two character strings is greater than a prespecified value during the calculation. Based on this cut-off criterion, a fast algorithm has been developed to find the nearest neighbors of a given character string in a dictionary. Experiments have been conducted with results showing that the cut-off criterion can greatly cut down the computation time needed for the nearest neighbor searching.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Blair, C.R.: A program for correcting spelling errors. Inf. Control3, 60–67 (1960)

    Google Scholar 

  2. Bourne, C.P.: Frequency and impact of spelling errors in bibliographic data bases. Inf. Process. Manage.13(1), 1–12 (1977)

    Google Scholar 

  3. Spelling correction program for micros. Comput. Weekly 7 (July 3, 1980)

  4. Du, M.W., Chang, S.C.: A new approach to shortest editing sequence and longest common subsequence problems. Tech. Rep. GTE Labs. TR-047-06-89-500 (June 1989)

  5. Du, M.W., Chang, S.C.: A model and a fast algorithm for multiple errors spelling correction. Tech. Rep. GTE Labs. TR-065-09-89-500 (Sept. 1989)

  6. Damerau, F.J.: A technique for computer detection and correction of spelling errors. Commun. ACM7(3), 171–176 (1964)

    Google Scholar 

  7. Galil, Z., Giancarlo, R.: Data structures and algorithms for approximate string matching. J. Complexity4, 33–72 (1988)

    Google Scholar 

  8. Hall, P.A.V., Dowling, G.R.: Approximate string matching. ACM Comput. Surv.12(4), 381–402 (1980)

    Google Scholar 

  9. Litecky, C.R., Davis, G.B.: A study of errors, error-proneness, and error diagnosis in COBOL. Commun. ACM19(1), 33–37 (1976)

    Google Scholar 

  10. Lowrance, R., Wagner, R.A.: An extension of the string-to-string correction problem. J. ACM22(2), 177–183 (1975)

    Google Scholar 

  11. Morgan, H.L.: Spelling correction in systems programs. Commun. ACM13(2), 90–94 (1970)

    Google Scholar 

  12. Mukhopadhyay, A.: A fast algorithm for the longest-common-subsequence problem. Inf. Sci.20, 69–82 (1980)

    Google Scholar 

  13. Muth, F.E., Jr., Tharp, A.L.: Correcting human error in alphanumeric terminal input. Inf. Process. Manage.13(6), 329–337 (1977)

    Google Scholar 

  14. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM21(1), 168–173 (1974)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Du, M.W., Chang, S.C. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica 29, 281–302 (1992). https://doi.org/10.1007/BF01185682

Download citation

  • Received:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF01185682

Keywords

Navigation