Optimal and information theoretic syntactic Pattern Recognition involving traditional and transposition errors

  • Conference paper
Foundations of Software Technology and Theoretical Computer Science (FSTTCS 1996)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1180)

Abstract

In this paper we present a foundational basis for optimal and information theoretic syntactic Pattern Recognition (PR) of syntactic patterns that can be “linearly” represented as strings. In an earlier paper (Oommen and Kashyap [25]) we presented a formal basis for designing such systems when the errors involved were arbitrarily distributed Substitution, Insertion and Deletion (SID) syntactic errors. In this paper we generalize the framework to permit both these traditional errors and Generalized Transposition (GT) errors. We do this by developing a rigorous model, MG*, for channels which permit all these errors in an arbitrarily distributed manner. The scheme is Functionally Complete and stochastically consistent. Besides the synthesis aspects, we also deal with the analysis of such a model and derive a technique by which Pr[Y|U], the probability of receiving Y given that U was transmitted, can be computed in quartic time using dynamic programming. Experimental results involving dictionaries with strings of lengths between 7 and 14, with an overall average noise of 70.5%, demonstrate the superiority of our system over existing methods.
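As a rough illustration of the kind of dynamic programming the abstract refers to, the sketch below computes a classical unit-cost edit distance that allows substitutions, insertions, deletions and transpositions of adjacent symbols. It is not the MG* model and does not compute Pr[Y|U]: the unit costs, the restriction to plain adjacent transpositions, and the names `osa_distance` and `nearest_word` are all assumptions made here for illustration. This simplified recurrence runs in time proportional to the product of the string lengths, not the quartic time quoted for the probabilistic computation.

```python
def osa_distance(u: str, y: str) -> int:
    """Unit-cost edit distance between u and y allowing substitution,
    insertion, deletion and transposition of two adjacent symbols.
    (Illustrative stand-in only; not the paper's Pr[Y|U] computation.)"""
    n, m = len(u), len(y)
    # d[i][j] = distance between the first i symbols of u and first j of y
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i symbols
    for j in range(m + 1):
        d[0][j] = j  # insert all j symbols
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if u[i - 1] == y[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition: u[i-2]u[i-1] received as y[j-2]y[j-1]
            if i > 1 and j > 1 and u[i - 1] == y[j - 2] and u[i - 2] == y[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]


def nearest_word(dictionary, y):
    """Return the dictionary word closest to the noisy string y
    (hypothetical helper for a dictionary-based recognizer)."""
    return min(dictionary, key=lambda u: osa_distance(u, y))
```

A dictionary-based recognizer in this simplified setting would call `nearest_word(words, noisy)` and accept the word of minimum distance; a probabilistic scheme of the kind described in the abstract would instead rank candidate words U by Pr[Y|U].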

Partially supported by the Natural Sciences and Engineering Research Council of Canada.

Abridged list of references

  1. R. L. Bahl and F. Jelinek, Decoding with channels with insertions, deletions and substitutions with applications to speech recognition, IEEE T. Inf. Th., IT-21:404–411 (1975).

  2. Bunke, H. and Csirik, J., Parametric string edit distance and its application to pattern recognition, IEEE T. Syst., Man and Cybern., SMC-25:202–206 (1993).

  3. L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag, (1986).

  4. G. Dewey, Relative Frequency of English Speech Sounds, Harvard Univ. Press, (1923).

  5. R. O. Duda, P.E. Hart. Pattern Classification and Scene Analysis. Wiley & Sons, 1973.

  6. G.D. Forney, The Viterbi Algorithm, Proceedings of the IEEE, Vol. 61. (1973).

  7. K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, 1972.

  8. P. A. V. Hall and G.R. Dowling, Approximate string matching, Comp. Sur., 12:381–402 (1980).

  9. R. L. Kashyap and B. J. Oommen, A common basis for similarity and dissimilarity measures involving two strings, Internat. J. Comput. Math., 13:17–40 (1983).

  10. R. L. Kashyap and B. J. Oommen, An effective algorithm for string correction using generalized edit distances-I. Description of the algorithm and its optimality, Inf. Sci., 23(2): 123–142 (1981).

  11. R. L. Kashyap, and B. J. Oommen, String correction using probabilistic methods, Pattern Recognition Letters, 147–154 (1984).

  12. R. Lowrance and R. A. Wagner, An extension of the string to string correction problem, J. Assoc. Comput. Mach., 22:177–183 (1975).

  13. V. I. Levenshtein, Binary codes capable of correcting deletions, insertions and reversals, Soviet Phys. Dokl., 10:707–710 (1966).

  14. W. J. Masek and M. S. Paterson, A faster algorithm computing string edit distances, J. Comput. System Sci., 20:18–31 (1980).

  15. D. L. Neuhoff, The Viterbi algorithm as an aid in text recognition, IEEE T. Inf. Th., 222–226 (1975).

  16. T. Okuda, E. Tanaka, and T. Kasai, A method of correction of garbled words based on the Levenshtein metric, IEEE T. Comput., C-25:172–177 (1976).

  17. Oommen, B.J. and Loke, R. K. S., “Pattern Recognition of Strings Containing Traditional and Generalized Transposition Errors”, Proceedings of the 1995 IEEE International Conference on Systems, Man and Cybernetics, Vancouver, October 1995, pp. 1154–1159.

  18. B. J. Oommen and R. Loke, Information Theoretic Syntactic Pattern Recognition Involving Traditional and Transposition Errors, Unabridged version of the present paper.

  19. D. Sankoff and J. B. Kruskal, Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison, Addison-Wesley (1983).

  20. R. Shinghal, and G. T. Toussaint, Experiments in text recognition with the modified Viterbi algorithm, IEEE T. on Pat. An. and M. Intel., 184–192 (1979).

  21. S. Srihari, Computer Text Recognition and Error Correction, IEEE Computer Press, (1984).

  22. A. J. Viterbi, Error bounds for convolutional codes and an asymptotically optimal decoding algorithm, IEEE T. on Information Theory, 260–26 (1967).

  23. R. A. Wagner and M. J. Fischer, The string to string correction problem, J. Assoc. Comput. Mach., 21:168–173 (1974).

  24. K. S. Fu, Syntactic Methods in Pattern Recognition, Academic Press, New York, 1974.

  25. Oommen, B.J. and Kashyap, R. L., “Optimal and Information Theoretic Syntactic Pattern Recognition for Traditional Errors”. To appear in the Proceedings of SSPR-96, the 1996 International Symposium on Syntactic and Structural Pattern Recognition, Leipzig, Germany, August 1996.

Editor information

V. Chandru, V. Vinay

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oommen, B.J., Loke, R.K.S. (1996). Optimal and information theoretic syntactic Pattern Recognition involving traditional and transposition errors. In: Chandru, V., Vinay, V. (eds) Foundations of Software Technology and Theoretical Computer Science. FSTTCS 1996. Lecture Notes in Computer Science, vol 1180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62034-6_52

  • DOI: https://doi.org/10.1007/3-540-62034-6_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62034-1

  • Online ISBN: 978-3-540-49631-1

  • eBook Packages: Springer Book Archive
