A near pattern-matching scheme based upon principal component analysis

https://doi.org/10.1016/0167-8655(94)00109-G

Abstract

In this paper, we present an efficient heuristic scheme for near pattern matching. Building on principal component analysis, a standard technique in multivariate statistics, we develop algorithms that generate a set of new identifying keys for a given set of patterns, reducing the number of comparisons needed during near matching. After preprocessing, the near-matching operation takes O(n log m) time in the worst case, where n is the length of the text file being searched and m is the number of identifying segments extracted from the patterns.
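The abstract does not give the algorithms themselves, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that each identifying segment is a fixed-width run of characters treated as a numeric vector, that the key is the projection onto the first principal axis (estimated here by power iteration), and that "near" means a small Euclidean distance between character codes. All function names and the `tolerance` parameter are hypothetical.

```python
import bisect
import random


def to_vector(segment):
    """Treat a string segment as a vector of character codes (an assumption)."""
    return [float(ord(c)) for c in segment]


def first_principal_axis(vectors, iterations=200):
    """Return (mean, unit axis): the leading eigenvector of the covariance
    matrix, estimated by power iteration."""
    dim = len(vectors[0])
    count = len(vectors)
    mean = [sum(v[i] for v in vectors) / count for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in vectors]
    cov = [[sum(r[i] * r[j] for r in centered) / count for j in range(dim)]
           for i in range(dim)]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    axis = [rng.random() + 0.1 for _ in range(dim)]
    norm = sum(x * x for x in axis) ** 0.5
    axis = [x / norm for x in axis]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0.0:  # all segments identical: any unit axis will do
            break
        axis = [x / norm for x in nxt]
    return mean, axis


def project(vector, mean, axis):
    """Scalar key: coordinate of the vector along the principal axis."""
    return sum((x - m) * a for x, m, a in zip(vector, mean, axis))


def build_index(patterns, width):
    """Preprocessing: one sorted key per pattern (m keys in total)."""
    if any(len(p) < width for p in patterns):
        raise ValueError("every pattern must have at least `width` characters")
    vectors = [to_vector(p[:width]) for p in patterns]
    mean, axis = first_principal_axis(vectors)
    keyed = sorted((project(v, mean, axis), p) for v, p in zip(vectors, patterns))
    return mean, axis, keyed


def near_matches(text, index, width, tolerance):
    """Scan the text once; each window does one binary search over m keys."""
    mean, axis, keyed = index
    keys = [k for k, _ in keyed]
    hits = []
    for start in range(len(text) - width + 1):
        k = project(to_vector(text[start:start + width]), mean, axis)
        lo = bisect.bisect_left(keys, k - tolerance)
        hi = bisect.bisect_right(keys, k + tolerance)
        # Candidates only: keys within tolerance still need a full comparison.
        hits.extend((start, pattern) for _, pattern in keyed[lo:hi])
    return hits


if __name__ == "__main__":
    index = build_index(["hello", "world", "match"], width=5)
    print(near_matches("say hellp world", index, width=5, tolerance=1.5))
```

Because the axis has unit length, two segments whose Euclidean distance is at most `tolerance` always have keys within `tolerance` of each other, so under these assumptions the key comparison acts as a filter that never misses a near match but can return false candidates. With the segment width held constant, each of the roughly n text windows costs one binary search over m sorted keys, which is where a bound of the O(n log m) form would come from in a scheme of this kind.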

Cited by (18)

  • Fast k-nearest neighbors search using modified principal axis search tree

    2010, Digital Signal Processing: A Review Journal
  • PCA-Based algorithm for unsupervised bridge crack detection

    2006, Advances in Engineering Software
    Citation excerpt:

    This reduction is accomplished by transforming the original set of variables to a new set of variables that are uncorrelated and ordered by their significance, so that the first few variables retain most of the variation present in all of the original data. PCA has many applications in image understanding and pattern recognition that includes pattern matching [13,14], neural networks [11,15], speech analysis [16], visual learning [17,18], and active vision [19]. In feature recognition, PCA has been extensively used to identify face features [20].

  • An Improved VQ Codebook Search Algorithm Using Principal Component Analysis

    1997, Journal of Visual Communication and Image Representation