Abstract
Approximate matching is one of the fundamental problems in pattern matching and is ubiquitous in real applications. The Hamming distance is a simple and well-studied example of approximate matching, motivated by typing errors or noisy channels. Biological and image processing applications assign different values to mismatches of different symbols.
We consider the problem of approximate matching in the \(L_1\) metric: the \(k\)-\(L_1\)-distance problem. Given a text \(T = t_0, \ldots, t_{n-1}\) and a pattern \(P = p_0, \ldots, p_{m-1}\), both strings of natural numbers, and a natural number \(k\), we seek all text locations \(i\) where the \(L_1\) distance of the pattern from the length-\(m\) substring of the text starting at \(i\) is not greater than \(k\), i.e. \(\sum_{j=0}^{m-1} |t_{i+j} - p_j| \leq k\).
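As a baseline for the definition above, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it simply evaluates the sum at every alignment in \(O(nm)\) time. The function name `k_l1_matches` and the sample inputs are illustrative assumptions.

```python
from typing import List, Sequence


def k_l1_matches(text: Sequence[int], pattern: Sequence[int], k: int) -> List[int]:
    """Return every start index i with sum_j |text[i+j] - pattern[j]| <= k.

    Brute-force O(n*m) reference for the problem definition only;
    this is NOT the O(n*sqrt(k log k)) algorithm described in the paper.
    """
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):
        dist = 0
        for j in range(m):
            dist += abs(text[i + j] - pattern[j])
            if dist > k:  # this alignment already exceeds the threshold
                break
        else:
            # the inner loop finished without exceeding k
            matches.append(i)
    return matches


if __name__ == "__main__":
    # Hypothetical sample data, chosen only to exercise the definition.
    print(k_l1_matches([1, 2, 3, 4, 5], [2, 3, 5], k=1))
```

With the sample data, only the alignment at index 1 has \(L_1\) distance at most 1 (the distances at indices 0, 1, 2 are 4, 1, 2 respectively).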
We provide an algorithm that solves the \(k\)-\(L_1\)-distance problem in time \(O(n\sqrt{k\log k})\). The algorithm applies a bounded divide-and-conquer approach and makes novel uses of non-boolean convolutions.
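To illustrate what a non-boolean convolution computes in this setting, the sketch below evaluates \(\sum_{j=0}^{m-1} t_{i+j}\, p_j\) at every alignment \(i\), with numeric values rather than 0/1 indicators as operands. It only illustrates the primitive and makes no claim about how the paper combines such convolutions; the name `sliding_products` is an assumption, and the quadratic loop stands in for an FFT-based convolution, which yields the same values in \(O(n \log m)\) time.

```python
from typing import List, Sequence


def sliding_products(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return, for each alignment i, the sum over j of text[i+j] * pattern[j].

    Direct O(n*m) evaluation of the values an FFT-based convolution
    would produce; shown for illustration only, not the paper's algorithm.
    """
    n, m = len(text), len(pattern)
    return [
        sum(text[i + j] * pattern[j] for j in range(m))
        for i in range(n - m + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sample data.
    print(sliding_products([1, 2, 3, 4], [1, 2]))
```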
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amir, A., Lipsky, O., Porat, E., Umanski, J. (2005). Approximate Matching in the \(L_1\) Metric. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_9
DOI: https://doi.org/10.1007/11496656_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26201-5
Online ISBN: 978-3-540-31562-9
eBook Packages: Computer Science, Computer Science (R0)