skip to main content
column

Parameterized string matching: an application to software maintenance

Published: 11 May 2010 Publication History

Abstract

In the problem of parameterized string matching, a given pattern P is said to match with a sub-string t of the text T, if there exists a one-one correspondence between the symbols of P and the symbols of t. This problem has an important application in software maintenance, where it is often required to find equivalency between two sections of codes. Two sections of codes are said to be equivalent if one can be transformed into the other by renaming only identifiers and variables. In this paper, we propose two new algorithms for the said problem by using the q-gram approach. The first one is obtained by using this approach on an existing string matching algorithm (simplified backward non-deterministic directed acyclic word graph matching (SBNDM)).The second one is obtained by using the q-gram approach on the parameterized string matching algorithm (parameterized backward non-deterministic directed acyclic word graph matching (PBNDM)). Performance of both the algorithms is tested for various values of q and it has been observed that both show their best performance for q nearly equal to half of the pattern length. We also study the effect on running time of these algorithms with increasing the duplicity in the text.

References

[1]
Baeza-Yates R A, Gonnet G H. (1992): A new approach to text searching. Communication of ACM, 35(10), pp. 74--82.
[2]
Boyer R S, Moore J S. (1977): A fast string-searching algorithm. Communication of ACM, 20(10) : pp. 762--772.
[3]
Aho A V, Corasick M J. (1975): Efficient String Matching: An aid to bibliographic search. Communication of ACM 18(6), pp. 333--340.
[4]
Michailidis P D, Margaritis K G. (2001): On-line String matching Algorithms: Survey and Experimental Results. International Journal of Computer Mathematics, 76(4), pp. 411--434.
[5]
Navarro G, Raffinot M. (2001): Fast and Flexible String Matching by Combining Bit-parallelism and Suffix automata. ACM Journal of Experimental Algorithms, 5(4), pp. 1--36.
[6]
Baker, B. S., (1997): Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance, SIAM Journal of Computing 26(5), pp. 1343--1362.
[7]
Baker, B. S., (1996): Parameterized pattern matching: algorithms and applications. Journal of Computer and System Sciences, 52(1), pp. 28--42.
[8]
Amir, A., Farach, M., Muthukrishnan, S., (1994): Alphabet dependence in parameterized matching. Information Processing Letters, 49(3):pp. 111--115.
[9]
Fredriksson K, Mozgovoy M. (2006): Efficient parametrized string matching. In Information Processing Letters (IPL), 100(3): pp. 91--96.
[10]
Prasad R, Agarwal S. (2008): Parameterized Shift-And String Matching Algorithm using Super Alphabet. In proc. of International Conference on Computer and Communication (available on IEEE Xplore), Kuala Lumpur Malaysia, May 13-15, pp. 937--942.
[11]
Prasad R, Agarwal S. (2008): A New Parameterized String Matching Algorithm by Combining Bit-Parallelism and Suffix Automata. In proc. of IEEE 8th International Conference on Computer and Information Technology (available on IEEE Xplore), Sydney, Australia, July 8-11, pp.778--783.
[12]
Salmela L, Tarhio J. (2008): Fast Parameterized Matching with q-grams. Journal of Discrete Algorithms, 6(3), pp. 408--419.
[13]
Durian B, Holub J, Peltola H, Tarhio J. (2009): Tuning BNDM with q-grams. In proc. of Tength Workshops on Algorithm Engineering and Experiments, pp. 29--37.
[14]
Lecroq T. (2007): Fast exact string matching algorithms. Information Processing Letter, 102(6), pp. 229--235.
[15]
Fredriksson K. (2003): Shift-or string matching with super alphabets. In Information Processing Letters (IPL), 87(4), pp. 201--204.
[16]
Peltola H, Tarhio J. (2003): Alternative algorithm for bit-parallel string matching. In String Processing and Information Retrieval, 10th International Symposium, SPIRE 2003, Volume 2857, LNCS, pp. 80--94.

Cited By

View all
  • (2020)A General Approach to Uniformly Handle Different String Metrics Based on Heterogeneous AlphabetsIEEE Access10.1109/ACCESS.2020.29788938(45231-45243)Online publication date: 2020
  • (2018)A brief history of parameterized matching problemsDiscrete Applied Mathematics10.1016/j.dam.2018.07.017Online publication date: Aug-2018

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGSOFT Software Engineering Notes
ACM SIGSOFT Software Engineering Notes  Volume 35, Issue 3
May 2010
151 pages
ISSN:0163-5948
DOI:10.1145/1764810
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 May 2010
Published in SIGSOFT Volume 35, Issue 3

Check for updates

Author Tags

  1. algorithm
  2. bit-parallelism
  3. finite automata
  4. parameterized matching
  5. software maintenance

Qualifiers

  • Column

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 15 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2020)A General Approach to Uniformly Handle Different String Metrics Based on Heterogeneous AlphabetsIEEE Access10.1109/ACCESS.2020.29788938(45231-45243)Online publication date: 2020
  • (2018)A brief history of parameterized matching problemsDiscrete Applied Mathematics10.1016/j.dam.2018.07.017Online publication date: Aug-2018

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media