Software maintenance by multi-patterns parameterized string matching with q-gram

Published: 11 May 2010

Abstract

In the multi-pattern parameterized string matching problem, a pattern Pi from a set of patterns P0, P1, P2, ..., Pr-1 (r ≥ 1) is said to match a substring t of the text T if there exists a one-to-one correspondence between the symbols of the pattern and the symbols of t. This problem has an important application in software maintenance, where it is often necessary to decide whether two sections of code are equivalent. Two sections of code are said to be equivalent if one can be transformed into the other by renaming only identifiers and variables. In this paper, we extend the Forward Non-deterministic Directed Acyclic Word Graph (DAWG) Matching (FNDM) algorithm to PQFNDM, which solves the parameterized string matching problem using q-grams. Experiments show that the performance of PQFNDM improves as q increases, up to half the length of the pattern. We further modify PQFNDM to MPQFNDM to handle multiple patterns. We compare the performance of PQFNDM (for q = 1) with the parameterized shift-or (PSO) algorithm and find that PQFNDM performs better than PSO. We also show the benefits of searching for multiple patterns together.
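
To make the matching condition concrete, the following is a minimal Python sketch of parameterized matching through prev-encoding (one of the author tags listed for this article). It is not the PQFNDM or MPQFNDM algorithm from the paper: it is a naive window-by-window scan, and the function names `prev_encode` and `p_match_positions`, as well as the optional `fixed` set of symbols that may not be renamed, are assumptions made for illustration only. Two strings of equal length are in one-to-one symbol correspondence exactly when their prev-encodings are equal.

```python
def prev_encode(s, fixed=frozenset()):
    """Prev-encode s: each renamable symbol becomes the distance to its
    previous occurrence (0 for a first occurrence). Symbols in `fixed`
    (hypothetical parameter: symbols that must match literally) are kept."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in fixed:
            encoded.append(ch)
        else:
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
    return tuple(encoded)


def p_match_positions(text, patterns, fixed=frozenset()):
    """Naive multi-pattern scan (illustration only, not MPQFNDM).
    Returns (pattern_index, start) pairs where the text window and the
    pattern have identical prev-encodings."""
    hits = []
    for k, pattern in enumerate(patterns):
        m = len(pattern)
        target = prev_encode(pattern, fixed)
        for start in range(len(text) - m + 1):
            # The window is re-encoded from scratch so that distances
            # never refer to positions before the window start.
            if prev_encode(text[start:start + m], fixed) == target:
                hits.append((k, start))
    return hits


if __name__ == "__main__":
    # "abab" matches "xyxy" under the renaming a->x, b->y;
    # "cc" matches "zz" under c->z.
    print(p_match_positions("xyxyzz", ["abab", "cc"]))
```

The scan re-encodes every window, so it costs O(m) work per text position and per pattern; the bit-parallel, q-gram-based approach described in the abstract exists precisely to avoid that cost.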

Published In

ACM SIGSOFT Software Engineering Notes  Volume 35, Issue 3
May 2010
151 pages
ISSN:0163-5948
DOI:10.1145/1764810

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. algorithm
  2. finite automata
  3. multiple patterns
  4. parameterized matching
  5. prev-encoding
  6. software maintenance

Qualifiers

  • Column
