Abstract
In this paper, we consider the problem of enumerating all maximal motifs in an input string for the class of repeated motifs with wild cards. A maximal motif is a representative motif that is not properly contained in any larger motif with the same location list. Although the enumeration problem for maximal motifs with wild cards has been studied in (Parida et al., CPM’01), (Pisanti et al., MFCS’03) and (Pelfrêne et al., CPM’03), its output-polynomial time computability is still open. The main result of this paper is a polynomial space, polynomial delay algorithm for the maximal motif enumeration problem for repeated motifs with wild cards. This algorithm enumerates all maximal motifs in an input string of length n in O(n^3) time per motif, with O(n^2) space and O(n^3) delay. The key to the algorithm is a depth-first search on a tree-shaped search route over all maximal motifs, based on a technique called prefix-preserving closure extension. We also show an exponential lower bound and a succinctness result on the number of maximal motifs, which indicate the limit of a straightforward approach.
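To make the notions of location list and maximality concrete, the following is a minimal brute-force sketch, not the polynomial delay algorithm of the paper. It assumes that "." denotes the wild card, that a motif begins and ends with a letter, and that a motif is maximal exactly when it equals the pattern obtained by aligning all of its occurrences and keeping every position on which they agree. The names `WILDCARD`, `locations`, `closure` and `is_maximal` are hypothetical and chosen for illustration only; any frequency threshold on motifs is ignored.

```python
WILDCARD = "."  # assumed wild card symbol; matches any single letter


def locations(pattern, text):
    """Return the location list: all start positions where pattern occurs in text."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == WILDCARD or p == text[i + j] for j, p in enumerate(pattern))
    ]


def closure(pattern, text):
    """Align all occurrences of pattern and keep the letters on which they agree.

    Relative offsets range over every shift that stays inside the text for
    all occurrences, so the result may extend the pattern to the left or right.
    """
    locs = locations(pattern, text)
    if not locs:
        return pattern  # no occurrence: nothing to align
    lo = -min(locs)            # leftmost offset valid for every occurrence
    hi = len(text) - max(locs)  # exclusive right bound on the offset
    cells = []
    for d in range(lo, hi):
        letters = {text[i + d] for i in locs}
        cells.append(letters.pop() if len(letters) == 1 else WILDCARD)
    # A motif starts and ends with a letter, so trim outer wild cards.
    return "".join(cells).strip(WILDCARD)


def is_maximal(pattern, text):
    """A motif is treated as maximal when the closure adds no further letters."""
    return closure(pattern, text) == pattern
```

For example, in the string `axbayb` the motif `a.b` occurs at positions 0 and 3 and is maximal under this sketch, whereas `b` is not, because both of its occurrences are preceded by `a` two positions earlier, so its closure is `a.b`.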
References
Apostolico, A., Parida, L.: Compression and the wheel of fortune. In: Proc. the 2003 Data Compression Conference (DCC 2003). IEEE, Los Alamitos (2003)
Arimura, H., Uno, T.: A polynomial space polynomial delay algorithm for enumeration of maximal motifs in a sequence. Technical Report Series A, TCS-TR-A-05-6, Division of Computer Science, Hokkaido University (July 2005), http://www-alg.ist.hokudai.ac.jp/tra.html
Arimura, H., Uno, T.: An output-polynomial time algorithm for mining frequent closed attribute trees. In: Kramer, S., Pfahringer, B. (eds.) ILP 2005. LNCS (LNAI), vol. 3625, pp. 1–19. Springer, Heidelberg (2005)
Arimura, H., Shinohara, T., Otsuki, S.: Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data. In: Enjalbert, P., Mayr, E.W., Wagner, K.W. (eds.) STACS 1994. LNCS, vol. 775, pp. 649–660. Springer, Heidelberg (1994)
Boros, E., Gurvich, V., Khachiyan, L., Makino, K.: The complexity of generating maximal frequent and minimal infrequent sets. In: Kunii, T.L., Yao, S.B. (eds.) Data Base Design Techniques 1979. LNCS, vol. 133, pp. 133–141. Springer, Heidelberg (1982)
Crochemore, M., Rytter, W.: Jewels of Stringology. World Scientific, Singapore (2002)
Goldberg, L.A.: Polynomial space polynomial delay algorithms for listing families of graphs. In: Proc. the 25th STOC, pp. 218–225. ACM, New York (1993)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press, Cambridge (1997)
Parida, L., Rigoutsos, I., Floratos, A., Platt, D., Gao, Y.: Pattern discovery on character sets and real-valued data: linear bound on irredundant motifs and efficient polynomial time algorithm. In: Proc. the 11th SIAM Symposium on Discrete Algorithms (SODA 2000), pp. 297–308 (2000)
Parida, L., Rigoutsos, I., Platt, D.E.: An Output-Sensitive Flexible Pattern Discovery Algorithm. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 131–142. Springer, Heidelberg (2001)
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering Frequent Closed Itemsets for Association Rules. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 398–416. Springer, Heidelberg (1998)
Pelfrêne, J., Abdeddaïm, S., Alexandre, J.: Extending Approximate Patterns. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 328–347. Springer, Heidelberg (2003)
Pisanti, N., Crochemore, M., Grossi, R., Sagot, M.-F.: A basis of tiling motifs for generating repeated patterns and its complexity for higher quorum. In: Rovan, B., Vojtáš, P. (eds.) MFCS 2003. LNCS, vol. 2747, pp. 622–631. Springer, Heidelberg (2003)
Pisanti, N., Crochemore, M., Grossi, R., Sagot, M.-F.: A comparative study of bases for motif inference. In: String Algorithmics. KCL publications (2004)
Uno, T.: Two general methods to reduce delay and change of enumeration algorithms. NII Technical Report, NII-2003-004E (April 2003)
Uno, T., Asai, T., Uchida, Y., Arimura, H.: An efficient algorithm for enumerating closed patterns in transaction databases. In: Suzuki, E., Arikawa, S. (eds.) DS 2004. LNCS (LNAI), vol. 3245, pp. 16–31. Springer, Heidelberg (2004)
Yan, X., Han, J.: CloseGraph: Mining Closed Frequent Graph Patterns. In: Proc. SIGKDD 2003 (2003)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arimura, H., Uno, T. (2005). A Polynomial Space and Polynomial Delay Algorithm for Enumeration of Maximal Motifs in a Sequence. In: Deng, X., Du, DZ. (eds) Algorithms and Computation. ISAAC 2005. Lecture Notes in Computer Science, vol 3827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602613_73
DOI: https://doi.org/10.1007/11602613_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30935-2
Online ISBN: 978-3-540-32426-3
eBook Packages: Computer Science, Computer Science (R0)