Mining Maximal Flexible Patterns in a Sequence

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4914)

Abstract

We consider the problem of enumerating all maximal flexible patterns in an input sequence database, where a maximal pattern (also called a closed pattern) is the most specific pattern in the equivalence class of patterns that have the same list of occurrences in the input. Since our notion of maximal patterns is based on position occurrences, it is weaker than the traditional notion of maximal patterns based on document occurrences. Based on the framework of reverse search, we present MaxFlex, an efficient depth-first search algorithm that enumerates all maximal flexible patterns in a given sequence database without duplicates in \(O(||{\mathcal T}|| \times |\Sigma|)\) time per pattern and \(O(||{\mathcal T}||)\) space, where \(||{\mathcal T}||\) is the size of the input sequence database \(\mathcal T\) and \(|\Sigma|\) is the size of the alphabet over which the sequences are defined. Hence the enumeration problem for maximal flexible patterns is solvable with polynomial delay and in polynomial space.
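
The notion of an occurrence list can be illustrated with a small sketch. It is not the MaxFlex algorithm and does not reproduce the paper's formal definitions. It assumes, for illustration only, that a flexible pattern is a list of non-empty segments separated by gaps that match zero or more symbols, and that an occurrence is identified by the pair (sequence index, start position). The function name `occurrences` and this representation are hypothetical.

```python
from typing import List, Tuple


def occurrences(pattern: List[str], database: List[str]) -> List[Tuple[int, int]]:
    """Return (sequence index, start position) pairs where the pattern matches.

    Assumption (not taken from the paper): `pattern` is a list of non-empty
    segments, and a gap matching zero or more symbols sits between
    consecutive segments.
    """
    found = []
    first = pattern[0]
    for i, text in enumerate(database):
        for start in range(len(text) - len(first) + 1):
            if not text.startswith(first, start):
                continue
            pos = start + len(first)
            matched = True
            for segment in pattern[1:]:
                # Placing each segment at its leftmost feasible position is
                # enough to decide whether some match exists from `start`.
                hit = text.find(segment, pos)
                if hit < 0:
                    matched = False
                    break
                pos = hit + len(segment)
            if matched:
                found.append((i, start))
    return found


database = ["ABXCABYC"]
general = occurrences(["A", "C"], database)    # pattern A*C
specific = occurrences(["AB", "C"], database)  # pattern AB*C
narrower = occurrences(["AB", "XC"], database) # pattern AB*XC
```

Under these assumptions, `A*C` and `AB*C` have the same occurrence list, so they fall into the same equivalence class and the less specific `A*C` would not be reported as maximal, whereas `AB*XC` occurs at fewer positions and belongs to a different class.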

This research was partly supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Specially Promoted Research, 17002008, 2007 on “semi-structured data mining”, and 18017015, 2007 on “developing high-speed high-quality algorithms for analyzing huge genome database”.

Editor information

Ken Satoh Akihiro Inokuchi Katashi Nagao Takahiro Kawamura

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Arimura, H., Uno, T. (2008). Mining Maximal Flexible Patterns in a Sequence. In: Satoh, K., Inokuchi, A., Nagao, K., Kawamura, T. (eds) New Frontiers in Artificial Intelligence. JSAI 2007. Lecture Notes in Computer Science, vol 4914. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78197-4_29

  • DOI: https://doi.org/10.1007/978-3-540-78197-4_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78196-7

  • Online ISBN: 978-3-540-78197-4

  • eBook Packages: Computer Science, Computer Science (R0)
