Compact Representations of Sequential Classification Rules

Part of the book series: Studies in Computational Intelligence (SCI, volume 118)

Summary

In this chapter we address the problem of mining sequential classification rules. High support thresholds may yield an excessively small rule set, whereas the solution set rapidly becomes huge as the support threshold decreases. In the latter case, the extraction process becomes time-consuming (or infeasible), and the generated model is too complex for human analysis.
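The effect of the support threshold can be illustrated with a minimal Python sketch. It assumes a hypothetical toy database of four item sequences and brute-force enumeration of subsequences. Neither the data nor the function names come from the chapter, and class labels are omitted, so it counts frequent sequential patterns rather than full classification rules.

```python
from itertools import combinations

# Hypothetical toy database of item sequences (not taken from the chapter).
SEQUENCES = [
    ("a", "b", "c", "d"),
    ("a", "c", "d"),
    ("b", "c", "a"),
    ("a", "b", "d"),
]


def subsequences(seq):
    """All distinct non-empty subsequences of `seq` (order kept, gaps allowed)."""
    return {
        tuple(seq[i] for i in idx)
        for size in range(1, len(seq) + 1)
        for idx in combinations(range(len(seq)), size)
    }


def frequent_patterns(sequences, min_support):
    """Map each pattern to its support, keeping those with support >= min_support.

    Support is the number of input sequences that contain the pattern.
    """
    counts = {}
    for seq in sequences:
        # The set returned by subsequences() ensures each input sequence
        # contributes at most once to the support of a pattern.
        for pattern in subsequences(seq):
            counts[pattern] = counts.get(pattern, 0) + 1
    return {p: n for p, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for threshold in (4, 3, 2, 1):
        found = frequent_patterns(SEQUENCES, threshold)
        print(f"min_support={threshold}: {len(found)} frequent patterns")
```

On this toy input the number of frequent patterns grows from 1 at a threshold of 4 to 5, 12, and 18 as the threshold drops to 3, 2, and 1. Brute-force enumeration is exponential in the sequence length and is used here only to keep the sketch short.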

We propose two compact forms to encode the knowledge available in a sequential classification rule set. These forms are based on the abstractions of general rule, specialistic rule, and complete compact rule. The compact forms are obtained by extending the concepts of closed itemset and generator itemset to the context of sequential rules. Experimental results show that both proposed forms achieve a significant compression ratio.
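As background for the closed itemset and generator itemset concepts mentioned above, the following minimal Python sketch computes both on a hypothetical toy transaction set. It covers only the classic (unordered) itemset definitions: an itemset is closed if no proper superset has the same support, and it is a generator if no proper subset has the same support. It is not the chapter's extension to sequential rules, and the data and function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (not taken from the chapter).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def all_itemsets(transactions):
    """Every itemset over the items in use, including the empty itemset."""
    items = sorted(set().union(*transactions))
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def closed_and_generators(transactions, min_support=1):
    """Return (frequent itemsets with support, closed itemsets, generators)."""
    frequent = {}
    for itemset in all_itemsets(transactions):
        count = support(itemset, transactions)
        if count >= min_support:
            frequent[itemset] = count
    # Closed: no proper superset has the same support.
    closed = {
        s for s, n in frequent.items()
        if not any(s < t and frequent[t] == n for t in frequent)
    }
    # Generator: no proper subset has the same support. Because support never
    # increases when items are added, checking subsets with one item removed
    # is sufficient.
    generators = {
        s for s, n in frequent.items()
        if not any(support(s - {i}, transactions) == n for i in s)
    }
    return frequent, closed, generators


if __name__ == "__main__":
    frequent, closed, generators = closed_and_generators(TRANSACTIONS)
    print(len(frequent), "frequent itemsets")
    print("closed:    ", sorted("".join(sorted(s)) for s in closed))
    print("generators:", sorted("".join(sorted(s)) for s in generators))
```

On this input, 8 frequent itemsets (counting the empty itemset) reduce to 4 closed itemsets and 4 generators, which gives a small-scale picture of the compression that such representations aim for.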

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Baralis, E., Chiusano, S., Dutto, R., Mantellini, L. (2008). Compact Representations of Sequential Classification Rules. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_1

  • DOI: https://doi.org/10.1007/978-3-540-78488-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78487-6

  • Online ISBN: 978-3-540-78488-3

  • eBook Packages: Engineering, Engineering (R0)
