Supervised Pattern Mining and Applications to Classification

Zimmermann, Albrecht; Nijssen, Siegfried

doi:10.1007/978-3-319-07821-2_17

Abstract

In this chapter we describe the use of patterns in the analysis of supervised data. We survey the different settings for finding patterns as well as sets of patterns. The pattern mining settings are categorized according to whether they include class labels as attributes in the data or whether they partition the data based on these labels. The pattern set mining settings are categorized along several dimensions, including whether they perform iterative mining or post-processing, operate globally or locally, and whether they use patterns directly or indirectly for prediction.
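The two pattern mining settings named in the abstract can be contrasted on a toy example. The sketch below is illustrative only and is not taken from the chapter: the data set `DATA`, the function names, and the choice of rule confidence and per-class support as scores are all assumptions. In the first setting the class label is added to each transaction as one more item; in the second the transactions are partitioned by label and the pattern is scored in each partition separately.

```python
from collections import defaultdict

# Hypothetical toy data: each row is (set of items, class label).
DATA = [
    ({"a", "b"}, "pos"),
    ({"a", "c"}, "pos"),
    ({"a", "b", "c"}, "pos"),
    ({"b", "c"}, "neg"),
    ({"c"}, "neg"),
    ({"a", "c"}, "neg"),
]


def support(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def rule_confidence(pattern, label, data):
    """Setting 1 (assumed form): the label is an extra item in each row.

    Scores the rule `pattern -> label` by its confidence, i.e. the support
    of pattern plus label item divided by the support of the pattern alone.
    """
    augmented = [items | {"class=" + y} for items, y in data]
    body = support(pattern, augmented)
    if body == 0:
        return 0.0
    return support(pattern | {"class=" + label}, augmented) / body


def partition_supports(pattern, data):
    """Setting 2 (assumed form): split rows by label, score each part."""
    parts = defaultdict(list)
    for items, y in data:
        parts[y].append(items)
    return {y: support(pattern, rows) for y, rows in parts.items()}


if __name__ == "__main__":
    pattern = {"a"}
    print(rule_confidence(pattern, "pos", DATA))
    print(partition_supports(pattern, DATA))
```

Both views start from the same counts. The difference lies in whether the label takes part in the pattern language or only defines the groups whose supports are compared.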

Author information

Authors and Affiliations

INSA Lyon, LIRIS CNRS UMR 5205, Bâtiment Blaise Pascal, 69621, Villeurbanne, CEDEX, France
Albrecht Zimmermann
KU Leuven, Celestijnenlaan 200 A, 3001, Leuven, Belgium
Siegfried Nijssen
Universiteit Leiden, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Siegfried Nijssen

Corresponding author

Correspondence to Albrecht Zimmermann.

Editor information

Editors and Affiliations

IBM, Yorktown Heights, New York, USA
Charu C. Aggarwal
University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
Jiawei Han

About this chapter

Cite this chapter

Zimmermann, A., Nijssen, S. (2014). Supervised Pattern Mining and Applications to Classification. In: Aggarwal, C., Han, J. (eds) Frequent Pattern Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-07821-2_17

DOI: https://doi.org/10.1007/978-3-319-07821-2_17
Published: 30 August 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07820-5
Online ISBN: 978-3-319-07821-2
eBook Packages: Computer Science, Computer Science (R0)
