Abstract
Interesting pattern discovery is an important topic in data mining research. Many definitions have been proposed to describe what makes a pattern interesting. Among these, unexpectedness has been shown to be a highly promising measure. Mining unexpected patterns allows one to identify a failing in prior knowledge and may suggest an aspect of the data that deserves further investigation. Unexpected patterns are typically mined using belief-driven methods, but these require an established belief system. Prior studies have manually built partial belief systems in order to apply their methods, and such systems remain laborious to create. In this study, we propose a novel approach that automatically detects beliefs from data, which can in turn be used to reveal unexpected patterns. Central to this approach is a clustering-based method in which clusters represent beliefs and outliers are potential unexpected patterns. We also propose a pattern representation that captures the semantic relation between patterns rather than their lexical difference. An experimental evaluation on several datasets and a comparison with other methods demonstrate the effectiveness of the proposed method, as well as the relevance of the discovered patterns.
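The idea that clusters act as beliefs and outliers as candidate unexpected patterns can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple density-based clustering in the style of DBSCAN, and the rule names, the two-dimensional rule vectors, and the parameters eps and min_pts are all hypothetical values chosen for illustration. In particular, the vectors only stand in for whatever semantic representation of rules is actually used.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN-style clustering: one label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical rules with made-up 2-D vectors (assumption, not from the paper).
rules = {
    "A -> B": (0.90, 0.10),
    "A, C -> B": (0.88, 0.12),
    "A, D -> B": (0.92, 0.08),
    "E -> F": (0.10, 0.90),
    "E, G -> F": (0.12, 0.88),
    "E, H -> F": (0.08, 0.92),
    "A, E -> not B": (0.50, 0.50),
}

names = list(rules)
labels = dbscan(list(rules.values()), eps=0.1, min_pts=3)

# Each cluster is read as one belief; noise points are candidate unexpected patterns.
beliefs = {}
for name, label in zip(names, labels):
    if label != NOISE:
        beliefs.setdefault(label, []).append(name)
unexpected = [name for name, label in zip(names, labels) if label == NOISE]

print("beliefs:", beliefs)
print("candidate unexpected patterns:", unexpected)
```

In this toy setting the two dense groups of similar rules form two beliefs, and the isolated rule "A, E -> not B" is reported as a candidate unexpected pattern. How well such a scheme works in practice depends on the distance between rule representations, which the sketch deliberately leaves abstract.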
Supported by Universiteit Antwerpen under BOF docpro grant.
Cite this article
Bui-Thi, D., Meysman, P. & Laukens, K. Clustering association rules to build beliefs and discover unexpected patterns. Appl Intell 50, 1943–1954 (2020). https://doi.org/10.1007/s10489-020-01651-1
DOI: https://doi.org/10.1007/s10489-020-01651-1