Summary
Mining significantly relevant patterns that involve multiple values is a difficult but important problem that spans many disciplines. Multi-value association patterns, which generalize sequentially ordered patterns, are sets of associated values extracted from the sampling outcomes of a random N-tuple. Because they are patterns of values drawn from multiple variables, they are more specifically defined than their corresponding variable patterns, and they are easier to interpret. Normally, they are detected by statistical testing: a pattern is reported when the occurrence of its event deviates significantly from what is expected under a prior model or null hypothesis. When the null hypothesis presumes the values of a pattern to be independent, the alternative hypothesis asserts that the values as a whole are associated, which still allows some values within the detected set to be independent of one another. Recently, a special type of multi-value association pattern has been proposed, which we call the nested high-order pattern (NHOP) and which is a subtype of the high-order pattern (HOP). We discuss these patterns here, together with a related one called the consigned pattern (CP). Evaluations using experiments on synthetic and biomolecular data are also included.
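The detection idea in the summary, comparing the observed occurrence of a pattern event with its expected occurrence under an independence null hypothesis, can be illustrated with a minimal sketch. The function name `standardized_residual`, the toy data, and the use of a plain standardized residual are assumptions made for illustration; the chapter itself may use a different statistic (for example an adjusted residual) and a different significance threshold.

```python
from collections import Counter
from math import prod, sqrt


def standardized_residual(samples, pattern):
    """Compare observed and expected counts of a multi-value pattern.

    samples: list of equal-length tuples (outcomes of a random N-tuple).
    pattern: dict mapping a variable index to the value it must take.
    Returns (observed, expected, z), where expected assumes the values
    in the pattern are mutually independent (hypothetical helper).
    """
    n = len(samples)
    # Observed count: outcomes in which every value of the pattern occurs.
    observed = sum(
        all(row[i] == v for i, v in pattern.items()) for row in samples
    )
    # Marginal relative frequency of each value within its own variable.
    marginals = [
        Counter(row[i] for row in samples)[v] / n for i, v in pattern.items()
    ]
    # Expected count under independence: n times the product of marginals.
    expected = n * prod(marginals)
    z = (observed - expected) / sqrt(expected) if expected > 0 else float("nan")
    return observed, expected, z


# Toy data (assumed): two variables whose values tend to co-occur.
samples = (
    [("A", "X")] * 40 + [("A", "Y")] * 10
    + [("B", "X")] * 10 + [("B", "Y")] * 40
)
obs, exp, z = standardized_residual(samples, {0: "A", 1: "X"})
print(obs, exp, z)
```

A large positive `z` suggests that the pattern occurs more often than independence would predict. Which cutoff counts as significant depends on the test chosen and is not specified here.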
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lui, T.W.H., Chiu, D.K.Y. (2009). Multi-value Association Patterns and Data Mining. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_8
DOI: https://doi.org/10.1007/978-3-642-01091-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01090-3
Online ISBN: 978-3-642-01091-0
eBook Packages: Engineering, Engineering (R0)