Contrasting Correlations by an Efficient Double-Clique Condition

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6871)

Abstract

Contrast set mining has been well studied as a way to detect changes between contrasted databases. Previous studies compared the supports of an itemset and extracted the itemsets whose supports differ significantly across those databases. In contrast, we compare the correlations of an itemset between two contrasted databases and try to detect potential changes. Highly correlated itemsets are outside our concern, because we focus on implicitly emerging correlations. We therefore set correlation constraints (upper bounds) in both databases and extract the itemsets whose items are not highly correlated in either database but whose correlation changes significantly from one database to the other. We consider both positive and negative correlation. We also consider the correlation of itemsets conditioned on third variables, so that so-called partial correlation is covered as well. To cover these notions of correlation, we use extended mutual information. In our search procedure for the correlated itemsets, we use a double-clique condition, which is a necessary condition for an itemset to be a solution satisfying the correlation constraints. We show its usefulness through experiments.
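The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It makes several assumptions: plain pairwise mutual information stands in for the paper's extended mutual information, the upper bound is applied to item pairs, and the double-clique condition is read as "every pair of items in the itemset is weakly correlated in both databases". All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import log2


def mutual_information(db, a, b):
    """Mutual information (in bits) between the presence of items a and b.

    db is a list of transactions, each a set of items.
    Assumption: this is ordinary pairwise MI, not the extended measure.
    """
    n = len(db)
    mi = 0.0
    for va in (True, False):
        for vb in (True, False):
            joint = sum(1 for t in db if (a in t) == va and (b in t) == vb) / n
            pa = sum(1 for t in db if (a in t) == va) / n
            pb = sum(1 for t in db if (b in t) == vb) / n
            if joint > 0:
                mi += joint * log2(joint / (pa * pb))
    return mi


def weak_pairs(db, items, upper):
    """Item pairs whose mutual information in db is below the upper bound."""
    return {frozenset(p) for p in combinations(sorted(items), 2)
            if mutual_information(db, *p) < upper}


def is_double_clique(itemset, edges_a, edges_b):
    """True if every pair in itemset is an edge in both weak-pair graphs."""
    return all(frozenset(p) in edges_a and frozenset(p) in edges_b
               for p in combinations(sorted(itemset), 2))


def correlation_change(db_a, db_b, a, b):
    """Absolute change in pairwise mutual information between two databases."""
    return abs(mutual_information(db_a, a, b) - mutual_information(db_b, a, b))


# Hypothetical toy data: x and y are independent in db_a, identical in db_b.
db_a = [{"x", "y"}, {"x"}, {"y"}, set()]
db_b = [{"x", "y"}, {"x", "y"}, set(), set()]
items = {"x", "y"}

# With a strict bound the pair is too strongly correlated in db_b, so it is pruned.
strict = is_double_clique(items, weak_pairs(db_a, items, 0.5),
                          weak_pairs(db_b, items, 0.5))
# With a looser bound the pair passes in both databases and stays a candidate.
loose = is_double_clique(items, weak_pairs(db_a, items, 1.5),
                         weak_pairs(db_b, items, 1.5))
```

Under this reading, the clique test serves only as a cheap filter: itemsets that fail it cannot satisfy the upper bounds in both databases, and those that pass still need their correlation change checked, for example with `correlation_change`.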

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, A., Haraguchi, M., Okubo, Y. (2011). Contrasting Correlations by an Efficient Double-Clique Condition. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2011. Lecture Notes in Computer Science, vol 6871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23199-5_35

  • DOI: https://doi.org/10.1007/978-3-642-23199-5_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23198-8

  • Online ISBN: 978-3-642-23199-5

  • eBook Packages: Computer Science (R0)
