Subjective Interestingness in Exploratory Data Mining

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8207)

Abstract

Exploratory data mining aims to assist a user in improving their understanding of the data. Given this aim, it seems self-evident that optimizing this process requires considering the user as well as the data. Yet the vast majority of exploratory data mining methods (including most methods for clustering, itemset and association rule mining, subgroup discovery, dimensionality reduction, etc.) formalize the interestingness of patterns in an objective manner, disregarding the user altogether. More often than not, this leads to subjectively uninteresting patterns being reported.

Here I will discuss a general mathematical framework for formalizing interestingness in a subjective manner. I will further demonstrate how it can be successfully instantiated for a variety of exploratory data mining problems. Finally, I will highlight some connections to other work, and outline some of the challenges and research opportunities ahead.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Bie, T. (2013). Subjective Interestingness in Exploratory Data Mining. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds) Advances in Intelligent Data Analysis XII. IDA 2013. Lecture Notes in Computer Science, vol 8207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41398-8_3

  • DOI: https://doi.org/10.1007/978-3-642-41398-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41397-1

  • Online ISBN: 978-3-642-41398-8

  • eBook Packages: Computer Science, Computer Science (R0)
