Constraint-Based Mining of Fault-Tolerant Patterns from Boolean Data

  • Conference paper
Knowledge Discovery in Inductive Databases (KDID 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3933)

Abstract

Thanks to substantial research effort over the last few years, inductive queries on local patterns (e.g., set patterns) and their associated complete solvers have proved extremely useful in supporting knowledge discovery. The more we use such queries on real-life data, e.g., biological data, the more we are convinced that inductive queries should return fault-tolerant patterns. This is clearly the case for formal concept discovery from noisy datasets. We therefore study various extensions of formal concepts, a particular kind of bi-set, towards fault tolerance. We compare three declarative specifications of fault-tolerant bi-sets by means of a constraint-based mining approach. Our framework gives a better understanding of the trade-off needed between extraction feasibility, completeness, relevance, and ease of interpretation of these fault-tolerant patterns. An original empirical evaluation on both synthetic and real-life medical data is given. It enables a comparison of the various proposals and motivates further directions of research.
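
To make the vocabulary of the abstract concrete, the following minimal sketch shows what a formal concept is in Boolean data and why a single missing value can break it. The toy data, the function names, and the "maximum number of exceptions per row or column" measure are assumptions made for illustration; they are not necessarily any of the three specifications compared in the paper.

```python
# Toy Boolean data: each object (row) maps to the set of items (columns) it has.
# The data is hypothetical and chosen only for illustration.
DATA = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "b"},  # "c" is missing, possibly because of noise
    "o4": {"d"},
}
ITEMS = {"a", "b", "c", "d"}


def common_items(objects):
    """Items shared by every object in `objects`."""
    result = set(ITEMS)
    for o in objects:
        result &= DATA[o]
    return result


def supporting_objects(items):
    """Objects that contain every item in `items`."""
    return {o for o, row in DATA.items() if items <= row}


def is_formal_concept(objects, items):
    """A bi-set (objects, items) is a formal concept when each side
    is exactly the closure of the other."""
    return common_items(objects) == items and supporting_objects(items) == objects


def max_exceptions(objects, items):
    """Largest number of missing (0) cells on any row or column of the bi-set.
    This relaxation measure is an assumption made for this sketch."""
    per_row = [len(items - DATA[o]) for o in objects]
    per_col = [sum(1 for o in objects if i not in DATA[o]) for i in items]
    return max(per_row + per_col, default=0)
```

In this toy data, the bi-set ({o1, o2, o3}, {a, b, c}) is not a formal concept because o3 lacks c, yet it contains at most one exception on any row or column. A fault-tolerant specification that accepts a small, bounded number of such exceptions would keep it as a single pattern instead of splitting it into smaller exact concepts.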

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Besson, J., Pensa, R.G., Robardet, C., Boulicaut, JF. (2006). Constraint-Based Mining of Fault-Tolerant Patterns from Boolean Data. In: Bonchi, F., Boulicaut, JF. (eds) Knowledge Discovery in Inductive Databases. KDID 2005. Lecture Notes in Computer Science, vol 3933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733492_4

  • DOI: https://doi.org/10.1007/11733492_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33292-3

  • Online ISBN: 978-3-540-33293-0

  • eBook Packages: Computer Science, Computer Science (R0)
