Tiling Databases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3245)

Abstract

In this paper, we consider 0/1 databases and provide an alternative way of extracting knowledge from such databases using tiles. A tile is a region in the database consisting solely of ones. The interestingness of a tile is measured by the number of ones it consists of, i.e., its area. We present an efficient method for extracting all tiles with area at least a given threshold.
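The definitions of a tile and its area can be illustrated with a minimal Python sketch. The representation used here (a database as a list of 0/1 rows, a tile as a set of row indices paired with a set of column indices) and the function names `is_tile` and `tile_area` are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product

def is_tile(db, rows, cols):
    """Return True if every cell db[r][c], for r in rows and c in cols, is 1."""
    return all(db[r][c] == 1 for r, c in product(rows, cols))

def tile_area(rows, cols):
    """Area of a tile: the number of cells (all ones) it consists of."""
    return len(rows) * len(cols)

# Hypothetical toy 0/1 database: 3 rows (transactions), 3 columns (items).
db = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

# Rows {0, 1} and columns {0, 1} form a region consisting solely of ones.
print(is_tile(db, [0, 1], [0, 1]), tile_area([0, 1], [0, 1]))
# Rows {0, 2} and column {0} do not, because db[2][0] is 0.
print(is_tile(db, [0, 2], [0]))
```

Under this representation, extracting all tiles with area at least a threshold amounts to enumerating the row/column index pairs for which `is_tile` holds and `tile_area` meets the threshold; the sketch only checks a given candidate and does not implement the efficient extraction method the abstract refers to.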

A collection of tiles constitutes a tiling. We regard tilings that have a large area and consist of a small number of tiles as appealing summaries of the large database. We analyze the computational complexity of several algorithmic tasks related to finding such tilings. We develop an approximation algorithm for finding tilings which approximates the optimal solution within reasonable factors. We present a preliminary experimental evaluation on real data sets.
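The trade-off between large area and few tiles can be sketched with a generic greedy, set-cover-style heuristic. This is an illustration under stated assumptions, not the approximation algorithm of the paper: the abstract does not specify that algorithm, the area of a tiling is assumed here to be the number of distinct cells covered by its tiles, and the names `cells`, `tiling_area`, and `greedy_tiling` are hypothetical.

```python
def cells(tile):
    """Set of (row, column) cells covered by a tile given as (rows, cols)."""
    rows, cols = tile
    return {(r, c) for r in rows for c in cols}

def tiling_area(tiling):
    """Assumed definition: number of distinct cells covered by the tiles."""
    covered = set()
    for tile in tiling:
        covered |= cells(tile)
    return len(covered)

def greedy_tiling(candidates, k):
    """Pick at most k tiles, each time taking the candidate that covers
    the most cells not yet covered (generic greedy heuristic)."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(candidates, key=lambda t: len(cells(t) - covered), default=None)
        if best is None or not (cells(best) - covered):
            break  # no candidate adds any new cell
        chosen.append(best)
        covered |= cells(best)
    return chosen

# Hypothetical candidate tiles for a 3x3 database, as (rows, cols) pairs.
candidates = [
    ((0, 1), (0, 1)),
    ((1, 2), (1, 2)),
    ((0, 1, 2), (1,)),
]

tiling = greedy_tiling(candidates, k=2)
print(tiling, tiling_area(tiling))
```

The candidate tiles are supplied by the caller; in practice they would come from a tile-extraction step such as the one the first paragraph of the abstract describes.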

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Geerts, F., Goethals, B., Mielikäinen, T. (2004). Tiling Databases. In: Suzuki, E., Arikawa, S. (eds) Discovery Science. DS 2004. Lecture Notes in Computer Science, vol 3245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30214-8_22

  • DOI: https://doi.org/10.1007/978-3-540-30214-8_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23357-2

  • Online ISBN: 978-3-540-30214-8

  • eBook Packages: Springer Book Archive
