Tiling Databases

Geerts, Floris; Goethals, Bart; Mielikäinen, Taneli

doi:10.1007/978-3-540-30214-8_22

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3245)

Abstract

In this paper, we consider 0/1 databases and provide an alternative way of extracting knowledge from such databases using tiles. A tile is a region in the database consisting solely of ones. The interestingness of a tile is measured by the number of ones it consists of, i.e., its area. We present an efficient method for extracting all tiles with area at least a given threshold.
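The definition of a tile and its area can be illustrated with a small sketch. The abstract does not describe the paper's efficient extraction method, so the code below is not that method: it is a brute-force enumeration over column sets, exponential in the number of columns, and only suitable for toy inputs. The function names `tile_area` and `large_tiles`, and the representation of the database as a list of 0/1 rows, are assumptions made for illustration.

```python
from itertools import combinations


def tile_area(db, itemset):
    """Return (rows, area) of the largest tile spanned by the columns in `itemset`.

    The tile consists of every row that has a 1 in all chosen columns,
    so its area is (number of such rows) * (number of columns).
    """
    rows = [i for i, row in enumerate(db) if all(row[j] == 1 for j in itemset)]
    return rows, len(rows) * len(itemset)


def large_tiles(db, min_area):
    """Brute force (illustrative only): list all column sets whose tile area >= min_area."""
    n_cols = len(db[0]) if db else 0
    found = []
    for size in range(1, n_cols + 1):
        for itemset in combinations(range(n_cols), size):
            rows, area = tile_area(db, itemset)
            if area >= min_area:
                found.append((itemset, tuple(rows), area))
    return found


# Toy 0/1 database: 3 rows (transactions) by 3 columns (items).
db = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

for itemset, rows, area in large_tiles(db, min_area=4):
    print(f"columns {itemset}, rows {rows}, area {area}")
```

On this toy database, two tiles reach area 4: columns (0, 1) on rows (0, 1), and columns (1, 2) on rows (1, 2).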

A collection of tiles constitutes a tiling. We regard tilings that have a large area and consist of a small number of tiles as appealing summaries of the large database. We analyze the computational complexity of several algorithmic tasks related to finding such tilings. We develop an approximation algorithm for finding tilings which approximates the optimal solution within reasonable factors. We present a preliminary experimental evaluation on real data sets.
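To make the notion of a tiling and its area concrete, here is a minimal sketch of a greedy selection of at most k tiles from a given candidate list. The abstract does not say which approximation algorithm the paper uses, so this greedy rule (repeatedly take the tile that covers the most not-yet-covered ones, as in greedy set cover) is an assumption, not the authors' stated method. The names `tile_cells` and `greedy_tiling`, and the representation of a tile as a `(rows, columns)` pair, are hypothetical.

```python
def tile_cells(tile):
    """Expand a tile, given as (rows, columns), into the set of cells it covers."""
    rows, cols = tile
    return {(r, c) for r in rows for c in cols}


def greedy_tiling(candidates, k):
    """Pick up to k tiles, each time taking the one that adds the most uncovered cells.

    Returns the chosen tiles and the area of the tiling, i.e. the number of
    distinct cells covered (overlapping cells are counted once).
    """
    covered = set()
    chosen = []
    remaining = list(candidates)
    for _ in range(k):
        best, best_gain = None, 0
        for tile in remaining:
            gain = len(tile_cells(tile) - covered)
            if gain > best_gain:
                best, best_gain = tile, gain
        if best is None:  # no remaining tile adds any new cell
            break
        chosen.append(best)
        covered |= tile_cells(best)
        remaining.remove(best)
    return chosen, len(covered)


# Candidate tiles for a 3x3 toy database with rows 110, 111, 011.
candidates = [
    ((0, 1), (0, 1)),    # rows 0-1, columns 0-1: 4 ones
    ((1, 2), (1, 2)),    # rows 1-2, columns 1-2: 4 ones, shares cell (1, 1)
    ((0, 1, 2), (1,)),   # column 1 in every row: 3 ones
]

tiling, area = greedy_tiling(candidates, k=2)
print(tiling, area)
```

With k = 2, the sketch selects the two area-4 tiles. They overlap in one cell, so the tiling has area 7, which equals the total number of ones in the toy database.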

Author information

Authors and Affiliations

Laboratory for Foundations of Computer Science, School of Informatics, University of Edinburgh
Floris Geerts
HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
Bart Goethals & Taneli Mielikäinen

Editor information

Editors and Affiliations

Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Nishi, 819-0395, Fukuoka, Japan
Einoshin Suzuki
Kyushu University, 6–10–1 Hakozaki Higashi-ku, 812–8581, Fukuoka, Japan
Setsuo Arikawa

About this paper

Cite this paper

Geerts, F., Goethals, B., Mielikäinen, T. (2004). Tiling Databases. In: Suzuki, E., Arikawa, S. (eds) Discovery Science. DS 2004. Lecture Notes in Computer Science, vol 3245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30214-8_22

DOI: https://doi.org/10.1007/978-3-540-30214-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23357-2
Online ISBN: 978-3-540-30214-8
eBook Packages: Springer Book Archive
