Integer Linear Programming Models for Constrained Clustering

Mueller, Marianne; Kramer, Stefan

doi:10.1007/978-3-642-16184-1_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6332)

Included in the following conference series:

International Conference on Discovery Science

Abstract

We address the problem of building a clustering as a subset of a (possibly large) set of candidate clusters under user-defined constraints. In contrast to most approaches to constrained clustering, we do not constrain the way observations can be grouped into clusters, but the way candidate clusters can be combined into suitable clusterings. The constraints may concern the type of clustering (e.g., complete clusterings, overlapping or encompassing clusters) and the composition of clusterings (e.g., certain clusters excluding others). In the paper, we show that these constraints can be translated into integer linear programs, which can be solved by standard optimization packages. Our experiments with benchmark and real-world data investigate the quality of the clusterings and the running times depending on a variety of parameters.
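To make the idea of selecting a clustering from candidate clusters concrete, here is a minimal sketch of one possible 0/1 model. It is not the formulation from the paper, which is not reproduced on this page. It assumes a binary variable x_j per candidate cluster, an integer quality score w_j per candidate, and a coverage constraint per observation. The function name `select_clustering`, the parameters `max_overlap` and `complete`, and all data are hypothetical. The model is solved by exhaustive enumeration for illustration only; in practice a standard ILP solver would replace the loop.

```python
from itertools import product


def select_clustering(candidates, weights, n_objects, max_overlap=1, complete=True):
    """Pick x in {0,1}^m maximizing sum_j w_j * x_j under coverage constraints.

    candidates: list of sets of observation indices (the candidate clusters)
    weights:    hypothetical quality score per candidate cluster
    complete:   if True, every observation must be covered at least once
    max_overlap: upper bound on how many chosen clusters may share an observation
    """
    best_value, best_x = None, None
    # Enumerate every 0/1 assignment (illustration only; exponential in m).
    for x in product((0, 1), repeat=len(candidates)):
        # cover[i] = number of selected clusters containing observation i
        cover = [
            sum(x[j] for j, cluster in enumerate(candidates) if i in cluster)
            for i in range(n_objects)
        ]
        if complete and min(cover) < 1:
            continue  # violates the completeness constraint
        if max(cover) > max_overlap:
            continue  # violates the overlap constraint
        value = sum(w * xj for w, xj in zip(weights, x))
        if best_value is None or value > best_value:
            best_value, best_x = value, x
    return best_x, best_value


# Toy instance: four observations, five candidate clusters (made-up data).
candidates = [{0, 1}, {2, 3}, {0, 1, 2}, {3}, {1, 2}]
weights = [9, 8, 7, 3, 10]
x, value = select_clustering(candidates, weights, n_objects=4)
chosen = [sorted(candidates[j]) for j, xj in enumerate(x) if xj]
print(chosen, value)
```

With the defaults (complete, no overlap), the highest-scoring candidate {1, 2} is left out because no other candidates can cover observations 0 and 3 without overlapping it. Raising `max_overlap` or setting `complete=False` changes the type of clustering that is admissible, which mirrors the kinds of constraints the abstract describes.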

Author information

Authors and Affiliations

Institut für Informatik, Technische Universität München, 85748, Garching, Germany
Marianne Mueller & Stefan Kramer

Editor information

Editors and Affiliations

Department of Computer Science, University of Waikato, Hamilton, New Zealand
Bernhard Pfahringer
Department of Computer Science, The University of Waikato, Private Bag 3105, 3240, Hamilton, New Zealand
Geoff Holmes
School of Computer Science and Engineering, The University of New South Wales, 2052, Sydney, Australia
Achim Hoffmann

About this paper

Cite this paper

Mueller, M., Kramer, S. (2010). Integer Linear Programming Models for Constrained Clustering. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_12

DOI: https://doi.org/10.1007/978-3-642-16184-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16183-4
Online ISBN: 978-3-642-16184-1
eBook Packages: Computer Science, Computer Science (R0)
