Synonyms
Definition
Clustering
Clustering is the assignment of objects to groups of similar objects (clusters). The objects are typically described as vectors of features (also called attributes): with n attributes, an object x is described as a vector (x_1, ..., x_n). Attributes can be numerical (scalar) or categorical. The assignment can be hard, where each object belongs to exactly one cluster, or fuzzy, where an object can belong to several clusters, each with some probability. The clusters can overlap, though typically they are disjoint. Fundamental to the clustering process is the use of a distance measure.
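The difference between hard and fuzzy assignment can be illustrated with a minimal sketch. It assumes that clusters are represented by center points, that distance is Euclidean, and that fuzzy weights follow a fuzzy c-means style formula with fuzzifier m; none of these choices comes from the entry itself, and the names hard_assign and fuzzy_assign are hypothetical.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def hard_assign(x, centers):
    """Hard assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: euclidean(x, centers[k]))

def fuzzy_assign(x, centers, m=2.0):
    """Fuzzy assignment: one weight per cluster, weights sum to 1.

    Assumption: fuzzy c-means style weighting with fuzzifier m > 1.
    """
    d = [euclidean(x, c) for c in centers]
    if any(di == 0 for di in d):
        # x coincides with one or more centers: split the weight among them.
        hits = [1.0 if di == 0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [v / total for v in inv]

centers = [(0.0, 0.0), (4.0, 0.0)]   # illustrative cluster centers
x = (1.0, 0.0)                       # object to assign

hard = hard_assign(x, centers)       # one cluster index
fuzzy = fuzzy_assign(x, centers)     # one weight per cluster
print(hard, fuzzy)
```

Under hard assignment the object lands only in the nearer cluster; under fuzzy assignment it keeps a small weight in the farther one as well.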
Distance Measure
In the clustering setting, a distance (or, equivalently, a similarity) measure is a function that quantifies how dissimilar (or how similar) two objects are.
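As a hedged illustration, the sketch below implements a few commonly used measures: Euclidean and Manhattan distance for numerical vectors, and a simple mismatch fraction for categorical vectors. The specific measures, the function names, and the 1 / (1 + d) conversion from distance to similarity are assumptions chosen for illustration; the entry does not prescribe them.

```python
import math

def euclidean(x, y):
    """Straight-line distance between numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute coordinate differences between numeric vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

def mismatch(x, y):
    """Fraction of positions at which two categorical vectors differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def similarity(d):
    """One possible way to turn a distance into a similarity in (0, 1]."""
    return 1.0 / (1.0 + d)

p, q = (0.0, 0.0), (3.0, 4.0)
d_euc = euclidean(p, q)
d_man = manhattan(p, q)
d_cat = mismatch(("red", "small", "round"), ("red", "large", "round"))
print(d_euc, d_man, d_cat, similarity(d_euc))
```

A small distance corresponds to a high similarity, which is why the two notions can be used interchangeably once a conversion is fixed.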
Key Points
The choice of a distance measure depends on the nature of the data and on the expected outcome of the clustering process. The most important consideration is the type of the...
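To see why the choice of measure matters, the following sketch shows two measures disagreeing about which object is closest to a query point. It illustrates only the first sentence above. The Minkowski family (p = 1 gives Manhattan, p = 2 gives Euclidean), the sample points, and the helper name nearest are assumptions for illustration.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between numeric vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def nearest(query, candidates, p):
    """Name of the candidate closest to query under the order-p distance."""
    return min(candidates, key=lambda name: minkowski(query, candidates[name], p))

query = (0.0, 0.0)
candidates = {"a": (3.0, 3.0), "b": (0.0, 5.0)}  # illustrative points

by_euclidean = nearest(query, candidates, p=2)
by_manhattan = nearest(query, candidates, p=1)
print(by_euclidean, by_manhattan)
```

Because the two measures rank the candidates differently, a clustering algorithm built on one may group these objects differently from an algorithm built on the other.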
Recommended Reading
Everitt B.S., Landau S., and Leese M. Cluster Analysis. Wiley, 2001.
Jain A.K., Murty M.N., and Flynn P.J. Data Clustering: A Review. ACM Comput. Surv., 31(3):264–323, 1999.
Theodoridis S. and Koutroumbas K. Pattern Recognition. Academic Press, 1999.
© 2009 Springer Science+Business Media, LLC
Cite this entry
Gunopulos, D. (2009). Cluster and Distance Measure. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_618
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering