Discovery of Clusters from Proximity Data: An Approach Using Iterative Adjustment of Binary Classifications

Hirano, Shoji; Tsumoto, Shusaku

doi:10.1007/978-3-540-78733-4_15

Part of the book series: Studies in Computational Intelligence (SCI, volume 123)

Clustering is the task of forming groups of similar objects based on a predefined proximity (similarity/dissimilarity) measure and grouping criteria. Many approaches, for example agglomerative/divisive hierarchical clustering, k-means, and EM algorithms, have been proposed in the literature [1,2] and are widely used for exploratory analysis of real-world data. To find the partition of objects that maximizes both intra-cluster homogeneity and inter-cluster isolation, clustering methods often employ geometric measures such as the variance of distances. However, it becomes difficult to form appropriate clusters when only a proximity matrix is available for analysis and the raw attribute values of the data are unavailable or inaccessible, because without attribute-value information it is hard to compute global properties of groups such as centroids. Additionally, the choice of global coherence/isolation measures is limited if the proximity is defined as a subjective or relative measure, because such a measure may not satisfy the triangle inequality for every triplet of objects. Although conventional hierarchical clustering methods are known to handle relative or subjective measures, they have other problems: intermediate objects lying between large clusters can erode or expand the data space, and the results depend on the order in which objects are processed [1].
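The point about the triangle inequality can be made concrete with a small check on a proximity matrix. The following is only an illustrative sketch, not part of the chapter's method: the function name `triangle_violations`, the tolerance `tol`, and the example matrix `d` are all assumptions introduced here. It lists the triplets (i, j, k) for which the direct dissimilarity d(i, k) exceeds the detour d(i, j) + d(j, k).

```python
from itertools import permutations


def triangle_violations(d, tol=1e-12):
    """Return triplets (i, j, k), i < k, with d[i][k] > d[i][j] + d[j][k].

    `d` is assumed to be a square, symmetric dissimilarity matrix given as a
    list of lists. `tol` is a hypothetical tolerance for floating-point noise.
    """
    n = len(d)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if i < k and d[i][k] > d[i][j] + d[j][k] + tol
    ]


# Made-up subjective dissimilarities: objects 0 and 2 are each judged close
# to object 1, yet very far from each other.
d = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]

print(triangle_violations(d))  # → [(0, 1, 2)]
```

A non-empty result indicates that the proximity cannot be treated as a metric, which is the situation in which measures that rely on geometric properties of the data become hard to justify.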

References

[1] B. S. Everitt, S. Landau, and M. Leese (2001): Cluster Analysis, Fourth Edition. Arnold Publishers.
[2] P. Berkhin (2002): Survey of Clustering Data Mining Techniques. Accrue Software Research Paper. URL: http://www.accrue.com/products/researchpapers.html.
[3] Z. Pawlak (1991): Rough Sets, Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht.
[4] J. W. Grzymala-Busse and M. Noordeen (1988): “CRS – A Program for Clustering Based on Rough Set Theory,” Research report, Department of Computer Science, University of Kansas, TR-88-3, 13.
[5] J. Neyman and E. L. Scott (1958): “Statistical Approach to Problems of Cosmology,” Journal of the Royal Statistical Society, Series B, 20: 1–43.

Author information

Authors and Affiliations

Department of Medical Informatics, Shimane University, School of Medicine, 89-1 Enya-cho, Izumo, Shimane, 693-8501, Japan
Shoji Hirano & Shusaku Tsumoto

Editor information

Editors and Affiliations

Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Shuichi Iwata
Department of Systems Innovation School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Yukio Ohsawa
Department of Medical Informatics School of Medicine, Shimane University, Enya-cho Izumo City Shimane, 693-8501, Japan
Shusaku Tsumoto
Department of Information Engineering, Maebashi Institute of Technology, 460-1, Kamisadori-Cho, Maebashi-City, 371-0816, Japan
Ning Zhong (Director)
WICI/BJUT, China
Ning Zhong (Director)
Research Center on Data Technology and Knowledge Economy, Chinese Academy of Sciences, Beijing, 100080, PR China
Yong Shi (Director)
College of Information Science and Technology, University of Nebraska, Omaha, NE, 68182, USA
Yong Shi (Director)
Department of Philosophy, University of Pavia, Piazza Botta 6, 27100, Pavia, Italy
Lorenzo Magnani

About this chapter

Cite this chapter

Hirano, S., Tsumoto, S. (2008). Discovery of Clusters from Proximity Data: An Approach Using Iterative Adjustment of Binary Classifications. In: Iwata, S., Ohsawa, Y., Tsumoto, S., Zhong, N., Shi, Y., Magnani, L. (eds) Communications and Discoveries from Multidisciplinary Data. Studies in Computational Intelligence, vol 123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78733-4_15

DOI: https://doi.org/10.1007/978-3-540-78733-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78732-7
Online ISBN: 978-3-540-78733-4
eBook Packages: Engineering, Engineering (R0)
