Finding the number of clusters in ordered dissimilarities

  • Focus
Soft Computing

Abstract

As humans, we have innate faculties that allow us to efficiently segment groups of objects. Computers, to some degree, can be programmed with similar categorical capabilities, which stem from exploratory data analysis. Among the various forms of data reasoning, clustering provides insight into the structure and relationships of input samples situated in a number of distributions. To determine these relationships, many clustering methods rely on one or more human inputs, the most important being the number of distributions, c, to seek. This work investigates a technique for estimating the number of clusters from a general type of data called relational data. Several numerical examples are presented to illustrate the effectiveness of the proposed method.

Acknowledgments

This work was funded by the National Science Foundation under ITR grant number IIS-0428420. The authors would also like to thank the reviewers for their insightful comments that helped to improve the quality of this manuscript.

Author information

Corresponding author

Correspondence to Isaac J. Sledge.

About this article

Cite this article

Sledge, I.J., Havens, T.C., Huband, J.M. et al. Finding the number of clusters in ordered dissimilarities. Soft Comput 13, 1125–1142 (2009). https://doi.org/10.1007/s00500-009-0421-5

  • DOI: https://doi.org/10.1007/s00500-009-0421-5
