Recursive partitioning clustering tree algorithm

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Clustering analysis elicits the natural groupings of a dataset without requiring information about the sample classes and has been widely used in various fields. Although numerous clustering algorithms have been proposed and shown to perform reasonably well, no consensus exists about which one performs best in real situations. In this study, we propose a nonparametric clustering method based on the recursive binary partitioning used in classification and regression tree models. The proposed clustering algorithm has two key advantages: (1) users do not have to specify any parameters before running it; and (2) the final clustering result is represented by a set of if–then rules, which facilitates analysis of the clustering results. Experiments with simulated and real datasets demonstrate the effectiveness and usefulness of the proposed algorithm.
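To make the idea of rule-based clusters from recursive binary partitioning concrete, here is a minimal Python sketch. It is not the algorithm proposed in the article: the abstract does not state the splitting criterion or the stopping rule, so the sketch assumes a simple "widest gap" split on a single feature and a hypothetical user-set threshold `min_gap` (the proposed method, by contrast, is described as needing no user-specified parameters). The names `widest_gap` and `partition` are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def widest_gap(points: Sequence[Point]) -> Tuple[float, int, float]:
    """Return (gap, feature index, threshold) for the largest gap between
    consecutive sorted values on any single feature (assumed criterion)."""
    best = (0.0, -1, 0.0)
    for j in range(len(points[0])):
        vals = sorted(p[j] for p in points)
        for a, b in zip(vals, vals[1:]):
            if b - a > best[0]:
                best = (b - a, j, (a + b) / 2)  # split at the gap midpoint
    return best


def partition(points: List[Point], names: Sequence[str], min_gap: float,
              conditions: Tuple[str, ...] = ()) -> List[Tuple[str, List[Point]]]:
    """Recursively apply binary splits. Each leaf is one cluster, described
    by the conjunction of the split conditions on its path (an if-then rule)."""
    gap, j, t = widest_gap(points)
    if j < 0 or gap < min_gap:  # no usable split: this node is a cluster
        return [(" and ".join(conditions) or "always", points)]
    left = [p for p in points if p[j] <= t]
    right = [p for p in points if p[j] > t]
    return (partition(left, names, min_gap, conditions + (f"{names[j]} <= {t:g}",))
            + partition(right, names, min_gap, conditions + (f"{names[j]} > {t:g}",)))


# Toy data, invented for illustration only.
data = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
        (5.0, 1.0), (5.1, 1.2),
        (5.0, 8.0), (5.2, 8.1)]
clusters = partition(data, ["x", "y"], min_gap=2.0)
for rule, members in clusters:
    print(f"if {rule} then cluster of {len(members)} points")
```

On this toy data the sketch returns three leaves, each labelled by a readable rule over `x` and `y`, which is the kind of interpretability the abstract attributes to tree-based clustering.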

Acknowledgments

The authors would like to thank the editor and the reviewers for their useful comments and suggestions, which were greatly helpful in improving the quality of the paper. This research was supported by Brain Korea PLUS, Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning (2013007724), and the Ministry of Knowledge Economy in Korea under the IT R&D Infrastructure Program supervised by the National IT Industry Promotion Agency (NIPA) [NIPA-2011-(B1110-1101-0002)].

Author information

Corresponding author

Correspondence to Seoung Bum Kim.

About this article

Cite this article

Kang, J.H., Park, C.H. & Kim, S.B. Recursive partitioning clustering tree algorithm. Pattern Anal Applic 19, 355–367 (2016). https://doi.org/10.1007/s10044-014-0399-1

  • DOI: https://doi.org/10.1007/s10044-014-0399-1