Nomclust 2.0: an R package for hierarchical clustering of objects characterized by nominal variables

  • Original paper
Computational Statistics

Abstract

In this paper, we present the second generation of the nomclust R package, which we developed for the hierarchical clustering of data containing nominal variables (nominal data). The package covers the complete hierarchical clustering process, from the calculation of the dissimilarity matrix, through the choice of a clustering method, to the evaluation of the final clusters. Throughout the clustering process, it uses similarity measures, clustering methods, and evaluation criteria developed solely for nominal data, which makes the package unique. The first part of the paper describes the theoretical background of the methods used in the package. The second part demonstrates the functionality of the package in several examples. The second generation of the package has been completely rewritten to fit the workflow of R users more naturally. It includes new similarity measures and evaluation criteria. We also added several graphical outputs and support for S3 generic functions. Finally, due to code optimizations, the time needed to compute the dissimilarity matrix was substantially reduced.
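The first two steps of the process described above, dissimilarity matrix calculation and hierarchical clustering, can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it is not the package's code or API: the simple matching coefficient and average linkage are assumed here as representative choices, and the function names and the toy data are hypothetical. Cluster evaluation is left out.

```python
from itertools import combinations

def simple_matching(a, b):
    """Share of variables on which two objects disagree (0 = identical)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def dissimilarity_matrix(rows):
    """Symmetric matrix of pairwise simple matching dissimilarities."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = simple_matching(rows[i], rows[j])
    return d

def average_linkage(d, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(d))]

    def dist(c1, c2):
        # Average dissimilarity over all pairs drawn from the two clusters.
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical toy data: four objects described by three nominal variables.
rows = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]
d = dissimilarity_matrix(rows)
clusters = average_linkage(d, 2)
print(clusters)
```

Simple matching treats every mismatch equally; measures designed for nominal data typically refine this by weighting matches or mismatches using category frequencies or the number of categories.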

Notes

  1. The datasets were generated with four different numbers of variables (four, six, eight, ten) and three ranges of the number of categories (2–4, 2–6, 6–10); the number of cases varied from 300 to 700. Each dataset contained four clusters with an intermediate between-cluster distance. Each combination was replicated five times.

Acknowledgements

This paper was supported by the Prague University of Economics and Business under grant IGA F4/44/2018.

Author information

Corresponding author

Correspondence to Zdenek Sulc.

About this article

Cite this article

Sulc, Z., Cibulkova, J. & Rezankova, H. Nomclust 2.0: an R package for hierarchical clustering of objects characterized by nominal variables. Comput Stat 37, 2161–2184 (2022). https://doi.org/10.1007/s00180-022-01209-4

  • DOI: https://doi.org/10.1007/s00180-022-01209-4
