Combinatorial optimisation and hierarchical classifications

Barthélemy, J.-P.; Brucker, F.; Osswald, C.

doi:10.1007/s10479-007-0174-4

Published: 15 May 2007

Annals of Operations Research, Volume 153, pages 179–214 (2007)

J.-P. Barthélemy¹,²,
F. Brucker² &
C. Osswald³

Abstract

This paper is devoted to selected topics relating combinatorial optimization and hierarchical classification. It is oriented toward extensions of the standard classification schemes (the hierarchies): pyramids, quasi-hierarchies, circular clustering, rigid clustering and others. Bijection theorems between these models and dissimilarity models make it possible to state some clustering problems as optimization problems. Within the galaxy of optimization, we discuss in particular: NP-completeness results and the search for polynomial instances; problems solved in polynomial time (e.g. subdominant theory); and the design, analysis and applications of algorithms. In contrast with this orientation toward “new” clustering problems, the last part discusses some standard algorithmic approaches.
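
As an illustration of the polynomial-time case mentioned above, the following is a minimal sketch of the classical subdominant ultrametric, i.e. the largest ultrametric lying below a given dissimilarity. It is not taken from the paper: the function names and the small example matrix are assumptions made for illustration, and the sketch uses a simple O(n³) minimax-path closure rather than any algorithm the authors may describe.

```python
def subdominant_ultrametric(d):
    """Return the largest ultrametric u with u <= d.

    d is a symmetric dissimilarity matrix (list of lists) with zero diagonal.
    u[i][j] is the minimum, over all paths from i to j, of the largest
    dissimilarity met along the path (a Floyd-Warshall style closure
    using min and max in place of min and +).
    """
    n = len(d)
    u = [row[:] for row in d]  # work on a copy, leave d untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def is_ultrametric(u):
    """Check the ultrametric inequality u(i,j) <= max(u(i,k), u(k,j))."""
    n = len(u)
    return all(
        u[i][j] <= max(u[i][k], u[k][j])
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Hypothetical 3-point dissimilarity, chosen only for illustration.
d = [
    [0, 2, 5],
    [2, 0, 3],
    [5, 3, 0],
]
u = subdominant_ultrametric(d)
print(u)  # [[0, 2, 3], [2, 0, 3], [3, 3, 0]]
```

In this example only the pair (0, 2) changes: its dissimilarity 5 is lowered to 3, the largest step on the path 0, 1, 2. The resulting matrix corresponds to the hierarchy produced by single-linkage clustering on the same data.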

Author information

Authors and Affiliations

CAMS, EHESS, UMR CNRS 8557, Paris, France
J.-P. Barthélemy
ENST Bretagne, Dept. LUSSI and TAMCIC, FRE CNRS 2658, Paris, France
J.-P. Barthélemy & F. Brucker
ENSIETA, Laboratoire E³I², Paris, France
C. Osswald

Corresponding author

Correspondence to J.-P. Barthélemy.

Additional information

This article appeared in 4OR 2, 179–219, 2004.

About this article

Cite this article

Barthélemy, JP., Brucker, F. & Osswald, C. Combinatorial optimisation and hierarchical classifications. Ann Oper Res 153, 179–214 (2007). https://doi.org/10.1007/s10479-007-0174-4

Published: 15 May 2007
Issue Date: September 2007
DOI: https://doi.org/10.1007/s10479-007-0174-4
