Abstract
The minimum quartet tree cost (MQTC) problem is a graph combinatorial optimization problem where, given a set of \(n \ge 4\) data objects and their pairwise costs (or distances), one wants to construct an optimal tree from the \(3 \cdot \binom{n}{4}\) quartet topologies on the n objects, where optimality means that the sum of the costs of the embedded (or consistent) quartet topologies is minimal. The MQTC problem is the foundation of the quartet method of hierarchical clustering, a novel hierarchical clustering method for non tree-like (non-phylogeny) data in various domains, or for heterogeneous data across domains. The MQTC problem is NP-complete, and some heuristics have already been proposed in the literature. The aim of this paper is to present a first exact solution approach for the MQTC problem. Because of the high complexity of the problem, the algorithm obtains exact solutions only for relatively small instances, but it can be used as a benchmark for validating the performance of any heuristic proposed for the MQTC problem.
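The count of \(3 \cdot \binom{n}{4}\) quartet topologies can be checked with a minimal Python sketch. It assumes the convention commonly used in the quartet method literature that the cost of a topology uv|wx is d(u, v) + d(w, x); the abstract itself does not state the cost function. The function names and the distance matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb


def quartet_topologies(labels):
    """Yield the three topologies ((u, v), (w, x)) of every 4-subset of labels."""
    for a, b, c, d in combinations(labels, 4):
        yield (a, b), (c, d)
        yield (a, c), (b, d)
        yield (a, d), (b, c)


def topology_cost(dist, topology):
    """Cost of uv|wx, assumed here to be d(u, v) + d(w, x)."""
    (u, v), (w, x) = topology
    return dist[u][v] + dist[w][x]


# Hypothetical symmetric distance matrix for n = 5 objects.
dist = [
    [0, 1, 3, 4, 4],
    [1, 0, 3, 4, 4],
    [3, 3, 0, 3, 3],
    [4, 4, 3, 0, 1],
    [4, 4, 3, 1, 0],
]

topologies = list(quartet_topologies(range(len(dist))))
costs = {t: topology_cost(dist, t) for t in topologies}
```

For n = 5 this enumerates 3 · 5 = 15 topologies. The count grows as Θ(n⁴), which is one reason exhaustive approaches are limited to small instances.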
Notes
Note that by construction there is always one and only one path connecting two leaves of a full unrooted binary tree.
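The unique-path property in this note is what makes the embedded quartet topologies well defined. The sketch below assumes the usual definition from the quartet method literature: a topology uv|wx is consistent with a tree if the path from u to v shares no node with the path from w to x. The example tree, the distance matrix, and all function names are hypothetical. This is an illustration of the cost being minimized, not the exact algorithm of the paper.

```python
from itertools import combinations


def path(tree, start, goal):
    """Return the unique path between two nodes of a tree (adjacency dict)."""
    stack = [(start, None, [start])]
    while stack:
        node, parent, trail = stack.pop()
        if node == goal:
            return trail
        for nxt in tree[node]:
            if nxt != parent:  # a tree has no cycles, so skipping the parent suffices
                stack.append((nxt, node, trail + [nxt]))
    raise ValueError("nodes are not connected")


def embedded_topology(tree, quartet):
    """Return the topology uv|wx whose two leaf-to-leaf paths are node-disjoint."""
    a, b, c, d = quartet
    for (u, v), (w, x) in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
        if not set(path(tree, u, v)) & set(path(tree, w, x)):
            return (u, v), (w, x)
    raise ValueError("no consistent topology; is the tree full and binary?")


def tree_cost(tree, dist, leaves):
    """Sum the assumed cost d(u, v) + d(w, x) over all embedded topologies."""
    total = 0
    for quartet in combinations(leaves, 4):
        (u, v), (w, x) = embedded_topology(tree, quartet)
        total += dist[u][v] + dist[w][x]
    return total


# Hypothetical full unrooted binary tree: leaves 0..4, internal nodes "a", "b", "c".
tree = {
    0: ["a"], 1: ["a"], 2: ["b"], 3: ["c"], 4: ["c"],
    "a": [0, 1, "b"], "b": ["a", 2, "c"], "c": ["b", 3, 4],
}
dist = [
    [0, 1, 3, 4, 4],
    [1, 0, 3, 4, 4],
    [3, 3, 0, 3, 3],
    [4, 4, 3, 0, 1],
    [4, 4, 3, 1, 0],
]
cost = tree_cost(tree, dist, leaves=range(5))
```

In this example the tree pairs leaves 0 and 1 and leaves 3 and 4, matching the small distances in the matrix, so every embedded topology is the cheapest of its three alternatives.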
Acknowledgements
The author Dr. Sergio Consoli wishes to dedicate this work, with deepest respect, to the memory of Professor Kenneth Darby-Dowman, a great scientist, an excellent manager, the best supervisor, a wonderful person, a real friend. He is also particularly grateful to Eng. Lucia Cantone, Prof. Fabrizio Consoli, Prof. Pierpaolo Vivo, Prof. Diego Reforgiato Recupero, and Eng. Niccolo’ Nobile for their helpful guidance and support, inspiring discussions, and valuable advice during the development of this research work.
Cite this article
Consoli, S., Korst, J., Geleijnse, G. et al. An exact algorithm for the minimum quartet tree cost problem. 4OR-Q J Oper Res 17, 401–425 (2019). https://doi.org/10.1007/s10288-018-0394-2
DOI: https://doi.org/10.1007/s10288-018-0394-2
Keywords
- Combinatorial optimization
- Quartet trees
- Minimum quartet tree cost
- Exact solution algorithms
- Cluster analysis
- Graphs