Minimum-Width Drawings of Phylogenetic Trees

  • Conference paper
Combinatorial Optimization and Applications (COCOA 2019)

Abstract

We show that finding a minimum-width orthogonal upward drawing of a phylogenetic tree is NP-hard for binary trees with unconstrained combinatorial order and provide a linear-time algorithm for ordered trees. We also study several heuristic algorithms for the unconstrained case and show their effectiveness through experimentation.

Notes

  1. Alternatively, an upward node-link tree with 1-bend edges and fixed node height.

  2. W.l.o.g., each node is embedded below its parent; other orientations, such as drawing nodes above their parents, are equivalent to this one via rotation.

  3. Degenerate cases to consider include cases when a variable contributes multiple literals to a single clause. We can safely ignore cases when all three identical literals are present (which is not satisfiable) and when positive and negated literals of the same variable are present (since the clause is always satisfied). When a literal is repeated exactly twice, we handle it as a clause of only the two distinct literals.
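The case analysis in note 3 can be sketched as a small preprocessing step. This is only an illustration under assumed conventions: the integer encoding of literals (`+i` for \(x_i\), `-i` for its negation) and the function name `normalize_clause` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: +i stands for x_i, -i for the negation of x_i.
Literal = int


def normalize_clause(clause: Tuple[Literal, Literal, Literal]) -> Optional[List[Literal]]:
    """Return the distinct literals to keep, or None if the clause can be dropped.

    Raises ValueError when all three literals are identical, the case the
    note describes as not satisfiable.
    """
    distinct = sorted(set(clause), key=abs)
    if len(distinct) == 1:
        raise ValueError("three identical literals: clause is not satisfiable")
    if any(-lit in distinct for lit in distinct):
        # Both x and its negation appear: the clause is always satisfied.
        return None
    # Either three distinct literals, or one literal repeated exactly twice,
    # in which case only the two distinct literals remain.
    return distinct
```

For example, `normalize_clause((1, 1, 2))` keeps only the two distinct literals, while `normalize_clause((1, -1, 3))` signals that the clause can be skipped.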

Author information

Corresponding author

Correspondence to Juan Jose Besa.

A Additional Proofs

A.1 Alignment Nodes

Alignment nodes only need to be able to fully realign satisfied clauses; therefore, among the three structures, at least one must have width three and at least one must have width one. Consequently, if we consider the periodicity of the column at which the edge drops, the maximum possible phase difference between these periods is the one shown in Fig. 10. Considering this order as the extreme case is sufficient because, once both satisfied literal structures have been used (after \(x_3\) in \(c_1\)), the only remaining widths are two (which maintains the phase difference) or one (which reduces it). The same argument applies once both unsatisfied literal structures have been used, after \(x_2\).

Fig. 10. Set of consecutive clauses, \(c_1=x_2\vee x_3 \vee x_6\) and \(c_2=x_1\vee \overline{x_2} \vee x_6\), requiring the largest realignment, with satisfying assignment \(x=\{\text{false},\text{true},\text{true},\ldots\}\). (Color figure online)

Without alignment nodes it would be impossible to connect \(x_2\) and \(x_3\) to the next clause, but adding the two rows of alignment nodes (shown in blue) between clauses enables them to remain connected. This allows each clause to remain tight and to assume a width of at most \(2n+1\) whenever it is satisfied, regardless of the previous clause.

A.2 Pyramidal Structure

Recall Lemma 1: At minimal width, the base and bridge can only assume a pyramidal embedding. We define a pyramidal embedding as the embedding of the base in which nodes closer to the root lie closer to the center and nodes further from the root approach the outer sides, as shown in Fig. 3b.

Proof

We define the filled area of a drawing as the sum of the space used by each node (equal to its width), and the space used by each edge (equal to its length). The length of the edges is fixed, but we can change the filled area by changing the layout to minimize the width of the non-leaf nodes. When the base is in the pyramidal embedding, it fills the least possible area, since every non-leaf node must have width equal to its number of children. We now show that this is the only configuration with minimal area.

We begin from the parents of the base gadget, which belong to a copy of \(w_2\) (Fig. 3a). The two incoming edges from this structure must be next to one another, with no gap between them. The two nodes at the top level of the base each then have both a leaf and a large subtree attached.

The width of each of these top-level nodes must be two, and the only way to achieve this is that each node’s leaf must lie on the inside and its subtree on the outside. Similarly, for each subsequent node along the path, the same argument shows that its leaf must lie on the inside. This proves by induction that the base needs to be in a pyramidal embedding. The bridge then must fit against the base. The bridge nodes with the lowest leaves must be on the outside, with the next lowest leaves next to them, and so on by induction back to the center. This shows that the pyramidal embedding is the unique embedding that minimizes the filled area. Since the levels containing the base and bridge nodes are completely packed with no gaps, this also implies that the pyramidal embedding is the unique embedding that minimizes the width of these levels. \(\blacksquare \)

A.3 Constraint Graph is a DAG

Recall Lemma 4: The Constraint Graph D of the fixed order n-node phylogenetic tree \(T = (V,E)\) is a DAG with \(3n-1\) vertices and O(n) edges, where \(n = |V|\).

Proof

Our objects are the left and right sides of each vertex in T, and the edges in T. This gives us two vertices in D for each vertex in T, and one for each edge. Since T is a tree, it must have \(n - 1\) edges, so D has \(3n - 1\) vertices. If two objects are horizontally visible, then there is a segment between them that crosses only those two objects. We will use these segments to build a planar embedding of D, which will imply that D has O(n) edges.

Let the collection of segments connecting our objects be S. We first construct a larger planar graph \(D'\), in which the vertices are the endpoints of S. The edges of \(D'\) include all of the segments in S. We also add edges connecting each vertex to the vertices immediately above and below it that represent the same object. By the definition of horizontally visible, each segment in S can only intersect two objects, so none of the segments in S can intersect. The additional edges also cannot intersect, since they are ordered by the height of the vertices. Therefore, \(D'\) is planar.

We then contract all of the edges that connect two vertices in \(D'\) corresponding to the same object. This produces the DAG D. Since D is a contraction of a planar graph, it must also be planar, and a simple planar graph on \(3n-1\) vertices has O(n) edges. \(\blacksquare \)
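One common way to exploit a constraint graph with the properties in Lemma 4 is longest-path compaction, sketched below. This is a sketch under assumptions that the appendix does not state: an edge `(u, v, gap)` is taken to mean that object `u` must lie at least `gap` columns to the left of `v`, and the function name `min_x_coordinates` and the toy input are hypothetical. It is not claimed to be the algorithm from Theorem 2.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable, int]  # (left object, right object, minimum gap)


def min_x_coordinates(nodes: Iterable[Hashable], edges: Iterable[Edge]) -> Dict[Hashable, int]:
    """Smallest x-coordinate of each object that satisfies every edge constraint.

    Processes the DAG in topological order (Kahn's algorithm), so the running
    time is linear in the number of vertices plus edges.
    """
    nodes = list(nodes)
    succ: Dict[Hashable, List[Tuple[Hashable, int]]] = {v: [] for v in nodes}
    indeg: Dict[Hashable, int] = {v: 0 for v in nodes}
    for u, v, gap in edges:
        succ[u].append((v, gap))
        indeg[v] += 1

    x = {v: 0 for v in nodes}  # objects with no constraint start at column 0
    queue = deque(v for v in nodes if indeg[v] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, gap in succ[u]:
            # Longest-path relaxation: v must clear every predecessor.
            x[v] = max(x[v], x[u] + gap)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if visited != len(nodes):
        raise ValueError("constraint graph contains a cycle")
    return x
```

With O(n) vertices and O(n) edges, as in Lemma 4, a pass of this kind takes O(n) time; the width of the resulting layout is determined by the largest coordinate it returns.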

A.4 Approximation Guarantee for Minimum Area Heuristic

We first describe the running time of the heuristic. For each ordering of the children, the minimum-width drawing is computed using the algorithm from Theorem 2, and the bounding polygon is computed by traversing the tree once to find the extreme branches; both steps run in O(n) time. We repeat this O(1) times per vertex, for a total running time of \(O(n^2)\) for bounded-degree trees.

Theorem 4

The minimum area heuristic has an approximation ratio of \(\varOmega (\sqrt{n})\).

Proof

Recall the structures \(w_k\) defined in Theorem 3, and further constrain each one to be a complete binary tree with all k of its leaves in horizontal alignment. Recall also the subtrees used in Theorem 3; the subtrees used in this tree instead increase the size of their \(w_k\) by 3 each time (with the exception of the first two, which have the same \(w_k\)). Furthermore, the nodes of each subtree end immediately before the first node of the subtree two subtrees away, and the latter subtree also has a node aligned with the former's \(w_k\) leaves. The leaves of \(w_k\) are horizontally aligned with the top node of the next subtree.

Fig. 11. Tree structures causing worst-case performance for the minimum area heuristic.

Using these definitions, Fig. 11 demonstrates a tree structure in which a minimum-area embedding of the two subtrees in yellow makes it impossible for the entire drawing to attain minimum width and minimum area. The heuristic achieves the right embedding for the subtree but fails to choose the right embedding for the two sibling subtrees. Although the optimal embedding uses a larger area for the combination of both siblings with \(w_k\), it occupies an almost rectangular space, so adding the next subtree increases the area (and width) only slightly.

Identify each subtree by the size of the \(w_k\) structure inside it, and consider the tree with 2k/3 subtrees \(w_k, w_{k+3}, \ldots, w_{3k}\). This structure has an optimal width of \(3k+6\). The minimum area heuristic instead places two subtrees with their \(w_k\) on opposite sides and the \(w_{k+3}\) overlapping the bottom-most \(w_k\), with every subsequent pair of subtrees overlapping in the same way. The total width achieved by the minimum area heuristic is therefore \(\sum_{i=0}^{2k/6} (k+6i+5)=2k^2/3+11k/3+5\). The total number of nodes is \(n=\sum_{i=0}^{2k/3}(7+2(k-2i)+1)+6+k =4k^2/9+7k+14\), which we can use to express k in terms of n. The leading term gives \(n\approx 4k^2/9\), so \(k\approx \tfrac{3}{2}\sqrt{n}\), and therefore the approximation ratio achieved by the heuristic for this tree is \(\frac{2k^2/3+11k/3+5}{3k+6}\approx \frac{2k}{9}=\varOmega (\sqrt{n})\). \(\blacksquare \)
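The two closed forms in this proof can be checked numerically. The sketch below evaluates both sums exactly as written above and compares them with the stated polynomials, restricting k to multiples of 3 so that the summation limits are integers. The function names are illustrative only.

```python
from fractions import Fraction


def heuristic_width(k: int) -> int:
    """Width of the heuristic's drawing: sum over i = 0..2k/6 of (k + 6i + 5)."""
    return sum(k + 6 * i + 5 for i in range(k // 3 + 1))


def heuristic_width_closed(k: int) -> Fraction:
    """Closed form 2k^2/3 + 11k/3 + 5 from the proof."""
    return Fraction(2 * k * k, 3) + Fraction(11 * k, 3) + 5


def node_count(k: int) -> int:
    """Number of nodes: sum over i = 0..2k/3 of (7 + 2(k - 2i) + 1), plus 6 + k."""
    return sum(7 + 2 * (k - 2 * i) + 1 for i in range(2 * k // 3 + 1)) + 6 + k


def node_count_closed(k: int) -> Fraction:
    """Closed form 4k^2/9 + 7k + 14 from the proof."""
    return Fraction(4 * k * k, 9) + 7 * k + 14


def approximation_ratio(k: int) -> Fraction:
    """Heuristic width divided by the optimal width 3k + 6."""
    return Fraction(heuristic_width(k), 3 * k + 6)


# Both closed forms agree with the sums for every multiple of 3 tested.
for k in range(3, 301, 3):
    assert heuristic_width(k) == heuristic_width_closed(k)
    assert node_count(k) == node_count_closed(k)
```

Evaluating `approximation_ratio(k)` against `node_count(k) ** 0.5` for increasing k shows the ratio growing in proportion to \(\sqrt{n}\), consistent with the bound in Theorem 4.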

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Besa, J.J., Goodrich, M.T., Johnson, T., Osegueda, M.C. (2019). Minimum-Width Drawings of Phylogenetic Trees. In: Li, Y., Cardei, M., Huang, Y. (eds) Combinatorial Optimization and Applications. COCOA 2019. Lecture Notes in Computer Science(), vol 11949. Springer, Cham. https://doi.org/10.1007/978-3-030-36412-0_4

  • DOI: https://doi.org/10.1007/978-3-030-36412-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36411-3

  • Online ISBN: 978-3-030-36412-0

  • eBook Packages: Computer Science, Computer Science (R0)
