Sampling Gene Adjacencies and Geodesic Points of Random Genomes

da Silva, Poly H.; Jamshidpey, Arash; Sankoff, David

doi:10.1007/978-3-031-58072-7_10

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 14616)

Included in the following conference series:

RECOMB International Workshop on Comparative Genomics

Abstract

The breakpoint distance employed in comparative genomics is not a geodesic distance, which makes it difficult to study genomes (i.e. permutations) that are intermediate between two given genomes G and $G'$. An intermediate genome, also called a geodesic point, is a genome whose sum of breakpoint distances to G and $G'$ is equal to the breakpoint distance of G and $G'$. To construct an intermediate genome M, it is necessary to find sets of gene adjacencies I and J selected from G and $G'$ whose union forms M. This means that the set of adjacencies of M is $I\cup J$. Any given set of adjacencies I selected from G may put some constraints on some adjacencies of $G'$ so that they cannot be used in J to construct M or if they can, they must be used in specific ways. For instance, a gene adjacency of $G'$ whose gene extremities are used in the “middle” of segments of I cannot be used to construct M. Based on these constraints, we classify the set of all adjacencies of $G'$ with respect to I into four distinct groups. For two unichromosomal random genomes of the same gene-content, namely $\xi _1$ and $\xi _2$, as the number of genes tends to infinity, we study the limiting behaviour of the frequencies of adjacencies of each type in $\xi _2$ with respect to a random or deterministic set of adjacencies selected from $\xi _1$. We use the limiting results to provide necessary conditions for the size and the shape of the set of adjacencies selected from the first genome for the purpose of constructing an intermediate genome between $\xi _1$ and $\xi _2$. These results can help to shed light on how to construct “accessible breakpoint medians” far from the input genomes (corners).
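The definition of an intermediate genome can be illustrated with a minimal Python sketch. It assumes genomes are unsigned linear permutations on the same genes and that the breakpoint distance counts the adjacencies of one genome that are absent from the other; the abstract does not spell out this formula, so it is an assumption here. The function names and the example genomes are hypothetical.

```python
def adjacencies(genome):
    """Return the set of unordered adjacencies {g_i, g_{i+1}} of a linear genome."""
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def breakpoint_distance(g, h):
    """Assumed definition: number of adjacencies of g that do not occur in h."""
    return len(adjacencies(g) - adjacencies(h))


def is_geodesic_point(m, g, h):
    """True if d(g, m) + d(m, h) equals d(g, h)."""
    return (breakpoint_distance(g, m) + breakpoint_distance(m, h)
            == breakpoint_distance(g, h))


g = (1, 2, 3, 4, 5, 6)
h = (1, 3, 5, 2, 4, 6)          # shares no adjacency with g
m = (1, 2, 3, 5, 4, 6)          # adjacencies 1-2, 2-3, 4-5 from g; 3-5, 4-6 from h
far = (6, 1, 2, 3, 4, 5)        # uses the adjacency 6-1, which is in neither genome

print(breakpoint_distance(g, h))     # distance between the two input genomes
print(is_geodesic_point(m, g, h))    # m is built only from adjacencies of g and h
print(is_geodesic_point(far, g, h))  # far is not intermediate
```

In this example every adjacency of `m` comes from `g` or `h`, which is the situation described above where the adjacency set of M is $I\cup J$.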

Author information

Authors and Affiliations

Department of Statistics, Columbia University, New York, NY, 10027, USA
Poly H. da Silva
Department of Mathematics, Columbia University, New York, NY, 10027, USA
Arash Jamshidpey
Irving Institute for Cancer Dynamics, Columbia University, New York, NY, 10027, USA
Poly H. da Silva & Arash Jamshidpey
Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, K1N 6N5, Canada
David Sankoff

Corresponding author

Correspondence to Arash Jamshidpey.

Editor information

Editors and Affiliations

CNRS, University of Montpellier, Montpellier, France
Celine Scornavacca
Center for Research and Advanced Studies, Irapuato, Mexico
Maribel Hernández-Rosales

Appendix A: Proofs

Here we provide the proofs of Lemma 1, Lemma 2, Theorem 1, Theorem 2, Theorem 3 and Lemma 3.

Proof of Lemma 1. Consider a segment set $I=\{s_1,...,s_k\}$, with k non-empty segments and m adjacencies that is contained in $x\in S_n$. Then $|\Vert \overline{I}_{x}\Vert -k|\le 1$, and therefore we represent the segments of $\overline{I}_{x}$ by $s'_1,...,s'_{k+1}$, where $s'_j$ is non-empty for $2\le j\le k$, and $s'_1$ and $s'_{k+1}$ may be empty. Note that $\sum _{i=1}^{k} |s_i|=m$ and $\sum _{j=1}^{k+1} |s'_j|=n-1-m$ with $|s_i|\ge 1$ for $1\le i\le k$ and $|s'_j|\ge 1$ for $2\le j\le k$. Hence, the number of solutions for these two equations is equal to:

$$\binom{m-k+(k-1)}{k-1}\binom{n-1-m-(k-1)+(k+1-1)}{k+1-1}=\binom{m-1}{k-1}\binom{n-m}{k}.$$

In other words, this is the number of ways to choose k segments of x containing m adjacencies in total. $\square $
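The count in Lemma 1 can be checked by brute force for small n. The sketch below is only an illustration: it takes x to be the identity permutation, encodes the adjacency $\{i,i+1\}$ by the integer i, and uses hypothetical function names.

```python
from itertools import combinations
from math import comb


def count_segment_sets(n, m, k):
    """Count the m-subsets of the n-1 adjacencies of 1..n that form exactly k segments."""
    total = 0
    for chosen in combinations(range(1, n), m):  # i stands for the adjacency {i, i+1}
        # A segment starts at every chosen adjacency whose left neighbour is not chosen.
        segments = sum(1 for i in chosen if i - 1 not in chosen)
        if segments == k:
            total += 1
    return total


def lemma1_count(n, m, k):
    """Closed form from Lemma 1."""
    return comb(m - 1, k - 1) * comb(n - m, k)


for n in range(2, 10):
    for m in range(1, n):
        for k in range(1, m + 1):
            assert count_segment_sets(n, m, k) == lemma1_count(n, m, k)
print("Lemma 1 count agrees with enumeration for n < 10")
```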

Proof of Lemma 2. As the segment set I has m adjacencies and k segments, each permutation containing I has $n-m-k$ external points with respect to I. Therefore, noting that segments have two directions, we have $2^{k}(k+(n-m-k))!$ permutations containing I. $\square $
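As a small sanity check of Lemma 2, one can enumerate all permutations for a concrete segment set. The segment set below (two segments, 1-2-3 and 5-6, in n = 7 genes) and the function name are hypothetical choices made for illustration.

```python
from itertools import permutations
from math import factorial


def count_permutations_containing(n, segment_set):
    """Count permutations of 1..n whose adjacency set contains every adjacency given."""
    required = {frozenset(a) for a in segment_set}
    count = 0
    for perm in permutations(range(1, n + 1)):
        adj = {frozenset(p) for p in zip(perm, perm[1:])}
        if required <= adj:
            count += 1
    return count


# I has m = 3 adjacencies and k = 2 segments, so n - m - k = 2 genes (4 and 7) are external.
I = [(1, 2), (2, 3), (5, 6)]
n, m, k = 7, 3, 2

brute = count_permutations_containing(n, I)
closed = 2 ** k * factorial(n - m)
print(brute, closed)
```

The two segments behave like single blocks with two orientations each, which is the source of the factor $2^k$.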

Proof of Theorem 1. As $\alpha _{m,k}$ is independent of $\mathcal {I}_{m,k}$ and $\mathcal {L}(\mathcal {I}_{m,k})=\mathcal {L}(\mathcal {I}_m\mid \Vert \mathcal {I}_m\Vert =k)$, we have

$$ \mathbbm {E}[\alpha _m\mid \Vert \mathcal {I}_m\Vert =k]=\mathbb {E}[\alpha _{m,k}]=\mathbb {E}[\alpha _{m,k}\mid \mathcal {I}_{m,k}=I]=\mathbb {E}[\alpha (\xi ,I)]. $$

So for the first part, we only need to compute $\mathbbm {E}[\alpha _m\mid \Vert \mathcal {I}_m\Vert =k]$. To this end, note that there are $n-m-k$ external points (genes), $m-k$ internal points, and 2k end points in any segment set with m adjacencies and k segments. For an adjacency of $\xi $ chosen at random, conditional on $\Vert \mathcal {I}_m\Vert =k$, the chances that it is a 2-free-end, a 1-free-end, or a trivial segment adjacency w.r.t. $\mathcal {I}_m$ are, respectively,

$$\frac{(n-m-k)(n-m-k-1)}{n(n-1)}, \ \ \frac{4k(n-m-k)}{n(n-1)}, \ \ \frac{2k(2k-1)}{n(n-1)},$$

while the chance to have a 0-free-end adjacency is given by

$$ \frac{2(m-k)(n-m+k)+(m-k)(m-k-1)}{n(n-1)}=\frac{(m-k)(2n-m+k-1)}{n(n-1)}. $$
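As a consistency check, the four chances sum to one. Writing $e=n-m-k$, $p=2k$ and $q=m-k$ for the numbers of external, end and internal points (shorthand introduced only for this remark), we have $e+p+q=n$ and $2n-m+k-1=2(e+p)+q-1$, so the numerators add up to

$$\begin{aligned} e(e-1)+2ep+p(p-1)+q\left( 2(e+p)+q-1\right) &= (e+p)(e+p-1)+2q(e+p)+q(q-1)\\ &= (e+p+q)(e+p+q-1)=n(n-1). \end{aligned}$$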

Now, for $i=1,...,n-1$, let $\hat{\alpha }_{m,i}$ be a random variable such that $\hat{\alpha }_{m,i}=1$ if the $i^{th}$ adjacency of $\xi $, i.e. $\{\xi _i,\xi _{i+1}\}$, is 2-free-end w.r.t. $\mathcal {I}_m$ and $\hat{\alpha }_{m,i}=0$ otherwise. Then, for every $i=1,...,n-1$, we have

$$\begin{aligned} {\displaystyle \mathbbm {P}(\hat{\alpha }_{m,i}=1\mid \Vert \mathcal {I}_m\Vert =k)= \frac{(n-m-k)(n-m-k-1)}{n(n-1)}}, \end{aligned}$$

implying that $\mathbbm {E}[\alpha _m \mid \Vert \mathcal {I}_m\Vert =k]$ is equal to

$$ \sum \limits _{i=1}^{n-1}\mathbbm {P}(\hat{\alpha }_{m,i}=1 \mid \Vert \mathcal {I}_m\Vert =k) = \frac{(n-m-k)(n-m-k-1)}{n}. $$

The other conditional expected values in the statement of the theorem are computed similarly. For the second part of the theorem, averaging over the possible values of the number of segments $\Vert \mathcal {I}_m\Vert $, we have

$$\begin{aligned} \mathbbm {E}[\alpha _m]= & {} \frac{1}{n}\mathbbm {E}\left[ (n-m-\Vert \mathcal {I}_m \Vert )(n-m-\Vert \mathcal {I}_m \Vert -1)\right] \\ = & {} \frac{(n-m)(n-m-1)}{n}+\frac{2m-2n+1}{n}\mathbbm {E}\Vert \mathcal {I}_m\Vert +\frac{1}{n}\mathbbm {E}\Vert \mathcal {I}_m\Vert ^2. \end{aligned}$$

Since $\Vert \mathcal {I}_m\Vert \sim H(n-1,n-m,m)$, its moments are given in (1). Therefore, after some simplification, we obtain

$$\begin{aligned} \mathbbm {E}[\alpha _m]=\frac{(n- m) (1 + m - n)^2 (n - m-2)}{(n-2) (n-1) n}. \end{aligned}$$

Similarly,

$$ \mathbbm {E}[\beta _m]=\frac{1}{n}\mathbbm {E}\left[ 4\Vert \mathcal {I}_m\Vert (n-m-\Vert \mathcal {I}_m\Vert )\right] =\frac{4m (n-m) (1 + m - n)^2}{(n-2) (n-1) n}, $$

$$ \mathbbm {E}[\gamma _m]=\frac{1}{n}\mathbbm {E}\left[ 2\Vert \mathcal {I}_m\Vert (2\Vert \mathcal {I}_m\Vert -1)\right] =\frac{2m (n-m) (2m (n-m) - n)}{(n-2) (n-1) n}, $$

and

$$\begin{aligned} \mathbbm {E}[\delta _m]= & {} \frac{1}{n}\mathbbm {E}\left[ (m-\Vert \mathcal {I}_m\Vert )(2n-m+\Vert \mathcal {I}_m\Vert -1)\right] \\ = & {} \frac{m(m-1)(2 n^2- 6 n- m^2+ 3m +2)}{(n-2) (n-1) n}. \end{aligned}$$

$\square $
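The averaging step in the proof of Theorem 1 can be reproduced with exact rational arithmetic. The sketch below assumes that $\mathcal {I}_m$ is a uniformly random set of m adjacencies, so that by Lemma 1 the number of segments equals k with probability $\binom{m-1}{k-1}\binom{n-m}{k}/\binom{n-1}{m}$; the function names are hypothetical. It averages the four conditional expectations over k, compares the result for $\alpha _m$ with the closed form above, and checks that the four expectations add up to the $n-1$ adjacencies of $\xi $.

```python
from fractions import Fraction
from math import comb


def segment_count_pmf(n, m):
    """P(number of segments = k), using the Lemma 1 count and a uniform choice (assumed)."""
    total = comb(n - 1, m)
    return {k: Fraction(comb(m - 1, k - 1) * comb(n - m, k), total)
            for k in range(1, m + 1)}


def conditional_means(n, m, k):
    """Conditional expected numbers of 2-free-end, 1-free-end, trivial, 0-free-end adjacencies."""
    ext, ends, internal = n - m - k, 2 * k, m - k
    alpha = Fraction(ext * (ext - 1), n)
    beta = Fraction(4 * k * ext, n)
    gamma = Fraction(ends * (ends - 1), n)
    delta = Fraction(internal * (2 * n - m + k - 1), n)
    return alpha, beta, gamma, delta


def unconditional_means(n, m):
    """Average the conditional means over the distribution of the number of segments."""
    sums = [Fraction(0)] * 4
    for k, p in segment_count_pmf(n, m).items():
        for j, value in enumerate(conditional_means(n, m, k)):
            sums[j] += p * value
    return tuple(sums)


def alpha_closed_form(n, m):
    """Closed form for E[alpha_m] stated in the proof."""
    return Fraction((n - m) * (1 + m - n) ** 2 * (n - m - 2), (n - 2) * (n - 1) * n)


for n in range(4, 13):
    for m in range(1, n):
        means = unconditional_means(n, m)
        assert means[0] == alpha_closed_form(n, m)
        assert sum(means) == n - 1
print(unconditional_means(5, 2))
```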

Proof of Theorem 2. There are two options for choosing two adjacencies of $\xi $. They are either consecutive, $\{\xi _i,\xi _{i+1}\},\{\xi _{i+1},\xi _{i+2}\}$, or nonconsecutive, $\{\xi _i,\xi _{i+1}\}$, $\{\xi _{j},\xi _{j+1}\}$ for $i+1< j$. If we select two consecutive adjacencies of $\xi $ at random, the chances that both are 2-free-end, both are 1-free-end, and both are trivial segment adjacencies are respectively given by

$$ \frac{(n-m-k)_{[3]}}{n_{[3]}}, \ \ \frac{2k(n-m-k)_{[2]}+(n-m-k)(2k)_{[2]}}{n_{[3]}}, \ \ \frac{(2k)_{[3]}}{n_{[3]}}, $$

while the chance that both are 0-free-end is

$$ \frac{(m-k)_{[3]}+3(n-m+k)(m-k)_{[2]}+(m-k)(n-m+k)_{[2]}}{n_{[3]}}. $$

Similarly, if we pick two nonconsecutive adjacencies of $\xi $ at random, the chances that both are 2-free-end, both are 1-free-end, and both are trivial segment adjacencies are respectively given by

$$ \frac{(n-m-k)_{[4]}}{n_{[4]}}, \ \ \frac{4(n-m-k)_{[2]}(2k)_{[2]}}{n_{[4]}}, \ \ \frac{(2k)_{[4]}}{n_{[4]}}, $$

and finally the chance that both are 0-free-end is

$$ \frac{(m-k)_{[4]}+4(n-m+k)(m-k)_{[3]}+4(n-m+k)_{[2]}(m-k)_{[2]}}{n_{[4]}}. $$

Now, for the first part of the theorem, as before we only need to compute the left-hand side of

$$ Var(\alpha _m\mid \Vert \mathcal {I}_m\Vert =k)=Var(\alpha _{m,k})=Var(\alpha (\xi ,I)). $$

For $i=1,\dots ,n-1$, recall the definition of $\hat{\alpha }_{m,i}$ from the proof of Theorem 1, and let $\hat{\alpha }_{m,k,i}$ be a random variable such that $\hat{\alpha }_{m,k,i}=1$ if the $i^{th}$ adjacency of $\xi $, i.e. $\{\xi _i,\xi _{i+1}\}$, is 2-free-end w.r.t. $\mathcal {I}_{m,k}$ and $\hat{\alpha }_{m,k,i}=0$ otherwise. Then, since $\alpha _{m,k}=\sum _{i=1}^{n-1}\hat{\alpha }_{m,k,i}$,

$$\begin{aligned} \mathbbm {E}[\alpha _{m,k}^2]= & {} \sum \limits _i\mathbbm {E}[\hat{\alpha }_{m,k,i}^2]+2\sum \limits _{i> j}\mathbbm {E}[\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}]\\ = & {} \sum \limits _i \mathbbm {P}(\hat{\alpha }_{m,k,i}^2=1)+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}=1)\\ = & {} \sum \limits _i \mathbbm {P}(\hat{\alpha }_{m,k,i}=1)+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}=1)\\ = & {} \mathbbm {E}[\alpha _{m,k}]+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}=1). \end{aligned}$$

Note that

$$\begin{aligned} \sum \limits _{i>j+1}\mathbbm {P}(\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}=1)= & {} \sum \limits _{i>j+1}\frac{(n-m-k)_{[4]}}{n_{[4]}} =\frac{(n-m-k)_{[4]}}{2n(n-1)}, \end{aligned}$$

and

$$\begin{aligned} \sum \limits _{i=j+1}\mathbbm {P}(\hat{\alpha }_{m,k,i} \hat{\alpha }_{m,k,j}=1)= & {} \sum \limits _{i=j+1}\frac{(n-m-k)_{[3]}}{n_{[3]}} =\frac{(n-m-k)_{[3]}}{n(n-1)}. \end{aligned}$$

Hence,

$$\begin{aligned} Var(\alpha _{m,k}) = & {} \mathbbm {E}[\alpha _{m,k}](1-\mathbbm {E}[\alpha _{m,k}])+\frac{(n-m-k)_{[3]}(n-m-k-1)}{n(n-1)}. \end{aligned}$$
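To see where the last term comes from, write $x=n-m-k$ (shorthand used only here) and combine the two sums above, using $(x)_{[4]}=(x)_{[3]}(x-3)$:

$$ 2\left( \frac{(x)_{[4]}}{2n(n-1)}+\frac{(x)_{[3]}}{n(n-1)}\right) =\frac{(x)_{[3]}\left( (x-3)+2\right) }{n(n-1)}=\frac{(x)_{[3]}(x-1)}{n(n-1)}. $$

Adding $\mathbbm {E}[\alpha _{m,k}]$ and subtracting $\mathbbm {E}[\alpha _{m,k}]^2$ then gives the stated expression for $Var(\alpha _{m,k})$.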

Exactly the same calculations give $Var(\alpha (\xi ,I))$. Similarly, we can compute $Var(\beta _{m,k})=Var(\beta (\xi ,I))$, $Var(\gamma _{m,k})=Var(\gamma (\xi ,I))$ and $Var(\delta _{m,k})=Var(\delta (\xi ,I))$. Now, to compute $Var(\alpha _m)$, write $\mathbbm {E}[\alpha _m^2]$ as

$$\begin{aligned} {} & {} \sum \limits _i\mathbbm {E}[\hat{\alpha }_{m,i}^2]+2\sum \limits _{i> j}\mathbbm {E}[\hat{\alpha }_{m,i} \hat{\alpha }_{m,j}] =\sum \limits _i \mathbbm {P}(\hat{\alpha }_{m,i}^2=1)+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,i} \hat{\alpha }_{m,j}=1)\\ {} & {} =\sum \limits _i \mathbbm {P}(\hat{\alpha }_{m,i}=1)+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,i} \hat{\alpha }_{m,j}=1) =\mathbbm {E}[\alpha _m]+2\sum \limits _{i> j}\mathbbm {P}(\hat{\alpha }_{m,i} \hat{\alpha }_{m,j}=1). \end{aligned}$$

Letting $A_{m,k}=A_{m,k}^{(n)}:=\{\Vert \mathcal {I}_m^{(n)}\Vert =k\}$, note that

$$\begin{aligned} \sum \limits _{i>j+1}\mathbbm {P}(\hat{\alpha }_{m,i}\cdot \hat{\alpha }_{m,j}=1)= & {} \sum \limits _{i>j+1}\sum \limits _{k=1}^{m}\frac{(n-m-k)_{[4]}}{n_{[4]}}\mathbbm {P}(A_{m,k})\\ = & {} \sum \limits _{k=1}^{m}\frac{(n-m-k)_{[4]}}{2n(n-1)}\mathbbm {P}(A_{m,k}), \end{aligned}$$

and

$$\begin{aligned} \sum \limits _{i=j+1}\mathbbm {P}(\hat{\alpha }_{m,i} \hat{\alpha }_{m,j}=1)= & {} \sum \limits _{i=j+1}\sum \limits _{k=1}^{m}\frac{(n-m-k)_{[3]}}{n_{[3]}}\mathbbm {P}(A_{m,k})\\ = & {} \sum \limits _{k=1}^{m}\frac{(n-m-k)_{[3]}}{n(n-1)}\mathbbm {P}(A_{m,k}). \end{aligned}$$

Therefore from (1)

$$\begin{aligned} Var(\alpha _m)= & {} \mathbbm {E}[\alpha _m](1-\mathbbm {E}[\alpha _m])+\frac{1}{n(n-1)}\mathbbm {E}[(n-m-\Vert \mathcal {I}_m\Vert )_{[3]}(n-m-\Vert \mathcal {I}_m\Vert -1)]\\ = & {} \mathbbm {E}[\alpha _m](1-\mathbbm {E}[\alpha _m]) +\frac{(n-m)_{[4]}(n-m-1)_{[2]} \left( (n-m)^2-5 n+4+7m\right) }{n_{[5]}(n-1)}\\ = & {} \left( 1-\frac{m}{n}\right) ^4\left( \frac{m}{n}\right) ^2 \left( 8+\frac{m}{n}\left( 5\frac{m}{n}-12\right) \right) n+o(n). \end{aligned}$$

In the same way, we can show that

$$\begin{aligned} Var(\beta _m)= & {} \mathbbm {E}[\beta _m](1-\mathbbm {E}[\beta _m])\\ {} & {} + \, \frac{1}{n(n-1)}\mathbbm {E}[16\Vert \mathcal {I}_m\Vert ^2(n-m-\Vert \mathcal {I}_m\Vert )^2+4\Vert \mathcal {I}_m\Vert ^3-4\Vert \mathcal {I}_m\Vert (n-m)^2]\\ = & {} \mathbbm {E}[\beta _m](1-\mathbbm {E}[\beta _m])+\left( \frac{4m (m-n) (m-n+1)^2}{n_{[5]}(n-1)}\right) \times \\ {} & {} \left\{ (1-4m) n^3+(4m (3m+5)-3) n^2\right. \\ {} & {} \left. - \, (m+1) (3m (4m+11)+1) n+4 (m+1)^2 (m (m+4)+1)\right\} \\ = & {} 4\left( 1-\frac{m}{n}\right) ^3\left( \frac{m}{n}\right) ^2\left( 8-\frac{m}{n}\left( 31+\frac{4m}{n}\left( \frac{5m}{n}-11\right) \right) \right) n+o(n), \end{aligned}$$

$$\begin{aligned} Var(\gamma _m)= & {} \mathbbm {E}[\gamma _m](1-\mathbbm {E}[\gamma _m])+\frac{1}{n(n-1)}\mathbbm {E}[(2\Vert \mathcal {I}_m\Vert )_{[3]}(2\Vert \mathcal {I}_m\Vert -1)]\\ = & {} \mathbbm {E}[\gamma _m](1-\mathbbm {E}[\gamma _m])\\ {} & {} + \, \frac{4 m_{[2]} (n-m)_{[2]} }{n_{[5]}(n-1)}\times \left\{ 4m^4-8m^3 n+4m^2 \left( n^2+n+3\right) \right. \\ {} & {} \left. - \, 4m n (n+3)+n (n+9)-4\right\} \\ = & {} 4\left( 1-\frac{m}{n}\right) ^2\left( \frac{m}{n}\right) ^2\left[ 1-\frac{4m}{n}\left( 1-\frac{m}{n}\right) \left( 1+\frac{5m}{n}\left( 1-\frac{m}{n}\right) \right) \right] n+o(n), \end{aligned}$$

and finally,

$$ \begin{aligned} Var(\delta _m)&=\mathbbm {E}[\delta _m](1-\mathbbm {E}[\delta _m])\\ &\quad +\frac{1}{n(n-1)}\mathbbm {E}[(m-\Vert \mathcal {I}_m\Vert )_{[2]}(2n-m+\Vert \mathcal {I}_m\Vert -2)_{[2]}] +\frac{2(n-2)}{n}\mathbbm {E}[m-\Vert \mathcal {I}_m\Vert ]\\ &=\mathbbm {E}[\delta _m](1-\mathbbm {E}[\delta _m])+\frac{m_{[2]}}{n_{[5]}(n-1)} \times \\ & \left\{ (m-5) m \left( m \left( m^3-10m^2+m+40\right) +4\right) \right. \\ &+4 (m-4) (m+1) n^4+2 (9-23 (m-3) m) n^3\\ &+2 (m (m (51-2 (m-8) m)-235)+50) n^2 \\ &\left. +\,2m (m (13 (m-8) m+121)+170) n+2 n^5-152 n+48\right\} \\ &=\left( \frac{m}{n}\right) ^2\left( 1-\left( \frac{m}{n}\right) ^2\right) ^2\left( 4+\frac{m}{n}\left( \frac{5m}{n}-8\right) \right) n+o(n). \end{aligned} $$

$\square $
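The conditional variance formula for $\alpha _{m,k}$ obtained in the proof of Theorem 2 can be compared with an exhaustive enumeration for a small case. The sketch below assumes a hypothetical segment set I consisting of the single segment 1-2-3 in n = 7 genes (so m = 2, k = 1 and genes 4 to 7 are external), and counts the 2-free-end adjacencies of every permutation $\xi $ as those joining two external genes.

```python
from fractions import Fraction
from itertools import permutations


def falling(x, j):
    """Falling factorial x (x-1) ... (x-j+1), written (x)_[j] in the text."""
    out = 1
    for t in range(j):
        out *= x - t
    return out


def alpha(perm, external):
    """Number of adjacencies of perm whose two genes are both external to I."""
    return sum(1 for a, b in zip(perm, perm[1:]) if a in external and b in external)


def brute_force_variance(n, external):
    """Exact variance of alpha over all n! equally likely permutations."""
    values = [alpha(p, external) for p in permutations(range(1, n + 1))]
    mean = Fraction(sum(values), len(values))
    second = Fraction(sum(v * v for v in values), len(values))
    return second - mean * mean


def formula_variance(n, m, k):
    """Var(alpha_{m,k}) as derived in the proof."""
    x = n - m - k
    mean = Fraction(x * (x - 1), n)
    return mean * (1 - mean) + Fraction(falling(x, 3) * (x - 1), n * (n - 1))


n, m, k = 7, 2, 1
external = {4, 5, 6, 7}
print(brute_force_variance(n, external), formula_variance(n, m, k))
```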

Proof of Theorem 3. First observe that, by Theorem 1, as $n\rightarrow \infty $,

$$ \begin{array}{ll} {\displaystyle \mathbbm {E}\left[ \frac{\tilde{\alpha }_n}{n}\right] \rightarrow (1-c)^4}, & {\displaystyle \mathbbm {E}\left[ \frac{\bar{\alpha }_n}{n}\right] , \ \mathbbm {E}\left[ \frac{\alpha (\xi ^{(n)},\hat{I}_n)}{n}\right] \rightarrow (1-c-c')^2},\\[2ex] {\displaystyle \mathbbm {E}\left[ \frac{\tilde{\beta }_n}{n}\right] \rightarrow 4c(1-c)^3}, & {\displaystyle \mathbbm {E}\left[ \frac{\bar{\beta }_n}{n}\right] , \ \mathbbm {E}\left[ \frac{\beta (\xi ^{(n)},\hat{I}_n)}{n}\right] \rightarrow 4c'(1-c-c')},\\[2ex] {\displaystyle \mathbbm {E}\left[ \frac{\tilde{\gamma }_n}{n}\right] \rightarrow 4c^2(1-c)^2}, & {\displaystyle \mathbbm {E}\left[ \frac{\bar{\gamma }_n}{n}\right] , \ \mathbbm {E}\left[ \frac{\gamma (\xi ^{(n)},\hat{I}_n)}{n}\right] \rightarrow 4c'^2},\\[2ex] {\displaystyle \mathbbm {E}\left[ \frac{\tilde{\delta }_n}{n}\right] \rightarrow c^2(2-c^2)}, & {\displaystyle \mathbbm {E}\left[ \frac{\bar{\delta }_n}{n}\right] , \ \mathbbm {E}\left[ \frac{\delta (\xi ^{(n)},\hat{I}_n)}{n}\right] \rightarrow (c-c')(2-c+c')}. \end{array} $$

Also, following Theorem 2, the variances of all these sequences converge to 0. Hence, the convergence in $L^2$ and in probability holds. $\square $
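The concentration behind this argument can be observed in a quick simulation. The sketch below rests on several assumptions: the segment set is a uniformly random choice of m adjacencies of the identity with $m/n\approx c$, $\xi $ is a uniformly random permutation, and an adjacency of $\xi $ is classified by how many of its genes are external, end or internal points, as in the proof of Theorem 1. The function name and the parameter values are hypothetical. Only the first three frequencies are compared with their limits; the 0-free-end count is returned as the remainder.

```python
import random


def classify_counts(n, m, rng):
    """Sample I_m and xi; return counts of 2-free-end, 1-free-end, trivial, 0-free-end adjacencies."""
    chosen = rng.sample(range(1, n), m)  # i stands for the adjacency {i, i+1} of the identity
    # degree[g] = number of selected adjacencies at gene g: 0 external, 1 end point, 2 internal
    degree = [0] * (n + 2)
    for i in chosen:
        degree[i] += 1
        degree[i + 1] += 1
    xi = list(range(1, n + 1))
    rng.shuffle(xi)
    counts = [0, 0, 0, 0]
    for a, b in zip(xi, xi[1:]):
        da, db = degree[a], degree[b]
        if da == 2 or db == 2:
            counts[3] += 1  # 0-free-end: at least one internal point
        elif da == 0 and db == 0:
            counts[0] += 1  # 2-free-end: both genes external
        elif da == 1 and db == 1:
            counts[2] += 1  # trivial segment adjacency: both genes are end points
        else:
            counts[1] += 1  # 1-free-end: one external gene, one end point
    return counts


rng = random.Random(0)
n, c = 20000, 0.3
counts = classify_counts(n, int(c * n), rng)
limits = [(1 - c) ** 4, 4 * c * (1 - c) ** 3, 4 * c ** 2 * (1 - c) ** 2]
for observed, limit in zip(counts, limits):
    print(round(observed / n, 4), round(limit, 4))
```

With n in the tens of thousands the observed frequencies typically agree with the limits to about two decimal places, consistent with variances of order n.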

Proof of Lemma 3. Suppose $\{a,b\} \in F(x,I)\setminus \mathcal {A}_\pi $. Since $a,b\in Ext(I)$, the neighbours of a in $\pi $ must belong to $\mathcal {N}_x(a)\setminus \{b\}$ and the neighbours of b in $\pi $ must belong to $\mathcal {N}_x(b)\setminus \{a\}$, so $|\mathcal {N}_\pi (a)|,|\mathcal {N}_\pi (b)| \le 1$. Neither $|\mathcal {N}_\pi (a)|$ nor $|\mathcal {N}_\pi (b)|$ can be 0, since otherwise a or b could not be connected to the remaining elements to construct $\pi $. Therefore $|\mathcal {N}_\pi (a)|=|\mathcal {N}_\pi (b)|=1$, which means that a and b are the extremities of the permutation $\pi $, i.e. $\{\pi _1,\pi _n\}=\{a,b\}$. In other words, there exists at most one adjacency $\{a,b\}\in F(x,I)\setminus \mathcal {A}_\pi $. This proves part (a). For part (b), suppose $\pi '\in \overline{[id,x]}$ and there exists an adjacency $\{a,b\}$ such that $\{a,b\}\in F(x,I)\setminus \mathcal A_{\pi '}$. As shown above, $\{\pi '_1,\pi '_n\}=\{a,b\}$. Moreover, a and b are connected in $\pi '$ through a segment of $\pi '$ containing at least one segment of I, so the segment of $\pi '$ connecting a to b contains at least one ${<}1,2{>}$-adjacency (${<}1,2{>}$-segment), say e, and hence e is not in F(x, I). Therefore, we can construct a new permutation $\pi $ by cutting e in $\pi '$ and joining a to b. This proves part (b). $\square $

Cite this paper

da Silva, P.H., Jamshidpey, A., Sankoff, D. (2024). Sampling Gene Adjacencies and Geodesic Points of Random Genomes. In: Scornavacca, C., Hernández-Rosales, M. (eds) Comparative Genomics. RECOMB-CG 2024. Lecture Notes in Computer Science(), vol 14616. Springer, Cham. https://doi.org/10.1007/978-3-031-58072-7_10

DOI: https://doi.org/10.1007/978-3-031-58072-7_10
Published: 15 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58071-0
Online ISBN: 978-3-031-58072-7
eBook Packages: Computer Science, Computer Science (R0)
