Identification of Variant Compositions in Related Strains Without Reference

Rautiainen, Mikko; Salmela, Leena; Mäkinen, Veli

doi:10.1007/978-3-319-38827-4_13

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9702)

Included in the following conference series:

International Conference on Algorithms for Computational Biology

Abstract

Current DNA sequencing technologies do not read an entire chromosome from end to end but instead produce sets of short reads, i.e., fragments of the genome. Haplotype assembly is the problem of assigning each read to the correct chromosome within a group of homologous chromosomes, with the aid of a reference sequence. In this paper, we extend an existing exact algorithm for haplotype assembly of diploid species (Patterson et al., 2014) to the reference-free, polyploid case. A reference-free method does not use a reference genomic sequence of the species, so no known linear order is available for the reads and the resulting variant positions. We therefore obtain an unordered variant composition as a result. This setting can also be applied to the study of relative abundances of related bacterial strains.
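To make the problem statement concrete, the following is a minimal sketch of read-to-haplotype assignment. It is not the algorithm of this paper or of Patterson et al. It assumes a minimum error correction (MEC) objective, as named in the cited HMEC reference, and it tries every assignment by brute force. Reads are modeled as dictionaries from hypothetical variant labels to binary alleles, so no linear order of variants is used. The function names `mec_cost` and `brute_force_assembly` are illustrative only.

```python
from itertools import product


def mec_cost(reads, assignment, k):
    """Return (cost, haplotypes) for a fixed assignment of reads to k groups.

    Each read is a dict {variant_label: allele}, with allele 0 or 1.
    The cost counts the allele calls that disagree with the majority
    allele of their group (ties are resolved towards allele 0).
    """
    cost = 0
    haplotypes = []
    for group in range(k):
        members = [r for r, g in zip(reads, assignment) if g == group]
        variants = {v for r in members for v in r}
        consensus = {}
        for v in variants:
            alleles = [r[v] for r in members if v in r]
            ones = sum(alleles)
            zeros = len(alleles) - ones
            consensus[v] = 1 if ones > zeros else 0
            cost += min(ones, zeros)  # calls that must be corrected
        haplotypes.append(consensus)
    return cost, haplotypes


def brute_force_assembly(reads, k):
    """Try all k**len(reads) assignments and return the cheapest one.

    Exponential time: only suitable for toy inputs.
    """
    best = None
    for assignment in product(range(k), repeat=len(reads)):
        cost, haplotypes = mec_cost(reads, assignment, k)
        if best is None or cost < best[0]:
            best = (cost, assignment, haplotypes)
    return best


if __name__ == "__main__":
    # Toy input: five error-free reads over three unordered variants.
    reads = [
        {"a": 0, "b": 0},
        {"b": 0, "c": 1},
        {"a": 1, "b": 1},
        {"b": 1, "c": 0},
        {"a": 0, "c": 1},
    ]
    cost, assignment, haplotypes = brute_force_assembly(reads, k=2)
    print(cost, assignment, haplotypes)
```

The result is a set of per-group allele dictionaries keyed by variant label rather than by position, which is one simple way to picture an unordered composition. The number of groups `k` stands in for the ploidy or the number of strains and is an input of the sketch.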

References

Aguiar, D., Istrail, S.: Haplotype assembly in polyploid genomes and identical by descent shared tracts. Bioinformatics 29(13), i352–i360 (2013). http://bioinformatics.oxfordjournals.org/content/29/13/i352.abstract
Astrovskaya, I., et al.: Inferring viral quasispecies spectra from 454 pyrosequencing reads. BMC Bioinform. 12(Suppl. 6), S1 (2011)
Bayzid, S., et al.: HMEC: a heuristic algorithm for individual haplotyping with minimum error correction. ISRN Bioinform. 2013, 10 (2013)
Berger, E., et al.: HapTree: a novel Bayesian framework for single individual polyplotyping using NGS data. PLOS Comput. Biol. 10(3), e1003502 (2014)
Chen, Z., Deng, F., Wang, L.: Exact algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics 29(16), 1938–1945 (2013)
Cilibrasi, R., van Iersel, L., Kelk, S., Tromp, J.: On the complexity of several haplotyping problems. In: Casadio, R., Myers, G. (eds.) WABI 2005. LNCS (LNBI), vol. 3692, pp. 128–139. Springer, Heidelberg (2005). http://dx.doi.org/10.1007/11557067_11
Correll, D.S.: The Potato and Its Wild Relatives. Texas Research Foundation, Renner (1962)
Das, S., Vikalo, H.: SDhaP: haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics 16(1), 260 (2015). http://dx.doi.org/10.1186/s12864-015-1408-5
Deng, F., Cui, W., Wang, L.: A highly accurate heuristic algorithm for the haplotype assembly problem. BMC Genomics 14(Suppl. 2), S2 (2013)
He, D., et al.: Optimal algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics 26(12), i183–i190 (2010). http://bioinformatics.oxfordjournals.org/content/26/12/i183.abstract
Junttila, E.: Patterns in permuted binary matrices. Ph.D. thesis, University of Helsinki (2011)
Kuleshov, V.: Probabilistic single-individual haplotyping. Bioinformatics 30(17), i379–i385 (2014). http://bioinformatics.oxfordjournals.org/content/30/17/i379.abstract
Lin, S., et al.: Haplotype inference in random population samples. Am. J. Hum. Genet. 71(5), 1129–1137 (2002)
Lippert, R., et al.: Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in Bioinform. 3(1), 23–31 (2002). http://bib.oxfordjournals.org/content/3/1/23.abstract
Mäkinen, V., et al.: Interval scheduling maximizing minimum coverage. CoRR abs/1508.07820 (2015). http://arxiv.org/abs/1508.07820
Neigenfind, J., et al.: Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT. BMC Genomics 9, 356 (2008)
Patterson, M., et al.: WhatsHap: weighted haplotype assembly for future-generation sequencing reads. J. Comput. Biol. 22(6), 498–509 (2015)
Rautiainen, M.: Identification of variant compositions in related strains without reference. Master’s thesis, University of Helsinki (2016)
Stephens, J.C., et al.: Haplotype variation and linkage disequilibrium in 313 human genes. Science 293(5529), 489–493 (2001). http://www.sciencemag.org/content/293/5529/489.abstract
Su, S.Y., et al.: Inference of haplotypic phase and missing genotypes in polyploid organisms and variably copy number genomic regions. BMC Bioinform. 9, 513 (2008)
Tewhey, R., et al.: The importance of phase information for human genetics. Nat. Rev. Genet. 12, 215–223 (2011)
Uricaru, R., et al.: Reference-free detection of isolated SNPs. Nucleic Acids Res. 43(2), e11 (2014)

Acknowledgements

This work was supported in part by the Academy of Finland (grants 267591 to L.S. and 284598 (CoECGR)).

Author information

Authors and Affiliations

Department of Computer Science, Helsinki Institute for Information Technology, University of Helsinki, Helsinki, Finland
Mikko Rautiainen, Leena Salmela & Veli Mäkinen

Corresponding author

Correspondence to Veli Mäkinen.

Editor information

Editors and Affiliations

CETA-Ciemat, Trujillo, Spain
María Botón-Fernández
Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
University of Extremadura, Cáceres, Spain
Sergio Santander-Jiménez
University of Extremadura, Cáceres, Spain
Miguel A. Vega-Rodríguez

Cite this paper

Rautiainen, M., Salmela, L., Mäkinen, V. (2016). Identification of Variant Compositions in Related Strains Without Reference. In: Botón-Fernández, M., Martín-Vide, C., Santander-Jiménez, S., Vega-Rodríguez, M.A. (eds) Algorithms for Computational Biology. AlCoB 2016. Lecture Notes in Computer Science, vol 9702. Springer, Cham. https://doi.org/10.1007/978-3-319-38827-4_13

DOI: https://doi.org/10.1007/978-3-319-38827-4_13
Published: 12 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-38826-7
Online ISBN: 978-3-319-38827-4
eBook Packages: Computer Science, Computer Science (R0)
