Identification of Variant Compositions in Related Strains Without Reference

  • Conference paper
Algorithms for Computational Biology (AlCoB 2016)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9702)

Abstract

Current DNA sequencing technologies do not read an entire chromosome from end to end; instead they produce sets of short reads, i.e. fragments of the genome. Haplotype assembly is the problem of assigning each read to the correct chromosome within a group of homologous chromosomes, with the aid of a reference sequence. In this paper, we extend an existing exact algorithm for haplotype assembly of diploid species (Patterson et al., 2014) to the reference-free, polyploid case. A reference-free method does not use a reference genome of the species, so no known linear order is available for the reads and the variant positions they cover. The result is therefore an unordered variant composition. This setting can also be applied to the study of relative abundances of related bacterial strains.
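To make the read-assignment problem concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a minimum-error-correction style objective (the kind of objective the cited diploid algorithm optimizes), represents each read as a mapping from a variant identifier to an observed allele, and treats variant identifiers as unordered labels, as in the reference-free setting. The names `partition_cost`, `best_partition`, and the toy reads are hypothetical, and the exhaustive search is only feasible for a handful of reads.

```python
from itertools import product
from collections import Counter


def partition_cost(reads, assignment, k):
    """Count the alleles that must be flipped so reads in each group agree."""
    cost = 0
    for group in range(k):
        counts = {}  # variant id -> Counter of alleles seen in this group
        for read, g in zip(reads, assignment):
            if g != group:
                continue
            for variant, allele in read.items():
                counts.setdefault(variant, Counter())[allele] += 1
        # Every allele that differs from the majority allele is one correction.
        for c in counts.values():
            cost += sum(c.values()) - max(c.values())
    return cost


def best_partition(reads, k):
    """Try every assignment of reads to k groups (toy input sizes only)."""
    best = None
    for assignment in product(range(k), repeat=len(reads)):
        cost = partition_cost(reads, assignment, k)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical reads: variant ids are labels with no positional order.
reads = [
    {"v1": 0, "v2": 0},
    {"v2": 0, "v3": 1},
    {"v1": 1, "v2": 1},
    {"v2": 1, "v3": 0},
    {"v1": 0, "v3": 1},
]
cost, assignment = best_partition(reads, k=2)
print(cost, assignment)
```

Setting `k` above 2 corresponds to the polyploid (or multi-strain) case; an exact method of practical use would replace the exhaustive enumeration with a far more efficient search.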

References

  1. Aguiar, D., Istrail, S.: Haplotype assembly in polyploid genomes and identical by descent shared tracts. Bioinformatics 29(13), i352–i360 (2013). http://bioinformatics.oxfordjournals.org/content/29/13/i352.abstract

  2. Astrovskaya, I., et al.: Inferring viral quasispecies spectra from 454 pyrosequencing reads. BMC Bioinform. 12(Suppl. 6), S1 (2011)

  3. Bayzid, S., et al.: HMEC: a heuristic algorithm for individual haplotyping with minimum error correction. ISRN Bioinform. 2013, 10 (2013)

  4. Berger, E., et al.: HapTree: a novel Bayesian framework for single individual polyplotyping using NGS data. PLOS Comput. Biol. 10(3), e1003502 (2014)

  5. Chen, Z., Deng, F., Wang, L.: Exact algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics 29(16), 1938–1945 (2013)

  6. Cilibrasi, R., van Iersel, L., Kelk, S., Tromp, J.: On the complexity of several haplotyping problems. In: Casadio, R., Myers, G. (eds.) WABI 2005. LNCS (LNBI), vol. 3692, pp. 128–139. Springer, Heidelberg (2005). http://dx.doi.org/10.1007/11557067_11

  7. Correll, D.S.: The Potato and Its Wild Relatives. Texas Research Foundation, Renner (1962)

  8. Das, S., Vikalo, H.: SDhaP: haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics 16(1), 260 (2015). http://dx.doi.org/10.1186/s12864-015-1408-5

  9. Deng, F., Cui, W., Wang, L.: A highly accurate heuristic algorithm for the haplotype assembly problem. BMC Genomics 14(Suppl. 2), S2 (2013)

  10. He, D., et al.: Optimal algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics 26(12), i183–i190 (2010). http://bioinformatics.oxfordjournals.org/content/26/12/i183.abstract

  11. Junttila, E.: Patterns in permuted binary matrices. Ph.D. thesis, University of Helsinki (2011)

  12. Kuleshov, V.: Probabilistic single-individual haplotyping. Bioinformatics 30(17), i379–i385 (2014). http://bioinformatics.oxfordjournals.org/content/30/17/i379.abstract

  13. Lin, S., et al.: Haplotype inference in random population samples. Am. J. Hum. Genet. 71(5), 1129–1137 (2002)

  14. Lippert, R., et al.: Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in Bioinform. 3(1), 23–31 (2002). http://bib.oxfordjournals.org/content/3/1/23.abstract

  15. Mäkinen, V., et al.: Interval scheduling maximizing minimum coverage. CoRR abs/1508.07820 (2015). http://arxiv.org/abs/1508.07820

  16. Neigenfind, J., et al.: Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT. BMC Genomics 9, 356 (2008)

  17. Patterson, M., et al.: WhatsHap: weighted haplotype assembly for future-generation sequencing reads. J. Comput. Biol. 22(6), 498–509 (2015)

  18. Rautiainen, M.: Identification of variant compositions in related strains without reference. Master’s thesis, University of Helsinki (2016)

  19. Stephens, J.C., et al.: Haplotype variation and linkage disequilibrium in 313 human genes. Science 293(5529), 489–493 (2001). http://www.sciencemag.org/content/293/5529/489.abstract

  20. Su, S.Y., et al.: Inference of haplotypic phase and missing genotypes in polyploid organisms and variably copy number genomic regions. BMC Bioinform. 9, 513 (2008)

  21. Tewhey, R., et al.: The importance of phase information for human genetics. Nat. Rev. Genet. 12, 215–223 (2011)

  22. Uricaru, R., et al.: Reference-free detection of isolated SNPs. Nucleic Acids Res. 43(2), e11 (2014)

Acknowledgements

This work was supported in part by the Academy of Finland (grants 267591 to L.S. and 284598 (CoECGR)).

Author information

Corresponding author

Correspondence to Veli Mäkinen.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Rautiainen, M., Salmela, L., Mäkinen, V. (2016). Identification of Variant Compositions in Related Strains Without Reference. In: Botón-Fernández, M., Martín-Vide, C., Santander-Jiménez, S., Vega-Rodríguez, M.A. (eds) Algorithms for Computational Biology. AlCoB 2016. Lecture Notes in Computer Science, vol 9702. Springer, Cham. https://doi.org/10.1007/978-3-319-38827-4_13

  • DOI: https://doi.org/10.1007/978-3-319-38827-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-38826-7

  • Online ISBN: 978-3-319-38827-4

  • eBook Packages: Computer Science, Computer Science (R0)
