Research article, DOI: 10.1145/2488551.2488593

Parallel efficient aligner of pyrosequencing reads

Published: 15 September 2013

Abstract

In resequencing projects, mapping reads to a reference genome efficiently and accurately is a critical problem in bioinformatics. One instance of this problem is the local alignment of pyrosequencing reads produced by the 454 GS FLX system against a reference sequence, for which the software tool TAPyR (Tool for the Alignment of Pyrosequencing Reads) was developed. TAPyR implements a methodology that solves this problem efficiently and has been shown to yield better results, in both content and execution speed, than mainstream applications. To improve these results further, we produced a parallel implementation of the query and reference sequence access procedures of the original version. Using multithreading, the new version, P-TAPyR, considerably reduces the processing time of queries, scaling with the number of available hardware-supported threads (not counting hyper-threading). For larger data sets, we observed running times roughly 26 times faster than serial execution with 30 executing threads, corresponding to an experimentally determined serial fraction of 0.8%, which decreases progressively as the data set grows (determined with the Karp-Flatt metric, described in a later section). Herein we present the modifications made to this software tool to allow parallel querying of reads against an indexed reference, which scales proportionally to the number of available physical cores.
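As an illustration of how an experimentally determined serial fraction relates to a measured speedup, the sketch below uses the standard Karp-Flatt formula, e = (1/S - 1/p) / (1 - 1/p), where S is the measured speedup and p the number of threads. The function names are hypothetical and not taken from P-TAPyR. The abstract's figures are rounded ("roughly 26 times"), so plugging in S = 26 and p = 30 gives about 0.5%, the same order of magnitude as the reported 0.8% rather than an exact match.

```python
def karp_flatt(speedup: float, threads: int) -> float:
    """Experimentally determined serial fraction e (Karp-Flatt metric).

    speedup: measured speedup S = T_serial / T_parallel
    threads: number of executing threads p (must be at least 2)
    """
    if threads < 2:
        raise ValueError("the metric is only defined for two or more threads")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / threads) / (1.0 - 1.0 / threads)


def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Rounded figures from the abstract: ~26x speedup on 30 threads.
    e = karp_flatt(speedup=26.0, threads=30)
    print(f"serial fraction for S=26, p=30: {e:.2%}")

    # Conversely, the speedup Amdahl's law predicts for a 0.8% serial fraction.
    s = amdahl_speedup(serial_fraction=0.008, threads=30)
    print(f"predicted speedup for e=0.8%, p=30: {s:.1f}")
```

The two functions are inverses of each other for a fixed thread count, which makes it easy to check a reported serial fraction against a reported speedup.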


Published In

EuroMPI '13: Proceedings of the 20th European MPI Users' Group Meeting
September 2013
289 pages
ISBN: 9781450319034
DOI: 10.1145/2488551

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Sponsors

• ARCOS: Computer Architecture and Technology Area, Universidad Carlos III de Madrid

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. TAPyR
2. mapper
3. parallel
4. query
5. resequencing
6. sequence


Conference

EuroMPI '13: 20th European MPI Users' Group Meeting
September 15 - 18, 2013
Madrid, Spain

Acceptance Rates

EuroMPI '13 paper acceptance rate: 22 of 47 submissions, 47%
Overall acceptance rate: 66 of 139 submissions, 47%
