MPI-HMMER-Boost: Distributed FPGA Acceleration

Walters, John Paul; Meng, Xiandong; Chaudhary, Vipin; Oliver, Tim; Yeow, Leow Yuan; Schmidt, Bertil; Nathan, Darran; Landman, Joseph

doi:10.1007/s11265-007-0062-9

Abstract

HMMER, based on the profile Hidden Markov Model (HMM), is one of the most widely used sequence database search tools, allowing researchers to compare HMMs to sequence databases or sequences to HMM databases. Such searches often take many hours and consume a great number of CPU cycles on modern computers. We present a cluster-enabled, hardware/software-accelerated implementation of the HMMER search tool hmmsearch. Our results show that combining the parallel efficiency of a cluster with one or more high-speed hardware accelerators (FPGAs) can significantly improve performance for even the most time-consuming searches, often reducing search times from several hours to minutes.
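
The abstract does not describe how work is divided across the cluster, so the following is only a minimal sketch of one common pattern, assuming a master that scatters database chunks to workers and gathers their hits. Everything in it is hypothetical: `split_database`, `toy_score`, `score_chunk`, and `distributed_search` are illustrative names, threads stand in for MPI ranks, and `toy_score` is a trivial motif counter standing in for profile-HMM scoring (which in the paper's setting would run in software or on an FPGA).

```python
from concurrent.futures import ThreadPoolExecutor


def split_database(sequences, n_chunks):
    """Split a list of sequences into n_chunks nearly equal contiguous chunks."""
    size, extra = divmod(len(sequences), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional sequence.
        end = start + size + (1 if i < extra else 0)
        chunks.append(sequences[start:end])
        start = end
    return chunks


def toy_score(sequence, motif="ACD"):
    """Hypothetical stand-in for profile-HMM scoring: counts motif occurrences."""
    return sequence.count(motif)


def score_chunk(chunk):
    """Work done by one worker: score every sequence in its chunk."""
    return [(seq, toy_score(seq)) for seq in chunk]


def distributed_search(sequences, n_workers=2, threshold=1):
    """Master: scatter chunks to workers, gather hits, sort by score."""
    chunks = split_database(sequences, n_workers)
    # Threads are an assumption made for a self-contained example;
    # a cluster implementation would use message passing instead.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_worker = list(pool.map(score_chunk, chunks))
    hits = [hit for part in per_worker for hit in part if hit[1] >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    database = ["ACDACD", "GGG", "TACD", "ACDACDACD", "KLM"]
    for seq, score in distributed_search(database, n_workers=2):
        print(seq, score)
```

In this pattern each worker scores its chunk independently, so a faster scoring back end can be swapped in per worker without changing the scatter and gather steps.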

Author information

Authors and Affiliations

Institute for Scientific Computing, Wayne State University, Detroit, MI, 48202, USA
John Paul Walters
Electrical and Computer Engineering Department, Wayne State University, Detroit, MI, 48202, USA
Xiandong Meng
Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY, 14260, USA
Vipin Chaudhary
Progeniq Pte Ltd., 8 Prince George’s Park, 118407, Singapore, Singapore
Tim Oliver, Leow Yuan Yeow & Darran Nathan
UNSW Asia, 1 Kay Siang Road, 248922, Queenstown, Singapore
Bertil Schmidt
Scalable Informatics LLC, 2433 Woodmont, Canton, MI, 48188, USA
Joseph Landman

Corresponding author

Correspondence to John Paul Walters.

Additional information

John Paul Walters: This research was supported in part by NSF IGERT grant 9987598 and the Institute for Scientific Computing at Wayne State University.

Vipin Chaudhary: This research was supported in part by NSF IGERT grant 9987598, the Institute for Scientific Computing at Wayne State University, MEDC/Michigan Life Science Corridor, and NYSTAR.

About this article

Cite this article

Walters, J.P., Meng, X., Chaudhary, V. et al. MPI-HMMER-Boost: Distributed FPGA Acceleration. J VLSI Sign Process Syst Sign Im 48, 223–238 (2007). https://doi.org/10.1007/s11265-007-0062-9

Received: 15 December 2006
Revised: 05 March 2007
Accepted: 17 March 2007
Published: 03 August 2007
Issue Date: September 2007
DOI: https://doi.org/10.1007/s11265-007-0062-9
