Abstract
HMMER, based on the profile Hidden Markov Model (HMM), is one of the most widely used sequence database searching tools, allowing researchers to compare HMMs to sequence databases or sequences to HMM databases. Such searches often take many hours and consume a great number of CPU cycles on modern computers. We present a cluster-enabled hardware/software-accelerated implementation of the HMMER search tool hmmsearch. Our results show that combining the parallel efficiency of a cluster with one or more high-speed hardware accelerators (FPGAs) can significantly improve performance for even the most time-consuming searches, often reducing search times from several hours to minutes.
Additional information
John Paul Walters: This research was supported in part by NSF IGERT grant 9987598 and the Institute for Scientific Computing at Wayne State University.
Vipin Chaudhary: This research was supported in part by NSF IGERT grant 9987598, the Institute for Scientific Computing at Wayne State University, MEDC/Michigan Life Science Corridor, and NYSTAR.
About this article
Cite this article
Walters, J.P., Meng, X., Chaudhary, V. et al. MPI-HMMER-Boost: Distributed FPGA Acceleration. J VLSI Sign Process Syst Sign Im 48, 223–238 (2007). https://doi.org/10.1007/s11265-007-0062-9
DOI: https://doi.org/10.1007/s11265-007-0062-9