Large Scale Protein Sequence Alignment Using FPGA Reprogrammable Logic Devices

Dydel, Stefan; Bała, Piotr

doi:10.1007/978-3-540-30117-2_5


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3203)

Included in the following conference series:

International Conference on Field Programmable Logic and Applications


Abstract

In this paper we show how to significantly accelerate the Smith-Waterman protein sequence alignment algorithm using reprogrammable logic devices: FPGAs (Field Programmable Gate Arrays). Because of its perfect sensitivity, the Smith-Waterman algorithm is important in computational biology, but its computational complexity makes it impractical for large database searches on general-purpose computers.
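
For orientation, the recurrence usually given for Smith-Waterman local alignment with a linear gap penalty can be written as follows. The symbols H (score table), s (substitution score), and g (gap penalty) are notation chosen here for illustration; the abstract does not state which gap model the paper uses.

```latex
H_{i,0} = H_{0,j} = 0, \qquad
H_{i,j} = \max\bigl\{\, 0,\; H_{i-1,j-1} + s(a_i, b_j),\; H_{i-1,j} - g,\; H_{i,j-1} - g \,\bigr\}
```

Filling the table for sequences of lengths m and n takes O(mn) cell updates, and a database search repeats this for every database entry.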

The current approach supports amino acid sequence alignment with a full substitution matrix, which leads to a more complex formula than the one used in DNA alignment and is much more memory demanding. We propose a parallelization scheme different from the commonly used systolic arrays, leading to full utilization of the PUs (Processing Units) regardless of sequence length. An FPGA-based implementation of the Smith-Waterman algorithm can accelerate sequence alignment on a Pentium desktop computer by two orders of magnitude compared to the standard OSEARCH program from the FASTA package.
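
A minimal software sketch of the scoring step, assuming a linear gap penalty, shows where the substitution matrix enters the computation. The names `smith_waterman_score` and `toy_subst` and the scoring values are illustrative assumptions, not taken from the paper; a real protein search would look up a 20x20 matrix such as BLOSUM62 instead of the toy scorer.

```python
def toy_subst(x, y):
    """Hypothetical stand-in for a substitution matrix lookup."""
    return 2 if x == y else -1


def smith_waterman_score(a, b, subst=toy_subst, gap=4):
    """Return the best local alignment score of sequences a and b.

    Uses a linear gap penalty (an assumption) and keeps only two
    rows of the score table, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the score table
    best = 0
    for x in a:
        curr = [0]  # column 0 is always 0
        for j, y in enumerate(b, start=1):
            h = max(
                0,                          # start a new local alignment
                prev[j - 1] + subst(x, y),  # align x with y
                prev[j] - gap,              # gap in b
                curr[j - 1] - gap,          # gap in a
            )
            curr.append(h)
            best = max(best, h)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("WACDW", "KACDK"))
```

For DNA the scorer often reduces to a match/mismatch test, whereas a protein scorer needs one table entry per residue pair; this is the extra memory demand the abstract refers to.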



Author information

Authors and Affiliations

Faculty of Mathematics and Computer Science, N. Copernicus University, Chopina 12/8, 87-100, Torun, Poland
Stefan Dydel & Piotr Bała


Editor information

Editors and Affiliations

ITIV - Universitaet Karlsruhe (TH),
Jürgen Becker
University of Paderborn,
Marco Platzner
IMEC, Kapeldreef 75, Leuven, Belgium
Serge Vernalde


About this paper

Cite this paper

Dydel, S., Bała, P. (2004). Large Scale Protein Sequence Alignment Using FPGA Reprogrammable Logic Devices. In: Becker, J., Platzner, M., Vernalde, S. (eds) Field Programmable Logic and Application. FPL 2004. Lecture Notes in Computer Science, vol 3203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30117-2_5


DOI: https://doi.org/10.1007/978-3-540-30117-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22989-6
Online ISBN: 978-3-540-30117-2
eBook Packages: Springer Book Archive
