A lightweight BLASTP and its implementation on CUDA GPUs

The Journal of Supercomputing

Abstract

The BLAST server at the National Center for Biotechnology Information in the USA receives tens of thousands of queries per day on average. However, every query is served in the same way even though query lengths vary significantly. In fact, a large portion of protein sequences are shorter than 500 residues. Moreover, hit detection consumes most of the execution time of BLAST, and its core data structure is a lookup table. For these reasons, we propose a lightweight BLASTP for servicing not-too-long queries, built around a hybrid query-index table. Each table entry consists of four bytes and can store up to three query positions. Therefore, a sequence word usually requires only one memory fetch to retrieve its hit information. Furthermore, additional dummy entries are embedded into the table and interleaved with the original entries. Both the dummy entries and the entries without any hits can be used to buffer spilled query positions. These features result in a much smaller lookup table with a higher utilization rate and a lower cache miss ratio. Experimental results show that the lightweight BLASTP outperforms CUDA-BLASTP with speedups ranging from 1.82 to 3.37 on the first two critical phases.
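The four-byte entry described in the abstract can be illustrated with a small bit-packing sketch. The abstract does not give the actual bit layout, so the layout below is an assumption made only for illustration: three 10-bit position slots (enough for queries shorter than 1024 residues) plus a 2-bit count in the top bits. The names `pack_entry` and `unpack_entry` are hypothetical, and the handling of spilled positions in empty or dummy entries is not shown because the abstract does not specify how they are located.

```python
# Sketch of a 32-bit table entry holding up to three query positions.
# ASSUMED layout (not taken from the paper):
#   bits 0-9   : position slot 0
#   bits 10-19 : position slot 1
#   bits 20-29 : position slot 2
#   bits 30-31 : number of valid slots (0..3)

POS_BITS = 10                       # assumed slot width: positions 0..1023
POS_MASK = (1 << POS_BITS) - 1
MAX_POS = 3                         # from the abstract: up to three positions


def pack_entry(positions):
    """Pack up to three query positions into one 32-bit integer."""
    if len(positions) > MAX_POS:
        # In the paper such positions spill into other entries; not modeled here.
        raise ValueError("more than three positions do not fit in one entry")
    entry = len(positions) << (MAX_POS * POS_BITS)   # count in the top 2 bits
    for slot, pos in enumerate(positions):
        if not 0 <= pos <= POS_MASK:
            raise ValueError("position does not fit in a 10-bit slot")
        entry |= pos << (slot * POS_BITS)
    return entry


def unpack_entry(entry):
    """Recover the list of query positions from a packed 32-bit entry."""
    count = entry >> (MAX_POS * POS_BITS)
    return [(entry >> (slot * POS_BITS)) & POS_MASK for slot in range(count)]


if __name__ == "__main__":
    e = pack_entry([5, 17, 499])
    print(hex(e), unpack_entry(e))
```

With a layout of this kind, a word whose hits number three or fewer is resolved with a single 32-bit read, which is the property the abstract attributes to the hybrid query-index table.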

Fig. 1 (Adapted from Trends in Genetics. Please refer to Ref. [25] for more detailed information)

References

  1. Mount DW (2001) Bioinformatics: sequence and genome analysis. CSHL, New York, pp 75–85

  2. Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195–197

  3. Gotoh O (1982) An improved algorithm for matching biological sequences. J Mol Biol 162(3):705–708

  4. Altschul SF et al (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410

  5. Altschul SF et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402

  6. Ye W et al (2017) H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs. Bioinformatics 33(8):1130–1138

  7. Rangwala H et al (2005) Massively parallel BLAST for the Blue Gene/L. In: High Availability and Performance Workshop

  8. Basic local alignment search tool. https://blast.ncbi.nlm.nih.gov/Blast.cgi/

  9. Zhang J et al (2016) muBLASTP: database-indexed protein sequence search on multicore CPUs. BMC Bioinform 17(1):443

  10. Camacho C et al (2009) BLAST+: architecture and applications. BMC Bioinform 10(1):421

  11. Oehmen CS, Baxter DJ (2013) ScalaBLAST 2.0: rapid and robust BLAST calculations on multiprocessor systems. Bioinformatics 29(6):797–798

  12. Darling AE, Carey L, Feng WC (2003) The design, implementation, and evaluation of mpiBLAST. No. LA-UR-03-2862. Los Alamos National Laboratory

  13. de Castro MR et al (2017) SparkBLAST: scalable BLAST processing using in-memory operations. BMC Bioinform 18(1):318

  14. Matsunaga A, Tsugawa M, Fortes J (2008) CloudBLAST: combining MapReduce and virtualization on distributed resources for bioinformatics applications. In: IEEE Fourth International Conference on eScience, 2008. eScience’08. IEEE

  15. Zhang J, Wang H, Feng W (2017) cuBLASTP: fine-grained parallelization of protein sequence search on CPU+GPU. IEEE/ACM Trans Comput Biol Bioinform 14(4):830–843

  16. Zhang J et al (2014) cuBLASTP: fine-grained parallelization of protein sequence search on a GPU. In: 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE

  17. Xiao S, Lin H, Feng W (2011) Accelerating protein sequence search in a heterogeneous computing system. In: 2011 IEEE International Parallel and Distributed Processing Symposium. IEEE

  18. Vouzis PD, Sahinidis NV (2010) GPU-BLAST: using graphics processors to accelerate protein sequence alignment. Bioinformatics 27(2):182–188

  19. Liu W, Schmidt B, Muller-Wittig W (2011) CUDA-BLASTP: accelerating BLASTP on CUDA-enabled graphics hardware. IEEE/ACM Trans Comput Biol Bioinform 8(6):1678–1684

  20. Zhao K, Chu X (2014) G-BLASTN: accelerating nucleotide alignment by graphics processors. Bioinformatics 30(10):1384–1391

  21. Wan N et al (2009) A preliminary exploration on parallelized BLAST algorithm using GPU. Comput Eng Sci 31:98–112

  22. FSA-BLAST. http://fsa-blast.sourceforge.net/

  23. Cameron M, Williams HE, Cannane A (2006) A deterministic finite automaton for faster protein hit detection in BLAST. J Comput Biol 13(4):965–978

  24. Glasco D (2012) An analysis of BLASTP implementation on NVIDIA GPUs

  25. Zhang J (2000) Protein-length distributions for the three domains of life. Trends Genet 16(3):107–109

  26. CUDA toolkit documentation. https://docs.nvidia.com/cuda/

  27. CUDA GPUs. https://developer.nvidia.com/cuda-gpus

  28. NCBI Genbank. ftp://ftp.ncbi.nlm.nih.gov/genbank/

Acknowledgements

This work was supported in part by a grant from the Ministry of Science and Technology, Taiwan, under Contract No. MOST106-2221-E-018-010.

Author information

Corresponding author

Correspondence to Chao-Chin Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, LT., Wei, KC., Wu, CC. et al. A lightweight BLASTP and its implementation on CUDA GPUs. J Supercomput 77, 322–342 (2021). https://doi.org/10.1007/s11227-020-03267-1

  • DOI: https://doi.org/10.1007/s11227-020-03267-1
