ASW: Accelerating Smith–Waterman Algorithm on Coupled CPU–GPU Architecture

Zou, Huihui; Tang, Shanjiang; Yu, Ce; Fu, Hao; Li, Yusen; Tang, Wenjie

doi:10.1007/s10766-018-0617-3

Published: 01 December 2018

Volume 47, pages 388–402, (2019)

International Journal of Parallel Programming

Huihui Zou¹, Shanjiang Tang¹, Ce Yu¹, Hao Fu¹, Yusen Li² & Wenjie Tang³

Abstract

The Smith–Waterman algorithm (SW) is a popular dynamic programming algorithm widely used in bioinformatics for local biological sequence alignment. Because SW has $O(n^2)$ time and space complexity and biological datasets keep growing, accelerating it is crucial. Given the GPU's high efficiency in scientific computing, many existing studies (e.g., CUDAlign, CUDASW++) speed up SW with GPUs. However, strong data dependencies make SW communication-intensive, and previous works fail to fully exploit the heterogeneous capabilities of a GPU machine, either because they neglect the CPU or because of the low bandwidth of PCI-e. In this paper, we propose ASW, which accelerates SW on an accelerated processing unit (APU), a heterogeneous processor that integrates a CPU and a GPU on a single die sharing the same memory. This coupled CPU–GPU architecture is better suited to frequent data exchange because it eliminates the PCI-e bus. To fully utilize both the CPU and the GPU in an APU system, ASW partitions the whole SW matrix into blocks and dynamically dispatches each block to the CPU or the GPU for concurrent execution. A DAG-based dynamic scheduling method dispatches the workload automatically. We also design a time cost model to determine the partition granularity in the matrix division phase. We evaluated ASW on an AMD A12 platform, and our results show that ASW achieves 7.2 GCUPS (giga cell updates per second).
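The block partitioning and DAG-based dispatch described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The linear gap scoring, the block size, and the names `sw_block`, `sw_blocked`, and `gcups` are all assumptions made for illustration. A single sequential loop stands in for the CPU and GPU workers that would pull ready blocks concurrently on an APU.

```python
from collections import deque

def sw_block(H, a, b, r0, r1, c0, c1, match=2, mismatch=-1, gap=-1):
    """Fill rows r0+1..r1 and columns c0+1..c1 of the score matrix H.

    Uses the Smith-Waterman recurrence with a linear gap penalty
    (an assumed scoring scheme) and returns the block's maximum score.
    """
    best = 0
    for i in range(r0 + 1, r1 + 1):
        for j in range(c0 + 1, c1 + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # diagonal: match or mismatch
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            best = max(best, H[i][j])
    return best

def sw_blocked(a, b, block=4):
    """Score a local alignment by processing the matrix block by block.

    Each block depends on its top and left neighbours (the diagonal
    neighbour is finished before either of them), so the blocks form a DAG.
    A block enters the ready queue once its in-degree drops to zero.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    rows = (n + block - 1) // block
    cols = (m + block - 1) // block
    indeg = {(bi, bj): (bi > 0) + (bj > 0)
             for bi in range(rows) for bj in range(cols)}
    ready = deque([(0, 0)] if indeg else [])
    best, order = 0, []
    while ready:
        # Hypothetical dispatch point: an idle CPU or GPU worker would
        # take the block here; this sketch simply runs it in place.
        bi, bj = ready.popleft()
        r0, c0 = bi * block, bj * block
        best = max(best, sw_block(H, a, b, r0, min(r0 + block, n),
                                  c0, min(c0 + block, m)))
        order.append((bi, bj))
        for nb in ((bi + 1, bj), (bi, bj + 1)):
            if nb in indeg:
                indeg[nb] -= 1
                if indeg[nb] == 0:
                    ready.append(nb)
    return best, order

def gcups(n, m, seconds):
    """Giga cell updates per second for an n-by-m matrix filled in `seconds`."""
    return n * m / seconds / 1e9
```

Calling `sw_blocked("ACACACTA", "AGCACACA", block=3)` returns the same best score for any block size, because the dependency order guarantees that every cell's three neighbours are computed before the cell itself. The block size only changes how much work each dispatch carries, which is the granularity the paper's time cost model is meant to choose.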

References

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990)
Branover, A., Foley, D., Steinman, M.: AMD fusion APU: Llano. IEEE Micro 32(2), 28–37 (2012)
De Oliveira Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., Melo, A.C.M.: CUDAlign 4.0: incremental speculative traceback for exact chromosome-wide alignment in GPU clusters. IEEE Trans. Parallel Distrib. Syst. 27(10), 2838–2850 (2016)
De Oliveira Sandes, E.F., Miranda, G., De Melo, A.C., Martorell, X., Ayguade, E.: CUDAlign 3.0: parallel biological sequence comparison in large GPU clusters. In: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 160–169. IEEE (2014)
He, J., Lu, M., He, B.: Revisiting co-processing for hash joins on the coupled CPU–GPU architecture. Proc. VLDB Endow. 6(10), 889–900 (2013)
He, J., Zhang, S., He, B.: In-cache query co-processing on coupled CPU–GPU architectures. Proc. VLDB Endow. 8(4), 329–340 (2014)
Lipman, D.J., Pearson, W.R.: Rapid and sensitive protein similarity searches. Science 227(4693), 1435–1441 (1985)
Liu, Y., Tran, T.T., Lauenroth, F., Schmidt, B.: SWAPHI-LS: Smith–Waterman algorithm on Xeon Phi coprocessors for long DNA sequences. In: IEEE International Conference on CLUSTER Computing, pp. 257–265 (2014)
Liu, Y., Wirawan, A., Schmidt, B.: CUDASW++ 3.0: accelerating Smith–Waterman protein database search by coupling CPU and GPU SIMD instructions. BMC Bioinform. 14(1), 117 (2013)
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)
Rucci, E., García, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matias, M.: OSWALD: OpenCL Smith–Waterman on Altera’s FPGA for large protein databases. In: 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 3, pp. 208–213. IEEE (2015)
Rucci, E., Garcia, C., Botella, G., Giusti, A.D., Naiouf, M., Prieto-Matias, M.: Accelerating Smith–Waterman alignment of long DNA sequences with OpenCL on FPGA. In: International Conference on Bioinformatics and Biomedical Engineering, pp. 500–511 (2017)
Ryzen APU. https://www.amd.com/en/products/apu/amd-ryzen-5-2400g (2018). Accessed 12 Feb 2018
Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)
Stone, J.E., Gohara, D., Shi, G.: OpenCL: a parallel programming standard for heterogeneous computing systems. Comput. Sci. Eng. 12(3), 66–73 (2010)
Tang, S., He, B., Zhang, S., Niu, Z.: Elastic multi-resource fairness: balancing fairness and efficiency in coupled CPU–GPU architectures. In: SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 875–886. IEEE (2016)
Tang, S., Yu, C., Sun, J., Lee, B.S., Zhang, T., Xu, Z., Wu, H.: EasyPDP: an efficient parallel dynamic programming runtime system for computational biology. IEEE Trans. Parallel Distrib. Syst. 23(5), 862–872 (2012)
Zhang, F., Zhai, J., He, B., Zhang, S., Chen, W.: Understanding co-running behaviors on integrated CPU/GPU architectures. IEEE Trans. Parallel Distrib. Syst. 28(3), 905–918 (2017)
Zhang, K., Hu, J., He, B., Hua, B.: DIDO: dynamic pipelines for in-memory key-value stores on coupled CPU-GPU architectures. In: IEEE International Conference on Data Engineering, pp. 671–682 (2017)
Zhang, F., Wu, B., Zhai, J., He, B., Chen, W.: FinePar: irregularity-aware fine-grained workload partitioning on integrated architectures. In: Proceedings of the 2017 International Symposium on Code Generation and Optimization, pp. 27–38. IEEE Press (2017)

Acknowledgements

This work is supported by the Natural Science Foundation of Jilin Province (CN) (61602336), National Natural Science Foundation of China (61370010, 61702527) and Natural Science Foundation of Tianjin City (18JCZDJC30800).

Author information

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, China
Huihui Zou, Shanjiang Tang, Ce Yu & Hao Fu
School of Computing, Nankai University, Tianjin, China
Yusen Li
College of Systems Engineering, National University of Defense Technology, Changsha, Hunan, China
Wenjie Tang

Corresponding authors

Correspondence to Shanjiang Tang, Ce Yu or Wenjie Tang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zou, H., Tang, S., Yu, C. et al. ASW: Accelerating Smith–Waterman Algorithm on Coupled CPU–GPU Architecture. Int J Parallel Prog 47, 388–402 (2019). https://doi.org/10.1007/s10766-018-0617-3

Received: 24 September 2018
Accepted: 16 November 2018
Published: 01 December 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s10766-018-0617-3
