Accelerating Viterbi algorithm on graphics processing units

Hanif, Muhammad Kashif; Zimmermann, Karl-Heinz

doi:10.1007/s00607-017-0557-6

Published: 19 May 2017

Computing, volume 99, pages 1105–1123 (2017)

Muhammad Kashif Hanif¹ &
Karl-Heinz Zimmermann²

Abstract

The Viterbi algorithm is used in a range of scientific applications, including biological sequence alignment, speech recognition, and probabilistic inference. However, its high computational complexity is a major concern. Accelerating the Viterbi algorithm is important, especially when the number of states or the length of the sequences increases significantly. In this paper, a parallel solution that improves the performance of the Viterbi algorithm is presented. This is achieved by formulating a matrix-product-based algorithm, which has been mapped to an NVIDIA graphics processing unit. The performance of different parameters and realizations is compared. The results show that the matrix product is not a viable option for a small number of states. However, for a large number of states, the matrix product solution using shared memory achieves good performance compared with the serial version.
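The abstract does not spell out the matrix product formulation, so the following is only a minimal sketch of one common way to express Viterbi scoring as a matrix product. It assumes log-probabilities and the max-plus (tropical) semiring, where "multiplication" is addition and "addition" is max. The function names (`maxplus_matmul`, `viterbi_score_matrix`, `viterbi_score_serial`) and the pairwise reduction order are illustrative assumptions, not the authors' implementation, and the sketch runs on the CPU rather than on a GPU.

```python
import math

def maxplus_matmul(A, B):
    """Max-plus product: C[i][j] = max over k of (A[i][k] + B[k][j])."""
    inner, cols = len(B), len(B[0])
    return [[max(row[k] + B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]

def log_matrix(M):
    """Element-wise natural log of a matrix of strictly positive probabilities."""
    return [[math.log(x) for x in row] for row in M]

def viterbi_score_matrix(init, trans, emit, obs):
    """Log-probability of the best state path, computed as a max-plus matrix chain."""
    lt, le = log_matrix(trans), log_matrix(emit)
    n = len(init)
    # One matrix per transition: M_t[i][j] = log a_ij + log b_j(o_t).
    mats = [[[lt[i][j] + le[j][o] for j in range(n)] for i in range(n)]
            for o in obs[1:]]
    # The max-plus product is associative, so adjacent matrices can be combined
    # pairwise; each level of this reduction is independent work (assumed here
    # to be the part that a parallel device could exploit).
    while len(mats) > 1:
        paired = [maxplus_matmul(mats[i], mats[i + 1])
                  for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:
            paired.append(mats[-1])  # odd matrix out keeps its position
        mats = paired
    start = [math.log(init[i]) + le[i][obs[0]] for i in range(n)]
    if not mats:
        return max(start)
    return max(start[i] + mats[0][i][j] for i in range(n) for j in range(n))

def viterbi_score_serial(init, trans, emit, obs):
    """Textbook serial recurrence, used only as a reference for comparison."""
    lt, le = log_matrix(trans), log_matrix(emit)
    n = len(init)
    delta = [math.log(init[i]) + le[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        delta = [max(delta[i] + lt[i][j] for i in range(n)) + le[j][o]
                 for j in range(n)]
    return max(delta)
```

Combining the step matrices first costs a cubic number of operations in the number of states per product, whereas the serial recurrence is quadratic per step. The extra work only pays off when it can be spread over many parallel threads, which is consistent with the abstract's remark that the matrix product is not viable for a small number of states.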



Author information

Authors and Affiliations

Department of Computer Science, Government College University, Faisalabad, Pakistan
Muhammad Kashif Hanif
Institute of Embedded Systems, Hamburg University of Technology, 21071, Hamburg, Germany
Karl-Heinz Zimmermann

Corresponding author

Correspondence to Muhammad Kashif Hanif.

About this article

Cite this article

Hanif, M.K., Zimmermann, KH. Accelerating Viterbi algorithm on graphics processing units. Computing 99, 1105–1123 (2017). https://doi.org/10.1007/s00607-017-0557-6

Received: 20 April 2016
Accepted: 10 May 2017
Published: 19 May 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s00607-017-0557-6

Mathematics Subject Classification

68W10
