Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 294)

Abstract

Pattern discovery is one of the fundamental tasks in bioinformatics, and pattern recognition is a powerful technique for searching for sequence patterns in biological sequence databases. The significant growth in the number of DNA and protein sequences increases the need for higher-performance pattern matching algorithms. For this purpose, heterogeneous architectures can be a good choice due to their potential for high performance and energy efficiency. In this paper we present an efficient implementation of the Aho-Corasick (AC) and PFAC (Parallel Failureless Aho-Corasick) algorithms on a heterogeneous CPU/GPU architecture. We progressively redesigned the algorithms and data structures to fit the GPU architecture. Our results on different protein sequence data sets show a 15% speedup compared to the original implementation of the PFAC algorithm.
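The abstract does not show the GPU code itself, so the following is only a minimal sequential sketch of the failureless idea behind PFAC, not the authors' implementation. It assumes a plain trie with no failure links; every start position in the text plays the role of one GPU thread, which walks the trie and stops at the first missing transition. The names `build_trie` and `pfac_search` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple


def build_trie(patterns: List[str]) -> Tuple[List[Dict[str, int]], Dict[int, str]]:
    """Build a plain trie: goto transitions only, no failure links."""
    goto: List[Dict[str, int]] = [{}]  # goto[state][char] -> next state
    accept: Dict[int, str] = {}        # final state -> pattern ending there
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        accept[state] = pat
    return goto, accept


def pfac_search(text: str, patterns: List[str]) -> List[Tuple[int, str]]:
    """Return (start_index, pattern) for every pattern occurrence in text.

    Assumption: each iteration of the outer loop stands in for one GPU
    thread; here the "threads" simply run one after another on the CPU.
    """
    goto, accept = build_trie(patterns)
    matches: List[Tuple[int, str]] = []
    for start in range(len(text)):
        state = 0
        for pos in range(start, len(text)):
            nxt = goto[state].get(text[pos])
            if nxt is None:
                break  # no valid transition: this "thread" terminates
            state = nxt
            if state in accept:
                matches.append((start, accept[state]))
    return matches


if __name__ == "__main__":
    for start, pat in pfac_search("ACGTACG", ["ACG", "CGT", "GTA"]):
        print(start, pat)
```

Because no iteration depends on another, the outer loop is the part that maps naturally onto parallel threads; the classic AC automaton instead scans the text once and relies on failure links to recover from mismatches.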



Author information


Corresponding author

Correspondence to Shima Soroushnia.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Soroushnia, S., Daneshtalab, M., Plosila, J., Liljeberg, P. (2014). Heterogeneous Parallelization of Aho-Corasick Algorithm. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_19


  • DOI: https://doi.org/10.1007/978-3-319-07581-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07580-8

  • Online ISBN: 978-3-319-07581-5

  • eBook Packages: Engineering, Engineering (R0)
