Massively Parallel Sequence Alignment with BLAST Through Work Distribution Implemented Using PCJ Library

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10393)

Abstract

This article presents massively parallel execution of the BLAST algorithm on supercomputers and HPC clusters using thousands of processors. Our approach is based on optimally splitting the set of queries and running them with the unmodified NCBI-BLAST package for sequence alignment. The work distribution and search management are implemented in Java using the PCJ (Parallel Computing in Java) library. The PCJ-BLAST package is responsible for reading the sequences for comparison, splitting them up, and starting multiple NCBI-BLAST executables. We also investigated the problem of parallel I/O and, thanks to the PCJ library, we deliver high-throughput execution of BLAST. The presented results show that, using Java and the PCJ library, we achieved very good performance and efficiency. As a result, we have significantly reduced the time required for sequence analysis. We have also demonstrated that the PCJ library can be used as an efficient tool for fast development of scalable applications.
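
The workflow outlined in the abstract (read the query sequences, split them, start one unmodified BLAST executable per part) might be sketched roughly as follows. This is a minimal illustration, not the PCJ-BLAST implementation: the class `QueryDistribution` and its methods are hypothetical, the round-robin split is an assumed stand-in for the splitting scheme used in the paper, and a plain integer worker index takes the place of a PCJ thread identifier. Only the standard `-query`, `-db`, and `-out` options of the NCBI BLAST+ command line are used, and the file and database names are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of PCJ or PCJ-BLAST. */
final class QueryDistribution {

    /** Splits FASTA text into records; each record begins at a '>' header line. */
    static List<String> parseFasta(String fasta) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : fasta.split("\\R")) {
            if (line.startsWith(">")) {
                // A new header closes the previous record, if any.
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder();
            }
            // Lines before the first header and empty lines are ignored.
            if (current != null && !line.isEmpty()) {
                current.append(line).append('\n');
            }
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    /** Returns the records assigned to one worker (assumed round-robin split). */
    static List<String> shareFor(List<String> records, int workerId, int workerCount) {
        List<String> share = new ArrayList<>();
        for (int i = workerId; i < records.size(); i += workerCount) {
            share.add(records.get(i));
        }
        return share;
    }

    /** Builds the argument list for an unmodified BLAST+ executable. */
    static List<String> blastCommand(String program, String queryFile,
                                     String database, String outputFile) {
        return Arrays.asList(program, "-query", queryFile, "-db", database, "-out", outputFile);
    }

    public static void main(String[] args) {
        String fasta = ">q1\nACGT\n>q2\nGGCC\nTTAA\n>q3\nCCGG\n";
        List<String> records = parseFasta(fasta);
        int workerCount = 2; // assumption: two workers, for demonstration only
        for (int id = 0; id < workerCount; id++) {
            List<String> share = shareFor(records, id, workerCount);
            // Placeholder file and database names; nothing is executed here.
            List<String> command =
                blastCommand("blastn", "part" + id + ".fasta", "nt", "part" + id + ".out");
            System.out.println("worker " + id + ": " + share.size() + " queries, command " + command);
        }
    }
}
```

In a real run, each worker would write its share to its own query file and pass the argument list to `java.lang.ProcessBuilder` to launch the executable. Per-worker input and output files are one simple way to keep the parallel I/O of the workers from colliding.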

Notes

  1. http://www.ncbi.nlm.nih.gov/BLAST [Accessed: June 8, 2017].

  2. http://pcj.icm.edu.pl [Accessed: March 20, 2016].

Acknowledgments

The authors would like to thank the CHIST-ERA consortium for financial support under the HPDCJ project (Polish part funded by NCN grant 2014/14/Z/ST6/00007) and NordForsk for support within the NIASC consortium. The performance tests were performed using the computational facilities of ICM, University of Warsaw.

Author information

Corresponding authors

Correspondence to Marek Nowicki or Piotr Bała.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nowicki, M., Bzhalava, D., Bała, P. (2017). Massively Parallel Sequence Alignment with BLAST Through Work Distribution Implemented Using PCJ Library. In: Ibrahim, S., Choo, KK., Yan, Z., Pedrycz, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science, vol 10393. Springer, Cham. https://doi.org/10.1007/978-3-319-65482-9_36

  • DOI: https://doi.org/10.1007/978-3-319-65482-9_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65481-2

  • Online ISBN: 978-3-319-65482-9

  • eBook Packages: Computer Science, Computer Science (R0)