Resolving Load Balancing Issues in BWA on NUMA Multicore Architectures

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8385)

Abstract

BWA scales poorly when run in multithreaded mode on a multi-socket server. The current parallelisation strategy does not account for the load imbalance inherent in the data being aligned, e.g. varying read lengths and numbers of mutations. Further load imbalance arises because the BWA code does not anticipate certain hardware characteristics of multi-socket multicores, such as the non-uniform memory access times seen by different cores. We show that rewriting the parallel section using Cilk removes the load imbalance, resulting in a twofold performance improvement over the original BWA.
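To make the load-imbalance argument concrete, here is a minimal sketch in Python. It is not the authors' Cilk code and it does not model NUMA effects. It assumes a static round-robin split of reads over threads, and it approximates dynamic scheduling with a simple greedy rule in which the next read goes to whichever worker becomes idle first; Cilk's actual work-stealing scheduler is more elaborate. The function names and the per-read costs are hypothetical and chosen only for illustration.

```python
import heapq


def static_makespan(costs, workers):
    """Finish time when read i is pinned to worker i % workers (assumed static split)."""
    loads = [0.0] * workers
    for i, cost in enumerate(costs):
        loads[i % workers] += cost
    return max(loads)


def dynamic_makespan(costs, workers):
    """Finish time when the next read always goes to the worker that is idle first.

    This greedy rule is only a stand-in for dynamic load balancing;
    it is not an implementation of Cilk's work stealing.
    """
    finish_times = [0.0] * workers  # min-heap of per-worker finish times
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Hypothetical alignment costs: every fourth read is 8x as expensive as the rest.
costs = [8.0 if i % 4 == 0 else 1.0 for i in range(16)]

print("static :", static_makespan(costs, 4))
print("dynamic:", dynamic_makespan(costs, 4))
```

With these made-up costs, the static split sends every expensive read to the same worker and finishes after 32 time units, while the greedy dynamic assignment finishes after 14. The perfectly balanced lower bound for this input is 44 / 4 = 11.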

Notes

  1. We always refer to the latest version of BWA, i.e. the bwa-0.6.2 download on [2].

  2. This is actually configured via the -t parameter.

  3. Averages for 5 runs. Same distribution of reads across cores for each run.

  4. Graphs omitted due to space restrictions, see our technical report [10].

References

  1. Li, H., Durbin, R.: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14), 1754–1760 (2009)

  2. Burrows-Wheeler Aligner. http://bio-bwa.sourceforge.net/

  3. Leiserson, C.E.: The Cilk++ concurrency platform. J. Supercomput. 51(3), 244–257 (2010). (Kluwer Academic Publishers)

  4. Intel Cilk Plus. http://software.intel.com/en-us/intel-cilk-plus

  5. Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: 41st IEEE Annual Symposium on Foundations of Computer Science, pp. 390–398. IEEE Computer Society, Los Alamitos (2000)

  6. Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L.: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25:1–R25:10 (2009). (Article: R25)

  7. Li, R., Yu, C., et al.: SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25(15), 1966–1967 (2009)

  8. 1000 Genomes Project. http://www.1000genomes.org/

  9. Peters, D., Luo, X., Qiu, K., Liang, P.: Speeding up large-scale next generation sequencing data analysis with pBWA. J. Appl. Bioinform. Comput. Biol. 1(1), 1–6 (2012)

  10. Herzeel, C., Costanza, P., Ashby, T., Wuyts, R.: Performance analysis of BWA alignment. Technical report, ExaScience Life Lab (2013)

Acknowledgments

This work is funded by Intel, by Janssen Pharmaceutica, and by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT).

Author information

Corresponding author

Correspondence to Charlotte Herzeel.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Herzeel, C., Ashby, T.J., Costanza, P., De Meuter, W. (2014). Resolving Load Balancing Issues in BWA on NUMA Multicore Architectures. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8385. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55195-6_21

  • DOI: https://doi.org/10.1007/978-3-642-55195-6_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55194-9

  • Online ISBN: 978-3-642-55195-6

  • eBook Packages: Computer Science, Computer Science (R0)
