Victim Selection Policies for Intel TBB: Overheads and Energy Footprint

  • Conference paper
Architecture of Computing Systems – ARCS 2014 (ARCS 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8350)

Abstract

With the wide adoption of Chip Multiprocessors (CMPs), software developers need to switch to parallel programming to reach the performance potential of CMPs and maximize their energy efficiency. Management overheads due to parallelization can cause sub-linear speedups and increase the energy consumption of parallel programs. In this paper, we investigate the parallelization overheads of Intel TBB with a particular focus on its victim selection policy. We implement an “all knowing” oracle victim selection scheme as well as a pseudo-random scheme and compare them against TBB’s default random selection policy. We also break down TBB’s parallelization overheads and report how basic operations like task spawning, task stealing and task de-queuing impact the energy footprint. Our experiments show that failed task stealing is by far the highest energy consumer. In fact, the oracle victim selection policy can reduce the application energy footprint by 13.6% compared to TBB’s default policy.
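The difference between a random and an "all knowing" victim selection policy can be illustrated with a small model. The sketch below is not TBB's scheduler code and not the authors' implementation: each worker's task deque is reduced to a task count, all function names are hypothetical, and letting the oracle pick the fullest deque is an assumption made purely for illustration. It only shows why a random policy can incur failed steal attempts that an oracle avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical model: a worker's task deque is represented by its task count.
using QueueSizes = std::vector<std::size_t>;

struct StealStats {
    std::size_t attempts = 0;  // total steal attempts
    std::size_t failed = 0;    // attempts that hit an empty deque
};

// Random policy: pick any worker other than the thief, uniformly at random.
// The chosen victim may be empty, in which case the steal fails.
std::size_t pick_random_victim(std::size_t thief, std::size_t workers,
                               std::mt19937& rng) {
    assert(workers >= 2 && thief < workers);
    std::uniform_int_distribution<std::size_t> dist(0, workers - 2);
    std::size_t v = dist(rng);
    return v >= thief ? v + 1 : v;  // shift past the thief's own index
}

// Oracle policy (assumption: prefers the fullest deque). Returns q.size()
// as a sentinel when no other worker has any work left.
std::size_t pick_oracle_victim(std::size_t thief, const QueueSizes& q) {
    std::size_t best = q.size();
    for (std::size_t i = 0; i < q.size(); ++i) {
        if (i == thief || q[i] == 0) continue;
        if (best == q.size() || q[i] > q[best]) best = i;
    }
    return best;
}

// Worker 0 steals one task at a time until every other deque is empty.
StealStats drain_with_random(QueueSizes q, unsigned seed) {
    std::mt19937 rng(seed);
    StealStats s;
    std::size_t remaining = 0;
    for (std::size_t i = 1; i < q.size(); ++i) remaining += q[i];
    while (remaining > 0) {
        std::size_t v = pick_random_victim(0, q.size(), rng);
        ++s.attempts;
        if (q[v] == 0) { ++s.failed; continue; }  // failed steal
        --q[v];
        --remaining;
    }
    return s;
}

StealStats drain_with_oracle(QueueSizes q) {
    StealStats s;
    for (std::size_t v = pick_oracle_victim(0, q); v != q.size();
         v = pick_oracle_victim(0, q)) {
        ++s.attempts;  // the oracle never targets an empty deque
        --q[v];
    }
    return s;
}
```

With an input such as `QueueSizes{0, 0, 3, 0}`, where only one of three potential victims holds work, the oracle needs exactly three attempts, while the random policy needs at least three and typically more. The model says nothing about the energy cost per failed attempt, which is what the paper measures.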


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Iordan, A.C., Jahre, M., Natvig, L. (2014). Victim Selection Policies for Intel TBB: Overheads and Energy Footprint. In: Maehle, E., Römer, K., Karl, W., Tovar, E. (eds) Architecture of Computing Systems – ARCS 2014. ARCS 2014. Lecture Notes in Computer Science, vol 8350. Springer, Cham. https://doi.org/10.1007/978-3-319-04891-8_2

  • DOI: https://doi.org/10.1007/978-3-319-04891-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-04890-1

  • Online ISBN: 978-3-319-04891-8

  • eBook Packages: Computer Science, Computer Science (R0)
