ABSTRACT
A growing number of energy optimization solutions operate at the application runtime level. Despite delivering promising results, these application-scoped optimizations are fundamentally greedy: they assume exclusive access to power management and often perform poorly when multiple power-managing applications co-exist, or when different threads of the same application share power management hardware. In this paper, we introduce AEQUITAS, a first step toward addressing this critical yet largely overlooked problem. The insight behind AEQUITAS is that co-existing applications may view power-managing hardware as a shared resource and coordinate their power management decisions. As a concrete instance of this philosophy, we evaluate our ideas on top of a state-of-the-art energy-efficient work-stealing runtime. Experiments show that without AEQUITAS, multiple co-existing power-managing application runtimes suffer up to 32% performance loss and negate all power savings. With AEQUITAS, the beneficial energy-performance tradeoff reported in the single-application setting (12.9% energy savings and 2.5% performance loss) is retained in a much more challenging setting, where multiple power-managing runtimes co-exist on parallel architectures and multiple CPU cores share the same power domain.