On reducing energy management delays in disks
Introduction
The positive financial and environmental implications of energy conservation for stand-alone servers and workstations are significant [5], [9], [17], [38]. Consequently, organizations are resorting to overnight shutdown of workstations, unneeded servers, and cooling systems to reduce energy consumption and the associated costs [44]. However, it is not always possible to power down systems where users frequently work remotely and at all hours, e.g., in academic institutions and global businesses. In these environments, dynamic energy management techniques are employed to allow energy optimization during periods of active system use.
Dynamic energy management relies on the fact that even when the system is actively used, there are periods of inactivity that can be exploited to power down devices. It is critical to be able to predict such idle periods accurately [19], [26] to improve energy efficiency. However, even accurate idle period detection and device shutdowns still expose power-on delays (such as disk wake-up delay); this exposure can affect system performance, irritate users, and reduce energy savings because systems now have to operate for longer durations to satisfy user requests. In the worst case, irate users may choose to disable energy management rather than deal with excessive delays. Therefore, the challenge lies in realizing energy savings by keeping the system powered down for as long as possible, yet reducing the performance impact associated with energy management delays, e.g., the response latency on powering the device up when needed.
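To make the trade-off above concrete, the following sketch replays a stream of request arrival times against a simple timeout-based spin-down policy and tallies both the disk energy consumed and the spin-up delay exposed to requests. All power and delay values are hypothetical placeholders, not measurements from this work, and the model deliberately ignores the energy cost of the spin-down/spin-up transitions themselves:

```python
def simulate_timeout_policy(arrivals, timeout, spinup_delay,
                            p_idle=4.0, p_standby=0.5):
    """Replay request arrival times (seconds) against a timeout-based
    spin-down policy. Returns (disk_energy_joules, exposed_delay_seconds).

    Between requests the disk idles at p_idle watts; once a gap exceeds
    `timeout`, the disk is spun down to p_standby watts, and the next
    request is exposed to the full spin-up delay.
    """
    energy, exposed, prev = 0.0, 0.0, 0.0
    for t in arrivals:
        gap = t - prev
        if gap > timeout:
            # Disk idled for `timeout`, then sat in standby for the rest
            # of the gap; the arriving request hits a sleeping disk.
            energy += timeout * p_idle + (gap - timeout) * p_standby
            exposed += spinup_delay
        else:
            energy += gap * p_idle
        prev = t
    return energy, exposed
```

A shorter timeout saves more energy on long gaps but exposes spin-up delay to more requests; this is exactly the tension that motivates serving requests from a surrogate source instead of waking the disk.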
The majority of existing approaches rely on powering on the device to satisfy an arriving request. If the system waits until a request arrives, the delays are fully exposed. Alternatively, accurate prediction of upcoming I/O requests and speculative power-on of devices [11] can reduce the power-on delays exposed in the execution path. However, those approaches still require powering on the device and incur the resulting energy cost. To eliminate the energy overhead and delays associated with powering on the device, we can utilize available surrogate resources to satisfy the incoming request [36].
Disks can be significant energy consumers [51], [45], [42] and require a long time to spin up from a low-power mode. Furthermore, enterprise environments have hundreds of workstations, each containing a hard disk that spins needlessly during periods of inactivity. Therefore, disk energy management practices, e.g., shutting down idle disks [19], are almost ubiquitous in some form or another. In addition to energy management being prevalent, enterprise environments usually consist of similarly configured workstations to simplify deployment and maintenance. As a result, we focus on alleviating disk energy management delays by utilizing existing alternative sources, which are common in enterprise environments, to retrieve data and satisfy I/O requests destined for a disk in low-power mode. Finally, the largest benefit is achieved if the alternative source of data retrieval is the main memory of another system. Even a medium-size enterprise with as few as a hundred machines will have an application simultaneously running on at least two machines more than 99% of the time [46]. Therefore, obtaining data from alternative sources such as remote memories is easily achievable. We must note that sharing a binary across machines poses security risks. Techniques such as self-certifying binaries [34] or secure binary sharing [23] can address these concerns.
Hard disks use a significant share, 13% [25], of system power, making them a prime candidate for energy conservation. However, turning disks off or operating them in low-power mode introduces delays and causes performance degradation. Fortunately, SARD is able to alleviate these delays by utilizing other systems in a p2p environment and improve energy efficiency. SARD can also co-exist with other energy management schemes. Merging SARD with self-learning energy management schemes [20], [41], [49], [24] is not strictly necessary, since SARD already masks the delays; however, the more accurate predictions those schemes provide can further reduce energy consumption while SARD maintains performance.
Traditional energy management schemes such as powering off devices may lead to long delays and ultimately displease end users, particularly in enterprise computing systems. Explicit delay management is critical to a successful energy management scheme, as reducing delay has two benefits. First, the user experience improves because the user sees fewer delays due to disks spinning up. As a result, the user is more likely to keep energy management enabled rather than turn it off to avoid the irritating delays. Second, shorter delays allow the user to accomplish tasks more quickly, which increases the efficiency of the system.
Hiding latency is crucial to enabling a higher rate of adoption of existing power-saving techniques and leads to higher energy savings. As an additional bonus, servicing I/Os from alternate sources provides opportunities for keeping local disks in low-power mode longer, and may reduce energy consumption. We avoid spinning up a disk when a user request arrives, thus hiding spin-up delays from the user, by exploiting the arrangement of local disks/file servers in enterprise environments. We present System-wide Alternative Retrieval of Data (SARD), a scheme to retrieve application binaries from a loosely connected group of workstations. Rather than accessing the local disk or a central file server, SARD serves user requests by accessing the data from a peer-to-peer (p2p) network of workstations. SARD is enabled by the common practice of uniformly configuring enterprise computers, which is done to simplify system deployment and maintenance. Consequently, the application binaries are identical across multiple systems, allowing SARD to satisfy a user request from a peer node.
Our approach shares some similarity to cooperative caching [13], [22]; however, SARD is distinct in its goal of reducing energy management delays. It uses a group of loosely-connected peers and eschews any additional hardware or significant software overhaul. SARD is designed to be minimally invasive to existing systems and: (1) does not require any custom buffering or shared memory infrastructure; (2) does not interfere with energy management of other systems; (3) does not require additional hardware resources; (4) requires few kernel modifications; and (5) allows participants to be loosely coupled and free to leave and join the system. Diskless workstations can offer similar features but are not scalable given the I/O-intensive nature of applications today. Furthermore, SARD exploits standard virtual memory mechanisms and does not require any custom buffering that may increase memory pressure, nor does it require any shared virtual memory; hence, there is no extra overhead on memory systems. Using a p2p-based solution eliminates the requirement of fixed configurations and imposes no peer membership constraints. System management is simple in SARD because individual machines can join and leave the peer group freely. In addition, the p2p infrastructure allows us to select only those peers that will not be adversely affected by serving SARD requests, something alternatives such as LAN multicast [15] cannot support. Thus, we achieve a low-overhead design that effectively helps minimize energy consumption in enterprise environments.
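The core request-routing decision described above can be sketched as follows. The function names, the `peer_index` structure, and the three-way outcome are illustrative assumptions, not SARD's actual interfaces:

```python
def route_request(path, local_disk_spun_up, peer_index):
    """Decide where to serve a binary read from: 'local', 'peer', or 'spinup'.

    peer_index is a hypothetical mapping from a binary's path to the list
    of peers currently advertising that binary as resident in memory.
    """
    if local_disk_spun_up:
        return "local"    # disk already active: nothing to hide
    if peer_index.get(path):
        return "peer"     # serve from remote memory; local disk stays down
    return "spinup"       # no surrogate source: must wake the local disk
```

Only the last case exposes the spin-up delay; uniform enterprise configurations make the middle case the common one for application binaries.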
We evaluate SARD using a trace-driven simulator and an actual implementation in a real system. We observe a 71% average reduction in energy management delays. Compared to the widely-used timeout-based disk energy management scheme, SARD achieves an additional 5.1% average reduction in energy consumption as well.
We note here that SARD is a practical approach to reducing energy management delays in scenarios where hard disks, with their noticeable spin-up delays, are the primary storage devices. Newer storage devices such as flash-based solid-state drives (SSDs) and other non-mechanical storage can potentially reduce or even eliminate spin-up delays. In addition, SSDs consume much less energy than hard disks. In such cases, disk energy management schemes may be rendered redundant. However, for now, the $/GB cost of SSDs remains high enough that we envision mechanical disks continuing as the main storage medium. Thus, techniques such as SARD will be useful and relevant in the foreseeable future.
Section snippets
Energy management is prevalent
With energy costs emerging as a significant fraction of operating budgets, large enterprises are pursuing energy management of workstations and disks. Monitors and computers are powered down during after-work hours. If necessary, remote wake-up technologies such as Wake-on-LAN [2] are employed to power on machines. Keeping machines switched off is the over-arching goal, and management tasks such as software updates and virus checks are scheduled accordingly. Dynamic energy management is
A survey of current disk energy management techniques
In this section, we present a survey of various disk energy management techniques in use today and examine if and how management delays are handled in them. Table 1 provides a summary of our findings with focus on what delays are exposed to the users. We do not claim this survey to be exhaustive; rather, it is meant to be representative of the wide flavor of prevalent energy management techniques and it sheds light on the spectrum of approaches to handling energy management delays.
Design of SARD
SARD is targeted at desktops and workstations in enterprise setups where all machines are owned and controlled centrally and have similar software configurations. Machines in an enterprise environment running SARD are joined together to form a p2p overlay network. This allows the machines to join and leave the network freely and allows for a decentralized design. The SARD module running on each machine advertises the applications running in memory, thus enabling other participants to learn what
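The advertisement mechanism sketched above can be illustrated with a minimal in-memory registry: each peer periodically announces the binaries resident in its memory, and stale entries expire so machines that leave the overlay silently drop out of the index. The class, method names, and TTL-based expiry are illustrative assumptions for this sketch, not SARD's actual protocol:

```python
import time

class PeerRegistry:
    """Minimal sketch of SARD-style advertisements: peers announce the
    binaries held in their memory; unrefreshed entries expire after `ttl`
    seconds, so departed peers fall out of the index automatically."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._entries = {}  # (peer, binary_path) -> last advertisement time

    def advertise(self, peer, binaries, now=None):
        """Record an advertisement from `peer` listing its in-memory binaries."""
        now = time.time() if now is None else now
        for b in binaries:
            self._entries[(peer, b)] = now

    def holders(self, binary, now=None):
        """Peers believed to hold `binary` in memory (unexpired adverts only)."""
        now = time.time() if now is None else now
        return sorted(p for (p, b), t in self._entries.items()
                      if b == binary and now - t <= self.ttl)
```

Soft-state expiry of this kind is one simple way to let participants join and leave freely without explicit membership management, matching the loosely-coupled design goal.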
Evaluation
We evaluate SARD in three ways: implementation based experiments to study the impact on performance, simulation based experiments to study the impact on energy savings for typical desktop applications, and a case study of SARD under real usage conditions.
Unless otherwise noted, the experiments are performed using Dell PCs with Intel 2.4 GHz dual core processor, 4 GB RAM, a high-end Seagate 250 GB hard disk, and connected using 1 Gbps Ethernet.
Remote data retrieval
We have presented one design approach to support remote data retrieval in SARD. Approaches used in projects such as I/O offloading [32] and p2p-based backup storage systems [10] can also provide mechanisms that help achieve the remote serving goals of SARD. These works are complementary to SARD, but SARD differs from them in its primary focus on reducing energy management delays.
Shutdown prediction techniques
Disks have been found to be a significant source of power consumption [51]. Numerous timeout-based techniques [16]
Conclusion
We have presented the design and evaluation of SARD—a p2p based minimally invasive software solution for reducing delays associated with energy management techniques in enterprise environments. Machines are often configured uniformly with identical software binaries and operating system versions in enterprises. In lieu of waking up a powered-down local disk, SARD leverages this uniformity by retrieving requested application binaries from remote nodes where the binary is already in memory. The
Acknowledgments
This material is based upon work supported by the NSF under Grant No: CNS-1016408, CNS-1016793, CCF-0746832, and CNS-1016198.
References (51)
- et al., A self-organizing flock of Condors, Journal of Parallel and Distributed Computing, 2006.
- Y. Agarwal, S. Hodges, R. Chandra, J. Scott, P. Bahl, R. Gupta, Somniloquy: augmenting network interfaces to reduce PC...
- AMD, Magic packet technical white paper, 2008.
- et al., A case for networks of workstations, IEEE Micro, 1995.
- et al., Policy optimization for dynamic power management, IEEE Transactions on Computer-Aided Design, 1999.
- R. Bianchini, R. Rajamony, Power and energy management for server systems, Tech. Rep. DCS-TR-528, Department of...
- et al., The performance impact of kernel prefetching on buffer cache replacement algorithms, IEEE Transactions on Computers, 2007.
- et al., Kosha: a peer-to-peer enhancement for the network file system, Journal of Grid Computing, 2006.
- J. Chase, D. Anderson, P. Thackar, A. Vahdat, R. Boyle, Managing energy and server resources in hosting centers, in:...
- et al., Pastiche: making backup cheap and easy, SIGOPS Operating Systems Review, 2002.
- Multicast routing in datagram internetworks and extended LANs, ACM Transactions on Computer Systems.
- Program counter-based prediction techniques for dynamic power management, IEEE Transactions on Computers.
- Energy dissipation in general purpose microprocessors, IEEE Journal of Solid-State Circuits.
- Using content-derived names for configuration management.
- A predictive system shutdown method for energy saving of event driven computation, ACM Transactions on Design Automation of Electronic Systems.
Krish K.R. is a Ph.D. student in the Computer Science Department at Virginia Polytechnic Institute and State University. He received his M.S. degree from SUNY, Albany and his B.E. degree from Anna University, Chennai. His research interest is in distributed systems, with an emphasis on storage systems' performance, reliability, availability, and energy efficiency.
Guanying Wang is a Ph.D. student in the Computer Science Department at Virginia Tech. He received his B.S. and M.S. degrees from Zhejiang University in Hangzhou, China in 2002 and 2005 respectively. His research interest is in distributed systems and storage systems, with emphasis on performance evaluation and prediction. He was a recipient of Best Paper Award from MASCOTS 2009 conference.
Puranjoy Bhattacharjee received his Masters in Computer Science from Virginia Tech. He works for Amazon.com. He worked on bioinformatics and distributed systems during his graduate school days. His current interests lie in the areas of distributed computing and fault-tolerant systems.
Ali R. Butt is an Associate Professor of CS at Virginia Tech. His research interests are in experimental computer systems, especially in data-intensive high-performance computing (HPC) and the impact of technologies such as massive multi-cores, Cloud Computing, and asymmetric architectures on HPC. Ali is a recipient of the NSF CAREER Award (2008), an IBM Faculty Award (2008), an IBM Shared University Research Award (2009), a Virginia Tech College of Engineering “Outstanding New Assistant Professor” Award (2009), a best paper award (MASCOTS 2009), and a NetApp Faculty Fellowship (2011). Ali was an invited participant (2009, 2012) and an organizer (2010) for the NAE’s US Frontiers of Engineering Symposium. He is a member of USENIX and ASEE, and a Senior Member of ACM and IEEE.
Chris Gniady received his Ph.D. degree in electrical and computer engineering from Purdue University in 2005. He is an associate professor in computer science at the University of Arizona. His research interests include performance and energy optimizations of devices. His research focuses on energy optimizations of computers, network devices, mobile devices and large storage systems. He also received the US National Science Foundation CAREER Award in 2009. He has served on numerous program committees and organized workshops related to energy management. He is a member of USENIX, the ACM, and the IEEE.