Abstract
In the era of big data, traditional storage systems face a major challenge in energy consumption. A typical way to reduce energy consumption is to switch storage nodes that are not serving any workload to a low-power state. This method divides the storage nodes into an active group and a low-power group: frequently accessed data are stored in the active group, whose nodes stay in an active state to offer service, while infrequently accessed cold data are stored in the low-power group. The storage nodes in the low-power group are usually called cold nodes, because they can be switched to a low-power state for a certain amount of time to save energy. One often-neglected fact is that the placement of cold data within cold nodes has a significant impact on system performance and power consumption, because switching a storage node from a low-power state to an active state incurs a substantial delay and energy cost. This paper proposes to aggregate correlated cold data and store them in the same cold node within the low-power group. Because correlated data are normally accessed together, this approach can greatly reduce the number of power state transitions and lengthen the idle periods that cold nodes experience, which in turn minimizes performance degradation and power consumption. Experimental results demonstrate that, compared with some state-of-the-art methods, this method effectively reduces energy consumption while maintaining system performance at an acceptable level.
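The idea of co-locating correlated cold data can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the co-access counting, the `min_support` threshold, the least-loaded placement rule, and all function names are assumptions chosen only for illustration. Files that appear together in at least `min_support` access windows are merged into one group, each group is placed wholly on one cold node, and the number of distinct nodes touched per window serves as a rough proxy for power state transitions.

```python
from collections import defaultdict
from itertools import combinations


def co_access_counts(sessions):
    """Count how often each pair of files appears in the same access window."""
    counts = defaultdict(int)
    for window in sessions:
        for a, b in combinations(sorted(set(window)), 2):
            counts[(a, b)] += 1
    return counts


def correlated_groups(sessions, min_support=2):
    """Merge files co-accessed at least min_support times (hypothetical threshold)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Register every file so that uncorrelated files form singleton groups.
    for window in sessions:
        for f in window:
            find(f)
    for (a, b), n in co_access_counts(sessions).items():
        if n >= min_support:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for f in parent:
        groups[find(f)].append(f)
    return sorted(sorted(g) for g in groups.values())


def place_groups(groups, num_cold_nodes):
    """Assign each group wholly to the currently least-loaded cold node."""
    load = [0] * num_cold_nodes
    placement = {}
    for group in sorted(groups, key=len, reverse=True):
        node = load.index(min(load))
        for f in group:
            placement[f] = node
        load[node] += len(group)
    return placement


def wakeups(sessions, placement):
    """Sum of distinct nodes touched per window (proxy for state transitions)."""
    return sum(len({placement[f] for f in window}) for window in sessions)
```

Calling `place_groups(correlated_groups(sessions), n)` on a list of access windows yields a file-to-node mapping whose `wakeups` value can be compared against that of any other placement, for example a round-robin one, on the same trace.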
Notes
Twitter Blog, “Dispatch from the Denver Debate.” http://blog.twitter.com/2012/10/dispatch-from-denver-debate.html.
IBM, “What Is Big Data” http://www-01.ibm.com/software/data/bigdata.
IDC, “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things.” http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm.
References
Hu C, Deng Y (2015) An energy-aware file relocation strategy based on file-access frequency and correlations. In: Proceedings of the 15th International Conference on Algorithms and Architectures for Parallel Processing, Springer, pp 640–653
Scardapane S, Wang D, Panella M (2016) A decentralized training algorithm for echo state networks in distributed big data applications. Neural Netw 78:65–74
Brown R (2008) Report to congress on server and data center energy efficiency: public law 109-431, Lawrence Berkeley National Laboratory
Wan J, Qu X, Wang J, Xie C (2015) ThinRAID: thinning down RAID array for energy conservation. IEEE Trans Parallel Distrib Syst 26(10):2903–2915
Pinheiro E, Bianchini R, Carrera EV, Heath T (2003) Dynamic cluster reconfiguration for power and performance. In: Benini L, Kandemir M, Ramanujam J (eds) Compilers and operating systems for low power. Springer, Berlin, pp 75–93
Thereska E, Donnelly A, Narayanan D (2011) Sierra: practical power-proportionality for data center storage. In: Proceedings of the Sixth Conference on Computer Systems, ACM, pp 169–182
Entrialgo J, Medrano R, García DF, García J (2015) Autonomic power management with self-healing in server clusters under QoS constraints. Computing 98(9):1–24
Maccio VJ, Down DG (2015) On optimal policies for energy-aware servers. Perform Eval 90:36–52
Ferreira AM, Pernici B (2016) Managing the complex data center environment: an integrated energy-aware framework. Computing 96(7):709–749
Chase JS, Anderson DC, Thakar PN, Vahdat AM, Doyle RP (2001) Managing energy and server resources in hosting centers. ACM SIGOPS Oper Syst Rev 35(5):103–116
Krioukov A et al (2011) Napsac: design and implementation of a power-proportional web cluster. ACM SIGCOMM Comput Commun Rev 41(1):102–108
Okamura H, Miyata S, Dohi T (2016) A Markov decision process approach to dynamic power management in a cluster system. IEEE Access 3:3039–3047
Deng Y, Hu Y, Meng X, Zhu Y, Zhang Z, Han J (2014) Predictively booting nodes to minimize performance degradation of a power-aware web cluster. Clust Comput 17(4):1309–1322
Zhang L, Deng Y, Zhu W, Peng J, Wang F (2015) Skewly replicating hot data to construct a power-efficient storage cluster. J Netw Comput Appl 50:168–179
EMC VNX Virtual Provisioning Applied Technology, White Paper, EMC Corporation (2013)
Staelin C, Garcia-Molina H (1990) Clustering active disk data to improve disk performance. Technical Report CSTR-283-90, Department of Computer Science, Princeton University
Cherkasova L, Ciardo G (2000) Characterizing temporal locality and its impact on web server performance. Technical Report HPL-2000-82, Hewlett Packard Laboratories, July 2000
Gomez ME, Santonja V (2002) Characterizing temporal locality in I/O workload. In: Proceedings of the International Symposium on Performance Evaluation of Computer and Telecommunication Systems
Pareto Principle, http://en.wikipedia.org/wiki/Pareto_principle
Narayanan D, Donnelly A, Rowstron A (2008) Write off-loading: practical power management for enterprise storage. ACM Trans Storage 4(3):256–267
Weddle C et al (2007) PARAID: a gear-shifting power-aware RAID. ACM Trans Storage 3(3):13
Mao B et al (2008) GRAID: a green RAID storage architecture with improved energy efficiency and reliability. In: Proceedings of the 16th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2008), IEEE
Bui DM, Nguyen HQ, Yoon Y, Jun S, Amin MB, Lee S (2015) Gaussian process for predicting CPU utilization and its application to energy efficiency. Appl Intell 43(4):874–891
Deng Y (2011) What is the future of disk drives, death or rebirth? ACM Comput Surv 43(3):23
Patterson DA, Gibson G, Katz RH (1988) A case for redundant arrays of inexpensive disks (RAID). In: Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data (SIGMOD ’88), pp 109–116
Tait CD, Duchamp D (1991) Detection and exploitation of file working sets. In: Proceedings of the 11th International Conference on Distributed Computing Systems, pp 2–9
Lei H, Duchamp D (1997) An analytical approach to file prefetching. In: Proceedings of the Annual Conference on USENIX Annual Technical Conference (UATEC ’97)
Kroeger TM, Long DDE (1999) The case for efficient file access pattern modeling. In: Proceedings of the 7th Workshop on Hot Topics in Operating Systems, IEEE, pp 14–19
Kroeger TM, Long DDE (2001) Design and implementation of a predictive file prefetching algorithm. In: Proceedings of the General Track: 2001 USENIX Annual Technical Conference, pp 105–118
Ishii Y, Inaba M, Hiraki K (2011) Access map pattern matching for high performance data cache prefetch. J Instr Level Parallelism 13:1–24
Wu Y, Otagiri K, Watanabe Y, Yokota H (2011) A file search method based on intertask relationships derived from access frequency and rmc operations on files. In: Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA ’11), pp 364–378
He J, Sun XH, Thakur R (2012) Knowac, I/O prefetch via accumulated knowledge. In: Proceedings of the 2012 IEEE International Conference on CLUSTER Computing, pp 429–437
Jiang S, Ding X, Xu Y, Davis K (2013) A prefetching scheme exploiting both data layout and access history on disk. ACM Trans Storage 9(3):1–23
Xia P, Feng D, Jiang H, Tian L, Wang F (2008) FARMER: a novel approach to file access correlations mining and evaluation reference model for optimizing peta-scale file system performance. In: Proceedings of the 17th International Symposium on High Performance Distributed Computing, ACM
Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. ACM SIGMOD Record 22(2):207–216
Iritani M, Yokota H (2012) Effects on performance and energy reduction by file relocation based on file-access correlations. In: Proceedings of the 2012 Joint EDBT/ICDT Workshops (EDBT-ICDT ’12), ACM, pp 79–86
Aye KN, Thein T (2015) A platform for big data analytics on distributed scale-out storage system. Int J Big Data Intell 2(2):127–141
Lin W, Wu W, Wang H, Wang JZ, Hsu CH (2016) Experimental and quantitative analysis of server power model for cloud data centers. Future Gener Comput Syst. https://doi.org/10.1016/j.future.2016.11.034
Sarwesh P et al (2017) Effective integration of reliable routing mechanism and energy efficient node placement technique for low power IoT networks. Int J Grid High Perform Comput 9(4):16–35
Xie J, Deng Y, Min G, Zhou Y (2017) An incrementally scalable and cost-efficient interconnection structure for datacenters. IEEE Trans Parallel Distrib Syst 28(6):1578–1592
Deng Y (2009) Deconstructing network attached storage systems. J Netw Comput Appl 32(5):1064–1072
Li Z, Chen Z, Srinivasan SM, Zhou Y (2004) C-Miner: mining block correlations in storage systems. In: Proceedings of the 3rd USENIX Conference on File and Storage Technologies (FAST ’04), pp 173–186
Acknowledgements
This work is supported in part by the National Natural Science Foundation (NSF) of China under Grant No. 61572232, in part by the Science and Technology Planning Project of Guangzhou under Grant 201604016100, in part by the Science and Technology Planning Project of Nansha (2016CX007), and in part by the Open Research Fund of Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CARCH201705). The corresponding author is Yuhui Deng from Jinan University.
Additional information
A preliminary version of this paper [1] appears in the Proceedings of the 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2015). In this paper, we substantially elaborate on the motivations and related work, expand the relevant algorithms, broaden the architecture analysis, and enrich the experiments.
Cite this article
Hu, C., Deng, Y. Aggregating correlated cold data to minimize the performance degradation and power consumption of cold storage nodes. J Supercomput 75, 662–687 (2019). https://doi.org/10.1007/s11227-018-2366-x
DOI: https://doi.org/10.1007/s11227-018-2366-x