Abstract
Identifying hot data is important for optimizing modern computer systems. For example, identified hot data can be used to extend the lifespan of flash memory. However, it is challenging to identify hot data effectively with low memory consumption and low runtime overhead. This paper proposes a Hot Data Catcher (HDCat), which identifies hot data in large-scale I/O streams by leveraging enhanced temporal locality. HDCat maintains only a hot data queue and a candidate hot data queue, recording the data access pattern by tracking a limited data set and thus reducing memory consumption. Furthermore, HDCat adopts a D-bit counter and a recency bit to exploit both the frequency and the recency information contained in the data stream. Additionally, HDCat can significantly reduce the number of conversions between hot data and cold data. Real traces are used to evaluate the proposed approach. Experimental results demonstrate that HDCat significantly outperforms the state-of-the-art Multi-hash algorithm and the two-level LRU algorithm.
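The abstract names the building blocks (a hot data queue, a candidate hot data queue, a D-bit counter, and a recency bit) but not how they interact. The following Python sketch shows one plausible way such pieces could fit together. It is not the authors' algorithm: the class name `HotDataSketch`, the promotion rule (promote when the counter saturates), the second-chance eviction, the halving of a demoted counter, and all capacities are assumptions made purely for illustration.

```python
from collections import OrderedDict


class HotDataSketch:
    """Illustrative two-queue hot data tracker (hypothetical, not HDCat itself)."""

    def __init__(self, hot_capacity=4, candidate_capacity=8, counter_bits=2):
        # Each queue maps a block address to [counter, recency_bit].
        self.hot = OrderedDict()
        self.candidate = OrderedDict()
        self.hot_capacity = hot_capacity
        self.candidate_capacity = candidate_capacity
        # A D-bit counter saturates at 2^D - 1 (assumed saturating behavior).
        self.max_count = (1 << counter_bits) - 1

    def _touch(self, queue, lba):
        # Record frequency (counter) and recency (bit), then move to the tail.
        entry = queue[lba]
        entry[0] = min(entry[0] + 1, self.max_count)
        entry[1] = 1
        queue.move_to_end(lba)

    def _evict(self, queue):
        # Assumed second-chance policy: an entry whose recency bit is set
        # gets the bit cleared and is moved to the tail instead of evicted.
        while True:
            lba, entry = next(iter(queue.items()))
            if entry[1]:
                entry[1] = 0
                queue.move_to_end(lba)
            else:
                del queue[lba]
                return lba, entry

    def access(self, lba):
        """Record one access and return True if lba is now considered hot."""
        if lba in self.hot:
            self._touch(self.hot, lba)
            return True
        if lba in self.candidate:
            self._touch(self.candidate, lba)
            if self.candidate[lba][0] >= self.max_count:
                # Assumed promotion rule: counter saturated.
                entry = self.candidate.pop(lba)
                if len(self.hot) >= self.hot_capacity:
                    old_lba, old_entry = self._evict(self.hot)
                    # Demote rather than drop, with a halved counter (assumption).
                    self.candidate[old_lba] = [old_entry[0] // 2, 0]
                self.hot[lba] = entry
                return True
            return False
        # First sighting: enter the candidate queue, evicting if it is full.
        if len(self.candidate) >= self.candidate_capacity:
            self._evict(self.candidate)
        self.candidate[lba] = [1, 1]
        return False
```

In this sketch memory use is bounded by the two queue capacities regardless of the size of the address space, which mirrors the abstract's point about tracking only a limited data set. Demoting an evicted hot entry to the candidate queue, instead of discarding it, is one simple way to damp repeated switching between hot and cold; whether HDCat does this is not stated in the abstract.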
Acknowledgments
This work is supported by the National Natural Science Foundation (NSF) of China under Grants No. 61572232 and No. 61272073, the Key Program of the Natural Science Foundation of Guangdong Province (No. S2013020012865), the Open Research Fund of the Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CARCH201401), the Fundamental Research Funds for the Central Universities, and the Science and Technology Planning Project of Guangdong Province (No. 2013B090200021).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, J., Deng, Y., Huang, Z. (2015). HDCat: Effectively Identifying Hot Data in Large-Scale I/O Streams with Enhanced Temporal Locality. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9529. Springer, Cham. https://doi.org/10.1007/978-3-319-27122-4_9
DOI: https://doi.org/10.1007/978-3-319-27122-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27121-7
Online ISBN: 978-3-319-27122-4
eBook Packages: Computer Science (R0)