Abstract
Flash-based solid-state drives (SSDs) are a key component in most computer systems, thanks to their ability to support parallel I/O at sub-millisecond latency and consistently high throughput. At the same time, due to the limitations of the flash media, they perform writes out-of-place, often incurring a high internal overhead referred to as write amplification. Minimizing this overhead has been the focus of numerous studies by the systems research community for more than two decades. The abundance of system-level optimizations for reducing SSD write amplification, which are typically based on experimental evaluation, stands in stark contrast to the lack of theoretical algorithmic results in this problem domain. To bridge this gap, we explore the problem of reducing write amplification from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm. In the online setting, we first consider algorithms that have no prior knowledge about the input and show that in this case, the greedy algorithm is optimal. Then, we design an online algorithm that uses predictions about the input. We show that when predictions are relatively accurate, our algorithm significantly improves over the greedy algorithm. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit improved performance for a wide range of input traces.
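To make the notion of write amplification concrete, the following is a minimal illustrative sketch, not the algorithms of this paper. It assumes that "greedy" means the common garbage-collection policy of erasing the block with the fewest valid pages, and that write amplification is measured as total physical page writes divided by user page writes. The function name `simulate_greedy`, its parameters, and the single-open-block device model are hypothetical simplifications introduced only for this example.

```python
import random


def simulate_greedy(writes, num_blocks, pages_per_block, num_logical):
    """Replay logical page writes and return the write amplification.

    Assumptions (illustrative only): one open block at a time, pages in
    `writes` lie in range(num_logical), and the victim for garbage
    collection is the non-open block with the fewest valid pages.
    """
    # Spare capacity guarantees that some victim is not fully valid.
    assert num_logical < (num_blocks - 1) * pages_per_block
    valid = [set() for _ in range(num_blocks)]  # valid logical pages per block
    fill = [0] * num_blocks                     # programmed pages per block
    where = {}                                  # logical page -> block holding it
    free = list(range(1, num_blocks))           # erased blocks
    open_block = 0
    user_writes = gc_writes = 0

    def program(lpn):
        """Write one page out-of-place into the open block."""
        nonlocal open_block
        if fill[open_block] == pages_per_block:
            open_block = free.pop()
        fill[open_block] += 1
        valid[open_block].add(lpn)
        where[lpn] = open_block

    for lpn in writes:
        if lpn in where:
            valid[where[lpn]].discard(lpn)      # old copy becomes invalid
        while fill[open_block] == pages_per_block and not free:
            # Greedy choice: erase the block with the fewest valid pages.
            victim = min((b for b in range(num_blocks) if b != open_block),
                         key=lambda b: len(valid[b]))
            survivors = sorted(valid[victim])
            valid[victim].clear()
            fill[victim] = 0
            free.append(victim)
            for page in survivors:              # relocation = internal overhead
                program(page)
                gc_writes += 1
        program(lpn)
        user_writes += 1
    return (user_writes + gc_writes) / user_writes


if __name__ == "__main__":
    rng = random.Random(0)
    trace = [rng.randrange(48) for _ in range(5000)]
    print(simulate_greedy(trace, num_blocks=8, pages_per_block=8, num_logical=48))
```

In this toy model, a strictly sequential overwrite pattern always leaves a fully invalid block to erase, so no pages are relocated and the ratio stays at 1. Workloads that mix long-lived and short-lived pages force relocations and push the ratio above 1.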