A survey on AI for storage

Liu, Yu; Wang, Hua; Zhou, Ke; Li, ChunHua; Wu, Rengeng

doi:10.1007/s42514-022-00101-3


Regular Paper
Published: 23 May 2022

Volume 4, pages 233–264 (2022)

CCF Transactions on High Performance Computing

Yu Liu¹, Hua Wang¹, Ke Zhou¹, ChunHua Li¹ & Rengeng Wu¹


Abstract

Storage, as a core function and fundamental component of computers, provides services for saving and reading digital data. The increasing complexity of data operations and storage architectures challenges the performance and reliability of storage services. Advances in artificial intelligence (AI), particularly in intelligent algorithms, show significant promise for resolving storage issues, and how to combine AI with storage has become a hot research topic. In this paper, we present a comprehensive survey of “AI for Storage” and categorize recently published storage research that employs intelligent algorithms into three classes: Architecture-oriented, Data-specific, and Operation & Maintenance. Based on this classification, we further subdivide the studies by application environment and trace their development history, in order to provide guidelines for employing AI technologies in future research. Finally, we present a discussion and outline future work.



References

C., C.A.R., Pâris, J., Vilalta, R., Cheng, A.M.K., Long, D.D.E.: Disk failure prediction in heterogeneous environments. In: International Symposium on Performance Evaluation of Computer and Telecommunication Systems, SPECTS, pp. 1–7. IEEE, Seattle, WA, USA (2017)
Abu-Libdeh, H., Altinbüken, D., Beutel, A., Chi, E.H., Doshi, L., Kraska, T., Li, X., Ly, A., Olston, C.: Learned indexes for a google-scale disk-based database. CoRR abs/2012.12501 (2020)
Agarwal, V., Bhattacharyya, C., Niranjan, T., Susarla, S.: Discovering rules from disk events for predicting hard drive failures. In: International Conference on Machine Learning and Applications, ICMLA, pp. 782–786. IEEE Computer Society, Miami Beach (2009)
Google Scholar
Aken, D.V., Pavlo, A., Gordon, G.J., Zhang, B.: Automatic database management system tuning through large-scale machine learning. In: International Conference on Management of Data, SIGMOD, pp. 1009–1024. ACM, Chicago (2017)
Google Scholar
Alter, J., Xue, J., Dimnaku, A., Smirni, E.: SSD failures in the field: symptoms, causes, and prediction models. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, pp. 75–17514. ACM, Denver (2019)
Google Scholar
Anantharaman, P., Qiao, M., Jadav, D.: Large scale predictive analytics for hard disk remaining useful life estimation. In: International Congress on Big Data, BigData Congress, pp. 251–254. IEEE Computer Society, San Francisco (2018)
Google Scholar
Arzani, B., Ciraci, S., Loo, B.T., Schuster, A., Outhred, G.: Taking the blame game out of data centers operations with netpoirot. In: SIGCOMM, pp. 440–453. ACM, Florianopolis, Brazil (2016)
Aussel, N., Jaulin, S., Gandon, G., Petetin, Y., Fazli, E., Chabridon, S.: Predictive models of hard drive failures based on operational data. In: International Conference on Machine Learning and Applications, pp. 619–625. IEEE, Cancun, Mexico (2017)
Bagbaba, A.: Improving collective I/O performance with machine learning supported auto-tuning. In: International Parallel and Distributed Processing Symposium Workshops, IPDPSW, pp. 814–821. IEEE, New Orleans (2020)
Google Scholar
Basak, S., Sengupta, S., Dubey, A.: Mechanisms for integrated feature normalization and remaining useful life estimation using lstms applied to hard-disks. In: International Conference on Smart Computing, SMARTCOMP, pp. 208–216. IEEE, Washington (2019)
Google Scholar
Baseman, E., DeBardeleben, N., Ferreira, K.B., Levy, S., Raasch, S., Sridharan, V., Siddiqua, T., Guan, Q.: Improving DRAM fault characterization through machine learning. In: 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, DSN Workshops, pp. 250–253. IEEE Computer Society, Toulouse, France (2016)
Google Scholar
Behzad, B., Luu, H.V.T., Huchette, J., Byna, S.: Prabhat, Aydt, R.A., Koziol, Q., Snir, M.: Taming parallel I/O complexity with auto-tuning. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, pp. 68–16812. ACM, Denver (2013)
Google Scholar
Behzad, B., Byna, S., Wild, S.M.: Prabhat, Snir, M.: Dynamic model-driven parallel I/O performance tuning. In: International Conference on Cluster Computing, CLUSTER, pp. 184–193. IEEE Computer Society, Chicago (2015)
Google Scholar
Bei, Z., Yu, Z., Zhang, H., Xiong, W., Xu, C., Eeckhout, L., Feng, S.: RFHOC: A random-forest approach to auto-tuning hadoop’s configuration. IEEE Trans. Parallel Distrib. Syst. 27(5), 1470–1483 (2016)
Article Google Scholar
Berger, D.S.: Towards lightweight and robust machine learning for CDN caching. In: Workshop on Hot Topics in Networks, HotNets, pp. 134–140. ACM, Redmond (2018)
Chapter Google Scholar
Berger, D.S., Sitaraman, R.K., Harchol-Balter, M.: Adaptsize: Orchestrating the hot object memory cache in a content delivery network. In: Symposium on Networked Systems Design and Implementation, NSDI, pp. 483–498. USENIX Association, Boston (2017)
Google Scholar
Beutel, A., Kraska, T., Chi, E., Dean, J., Polyzotis, N.: A machine learning approach to databases indexes. In: ML Systems Workshop, Annual Conference on Neural Information Processing Systems, NIPS, Long Beach, CA, USA (2017)
Bhatia, E., Chacon, G., Pugsley, S.H., Teran, E., Gratz, P.V., Jiménez, D.A.: Perceptron-based prefetch filtering. In: International Symposium on Computer Architecture, ISCA, pp. 1–13. ACM, Phoenix (2019)
Google Scholar
Boixaderas, I., Zivanovic, D., Moré, S., Bartolome, J., Vicente, D., Casas, M., Carpenter, P.M., Radojkovic, P., Ayguadé, E.: Cost-aware prediction of uncorrected DRAM errors in the field. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, p. 61. IEEE/ACM, Virtual Event / Atlanta, Georgia, USA (2020)
Botezatu, M.M., Giurgiu, I., Bogojeska, J., Wiesmann, D.: Predicting disk replacement towards reliable data centers. In: International Conference on Knowledge Discovery and Data Mining, SIGKDD, pp. 39–48. ACM, San Francisco (2016)
Google Scholar
Braam, P.: The lustre storage architecture. CoRR abs/1903.01955 (2019)
Braun, P., Litz, H.: Understanding memory access patterns for prefetching. In: International Workshop on AI-assisted Design for Architecture (AIDArc), Held in Conjunction with ISCA, Phoenix, AZ, USA (2019)
Bux, W., Iliadis, I.: Performance of greedy garbage collection in flash-based solid-state drives. Perform. Eval. 67(11), 1172–1186 (2010)
Article Google Scholar
Cai, Z., Li, W., Zhu, W., Liu, L., Yang, B.: A real-time trace-level root-cause diagnosis system in alibaba datacenters. IEEE Access 7, 142692–142702 (2019)
Article Google Scholar
Cao, Z., Tarasov, V., Tiwari, S., Zadok, E.: Towards better understanding of black-box auto-tuning: A comparative analysis for storage systems. In: 2018 USENIX Annual Technical Conference, USENIX ATC 2018, , July 11-13, 2018, pp. 893–907. USENIX Association, Boston, MA, USA (2018)
Cao, S., Gao, Y., Gao, X., Chen, G.: Adam: An adaptive fine-grained scheme for distributed metadata management. In: International Conference on Parallel Processing, ICPP, pp. 37–13710. ACM, Kyoto (2019)
Google Scholar
Chakrabortti, C., Litz, H.: Learning I/O access patterns to improve prefetching in ssds. In: Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track - European Conference, ECML PKDD, Lecture Notes in Computer Science, vol. 12460, pp. 427–443. Springer, Ghent (2020)
Google Scholar
Chandramouli, B., Prasaad, G., Kossmann, D., Levandoski, J.J., Hunter, J., Barnett, M.: FASTER: A concurrent key-value store with in-place updates. In: International Conference on Management of Data, SIGMOD, pp. 275–290. ACM, Houston (2018)
Google Scholar
Chaves, I.C., de Paula, M.R.P., de,: Moura Leite, L.G., Gomes, J.P.P., Machado, J.C.: Hard disk drive failure prediction method based on A bayesian network. In: International Joint Conference on Neural Networks, IJCNN, pp. 1–7. IEEE, Rio de Janeiro, Brazil (2018)
Chaves, I.C., de Paula, M.R.P., de,: Moura Leite, L.G., Queiroz, L.P., Gomes, J.P.P., Machado, J.C.: Banhfap: A bayesian network based failure prediction approach for hard disk drives. In: Brazilian Conference on Intelligent Systems, BRACIS, pp. 427–432. IEEE Computer Society, Recife, Brazil (2016)
Chen, L., Gao, Y., Li, X., Jensen, C.S., Chen, G.: Efficient metric indexing for similarity search and similarity joins. IEEE Trans. Knowl. Data Eng. 29(3), 556–571 (2017)
Article Google Scholar
Cheng, P., Lu, Y., Du,: Y., Chen, Z., Liu, Y.: Optimizing data placement on hierarchical storage architecture via machine learning. In: Network and Parallel Computing, NPC, Lecture Notes in Computer Science, vol. 11783, pp. 289–302. Springer, Hohhot, China (2019)
Cheng, W., Zhang, K., Chen, H., Jiang, G., Chen, Z., Wang, W.: Ranking causal anomalies via temporal and dynamical analysis on vanishing correlations. In: International Conference on Knowledge Discovery and Data Mining, KDD, pp. 805–814. ACM, San Francisco (2016)
Google Scholar
Cherubini, G., Jelitto, J., Venkatesan, V.: Cognitive storage for big data. Computer 49(4), 43–51 (2016)
Article Google Scholar
Chledowski, J., Polak, A., Szabucki, B., Zolna, K.T.: Robust learning-augmented caching: An experimental study. In: International Conference on Machine Learning, ICML, Proceedings of Machine Learning Research, vol. 139, pp. 1920–1930. PMLR, Virtual Event (2021)
Cohn, D.A., Singh, S.P.: Predicting lifetimes in dynamically allocated memory. In: Advances in Neural Information Processing Systems, NIPS, pp. 939–945. MIT Press, Denver (1996)
Google Scholar
Cortez, E., Bonde, A., Muzio, A., Russinovich, M., Fontoura, M., Bianchini, R.: Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms. In: Symposium on Operating Systems Principles, SOSP, pp. 153–167. ACM, Shanghai (2017)
Chapter Google Scholar
Dai, Y., Xu, Y., Ganesan, A., Alagappan, R., Kroth, B., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H.: From wisckey to bourbon: A learned index for log-structured merge trees. In: Symposium on Operating Systems Design and Implementation, OSDI, pp. 155–171. USENIX Association, Virtual Event (2020)
Davitkova, A., Milchevski, E., Michel, S.: The ml-index: A multidimensional, learned index for point, range, and nearest-neighbor queries. In: International Conference on Extending Database Technology, EDBT, pp. 407–410. OpenProceedings.org, Copenhagen, Denmark (2020)
Ding, J., Minhas, U.F., Yu, J., Wang, C., Do, J., Li, Y., Zhang, H., Chandramouli, B., Gehrke, J., Kossmann, D., Lomet, D.B., Kraska, T.: ALEX: an updatable adaptive learned index. In: International Conference on Management of Data, SIGMOD, pp. 969–984. ACM, Portland (2020)
Google Scholar
dos Santos Lima, F.D.: Pereira, F.L.F., Chaves, I.C., Gomes, J.P.P., de Castro Machado, J.: Evaluation of recurrent neural networks for hard disk drives failure prediction. In: Brazilian Conference on Intelligent Systems, BRACIS, pp. 85–90. IEEE Computer Society, São Paulo (2018)
Google Scholar
Featherstun, R.W., Fulp, E.W.: Using syslog message sequences for predicting disk failures. In: Large Installation System Administration Conference, LISA. USENIX Association, San Jose, CA, USA (2010)
Ferragina, P., Vinciguerra, G.: The pgm-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proc. VLDB Endow. 13(8), 1162–1175 (2020)
Article Google Scholar
Fu, C., Cai, D.: EFANNA : An extremely fast approximate nearest neighbor search algorithm based on knn graph. CoRR abs/1609.07228 (2016) 1609.07228
Gan, Y., Zhang, Y., Hu, K., Cheng, D., He, Y., Pancholi, M., Delimitrou, C.: Seer: Leveraging big data to navigate the complexity of performance debugging in cloud microservices. In: International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, pp. 19–33. ACM, Providence (2019)
Google Scholar
Ganguly, S., Consul, A., Khan, A., Bussone, B., Richards, J., Miguel, A.: A practical approach to hard disk failure prediction in cloud platforms: Big data model for failure management in datacenters. In: International Conference on Big Data Computing Service and Applications, pp. 105–116. IEEE Computer Society, Oxford, United Kingdom (2016)
Gao, J., Yaseen, N., MacDavid, R., Frujeri, F.V., Liu, V., Bianchini, R., Aditya, R., Wang, X., Lee, H., Maltz, D.A., Yu, M., Arzani, B.: Scouts: Improving the diagnosis process through domain-customized incident routing. In: Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, SIGCOMM, pp. 253–269. ACM, Virtual Event, USA (2020)
Gao, X., Zha, S., Li, X., Yan, B., Jing, X., Li, J., Xu, J.: Incremental prediction model of disk failures based on the density metric of edge samples. IEEE Access 7, 114285–114296 (2019)
Article Google Scholar
Gao, Y., Gao, X., Chen, G.: Deephash: An end-to-end learning approach for metadata management in distributed file systems. In: International Conference on Parallel Processing, ICPP, pp. 36–13610. ACM, Kyoto (2019)
Google Scholar
Gheisari, M., Movassagh, A.A., Qin, Y., Yong, J., Tao, X., Zhang, J., Shen, H.: NSSSD: A new semantic hierarchical storage for sensor data. In: International Conference on Computer Supported Cooperative Work in Design, CSCWD, pp. 174–179. IEEE, Nanchang (2016)
Google Scholar
Giurgiu, I., Szabó, J., Wiesmann, D., Bird, J.: Predicting DRAM reliability in the field with machine learning. In: Zhu, X., Roy, I. (eds.) Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference: Industrial Track, pp. 15–21. ACM, Las Vegas, NV, USA (2017)
Guan, Y., Zhang, X., Guo, Z.: CACA: learning-based content-aware cache admission for video content in edge caching. In: International Conference on Multimedia, MM, pp. 456–464. ACM, Nice (2019)
Google Scholar
Hadian, A., Heinis, T.: Shift-table: A low-latency learned index for range queries using model correction. In: International Conference on Extending Database Technology, EDBT, pp. 253–264. OpenProceedings.org, Nicosia, Cyprus (2021)
Hadian, A., Heinis, T.: Considerations for handling updates in learned index structures. In: International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM@SIGMOD, pp. 3–134. ACM, Amsterdam (2019)
Google Scholar
Hamerly, G., Elkan, C.: Bayesian approaches to failure prediction for disk drives. In: International Conference on Machine Learning (ICML, pp. 202–209. Morgan Kaufmann, Williams College, Williamstown, MA, USA (2001)
Hashemi, M., Swersky, K., Smith, J.A., Ayers, G., Litz, H., Chang, J., Kozyrakis, C., Ranganathan, P.: Learning memory access patterns. In: International Conference on Machine Learning, ICML, Proceedings of Machine Learning Research, vol. 80, pp. 1924–1933. PMLR, Stockholmsmässan (2018)
Google Scholar
Herodotou, H., Babu, S.: Profiling, what-if analysis, and cost-based optimization of mapreduce programs. Proc. VLDB Endow. 4(11), 1111–1122 (2011)
Article Google Scholar
Higuchi, S., Takemasa, J., Koizumi, Y., Tagami, A., Hasegawa, T.: Feasibility of longest prefix matching using learned index structures. SIGMETRICS Perform. Eval. Rev. 48(4), 45–48 (2021)
Article Google Scholar
Hu, G., Shao, J., Zhang, D., Yang, Y., Shen, H.T.: Preserving-ignoring transformation based index for approximate k nearest neighbor search. In: International Conference on Data Engineering, ICDE, pp. 91–94. IEEE Computer Society, San Diego (2017)
Google Scholar
Hua, Y., Jiang, H., Zhu, Y., Feng, D., Tian, L.: Smartstore: a new metadata organization paradigm with semantic-awareness for next-generation file systems. In: Conference on High Performance Computing, SC. ACM, Portland, Oregon, USA (2009)
Hua, Y., Jiang, H., Feng, D.: FAST: near real-time searchable data analytics for the cloud. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, pp. 754–765. IEEE Computer Society, New Orleans (2014)
Chapter Google Scholar
Huang, S., Fu, S., Zhang, Q., Shi, W.: Characterizing disk failures with quantified disk degradation signatures: An early experience. In: International Symposium on Workload Characterization, IISWC, pp. 150–159. IEEE Computer Society, Atlanta (2015)
Google Scholar
Indyk, P., Motwani, R., Raghavan, P., Vempala, S.S.: Locality-preserving hashing in multidimensional spaces. In: Symposium on the Theory of Computing, STOC, pp. 618–625. ACM, El Paso (1997)
Google Scholar
Jain, R., Panda, P.R., Subramoney, S.: A coordinated multi-agent reinforcement learning approach to multi-level cache co-partitioning. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 800–805. IEEE, Lausanne (2017)
Google Scholar
Ji, X., Ma, Y., Ma, R., Li, P., Ma, J., Wang, G., Liu, X., Li, Z.: A proactive fault tolerance scheme for large scale storage systems. In: International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP, Lecture Notes in Computer Science, vol. 9530, pp. 337–350. Springer, Zhangjiajie (2015)
Google Scholar
Jiang, T., Zeng, J., Zhou, K., Huang, P., Yang, T.: Lifelong disk failure prediction via gan-based anomaly detection. In: International Conference on Computer Design, ICCD, pp. 199–207. IEEE, Abu Dhabi (2019)
Google Scholar
Jiang, T., Huang, P., Zhou, K.: Scrub unleveling: Achieving high data reliability at low scrubbing cost. In: Teich, J., Fummi, F. (eds.) Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 1403–1408. IEEE, Florence (2019)
Google Scholar
Jiménez, D.A., Teran, E.: Multiperspective reuse prediction. In: International Symposium on Microarchitecture, MICRO, pp. 436–448. ACM, Cambridge (2017)
Google Scholar
Jin, X., Agun, D., Yang, T., Wu, Q., Shen, Y., Zhao, S.: Hybrid indexing for versioned document search with cluster-based retrieval. In: International Conference on Information and Knowledge Management, CIKM, pp. 377–386. ACM, Indianapolis (2016)
Google Scholar
Kang, W., Yoo, S.: Dynamic management of key states for reinforcement learning-assisted garbage collection to reduce long tail latency in SSD. In: Design Automation Conference, DAC, pp. 8–186. ACM, San Francisco (2018)
Google Scholar
Kang, W., Yoo, S.: \(q\) -value prediction for reinforcement learning assisted garbage collection to reduce long tail latency in SSD. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst 39(10), 2240–2253 (2020)
Article Google Scholar
Kang, W., Shin, D., Yoo, S.: Reinforcement learning-assisted garbage collection to mitigate long-tail latency in SSD. ACM Trans. Embed. Comput. Syst. 16(5s), 134–113420 (2017)
Article Google Scholar
Kim, M., Lee, S.: Reducing tail latency of dnn-based recommender systems using in-storage processing. In: SIGOPS Asia-Pacific Workshop on Systems, pp. 90–97. ACM, Tsukuba, Japan (2020)
Kim, J.: An ftl-aware host system alleviating severe long latency of NAND flash-based storage. In: International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pp. 189–194. IEEE, Houston (2021)
Google Scholar
Kim, M., Sumbaly, R., Shah, S.: Root cause detection in a service-oriented architecture. In: International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS, pp. 93–104. ACM, Pittsburgh (2013)
Google Scholar
Kim, Y., More, A., Shriver, E., Rosing, T.: Application performance prediction and optimization under cache allocation technology. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 1285–1288. IEEE, Florence (2019)
Google Scholar
Kipf, A., Marcus, R., van Renen, A., Stoian, M., Kemper, A., Kraska, T., Neumann, T.: Radixspline: a single-pass learned index. In: Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM@SIGMOD, pp. 5–155. ACM, Portland (2020)
Google Scholar
Kirilin, V., Sundarrajan, A., Gorinsky, S., Sitaraman, R.K.: Rl-cache: Learning-based cache admission for content delivery. In: Proceedings of the 2019 Workshop on Network Meets AI & ML, NetAI@SIGCOMM 2019, pp. 57–63. ACM, Beijing, China (2019)
Kirilin, V., Sundarrajan, A., Gorinsky, S., Sitaraman, R.K.: Rl-cache: Learning-based cache admission for content delivery. IEEE J. Sel. Areas Commun. 38(10), 2372–2385 (2020)
Article Google Scholar
Klein, K., Kriege, N.M., Mutzel, P.: Ct-index: Fingerprint-based graph indexing combining cycles and trees. In: International Conference on Data Engineering, ICDE, pp. 1115–1126. IEEE Computer Society, Hannover (2011)
Google Scholar
Kraska, T., Alizadeh, M., Beutel, A., Chi, E.H., Kristo, A., Leclerc, G., Madden, S., Mao, H., Nathan, V.: Sagedb: A learned database system. In: Biennial Conference on Innovative Data Systems Research, CIDR. www.cidrdb.org, Asilomar, CA, USA (2019)
Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: International Conference on Management of Data, SIGMOD, pp. 489–504. ACM, Houston (2018)
Google Scholar
Leuoth, S., Benn, W.: A self-adaptive insert strategy for content-based multidimensional database storage. In: GI-Workshop on Foundations of Databases (Grundlagen Von Datenbanken). Preprints aus dem Institut für Informatik, vol. CS-02-09, pp. 75–79. Universität Rostock, Mecklenburg-Vorpommern, Germany (2009)
Leuoth, S., Benn, W.: Towards SISI - a self adaptive insert strategy for the intelligent cluster index (icix). In: Machine Learning and Data Mining in Pattern Recognition, MLDM, pp. 141–155. ibai Publishing, Leipzig, Germany (2009)
Li, P., Hua, Y., Zuo, P., Jia, J.: A scalable learned index scheme in storage systems. CoRR abs/1905.06256 (2019) 1905.06256
Li, J., Ji, X., Jia, Y., Zhu, B., Wang, G., Li, Z., Liu, X.: Hard drive failure prediction using classification and regression trees. In: International Conference on Dependable Systems and Networks, DSN, pp. 383–394. IEEE Computer Society, Atlanta (2014)
Google Scholar
Li, J., Stones, R.J., Wang, G., Li, Z., Liu, X., Xiao, K.: Being accurate is not enough: New metrics for disk failure prediction. In: Symposium on Reliable Distributed Systems, SRDS, pp. 71–80. IEEE Computer Society, Budapest (2016)
Google Scholar
Li, Y., Chang, K., Bel, O., Miller, E.L., Long, D.D.E.: CAPES: unsupervised storage performance tuning using neural network-based deep reinforcement learning. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, pp. 42–14214. ACM, Denver (2017)
Google Scholar
Li, J., Stones, R.J., Wang, G., Liu, X., Li, Z., Xu, M.: Hard drive failure prediction using decision trees. Reliab. Eng. Syst. Saf. 164, 55–65 (2017)
Article Google Scholar
Li, Z.L., Liang, C.M., He, W., Zhu, L., Dai, W., Jiang, J., Sun, G.: Metis: Robustly tuning tail latencies of cloud systems. In: Annual Technical Conference, USENIX ATC, pp. 981–992. USENIX Association, Boston (2018)
Google Scholar
Li, G., Zhou, X., Li, S., Gao, B.: Qtune: A query-aware database tuning system with deep reinforcement learning. Proc. VLDB Endow. 12(12), 2118–2130 (2019)
Article Google Scholar
Li, P., Lu, H., Zheng, Q., Yang, L., Pan, G.: LISA: A learned index structure for spatial data. In: Maier, D., Pottinger, R., Doan, A., Tan, W., Alawini, A., Ngo, H.Q. (eds.) International Conference on Management of Data, SIGMOD, pp. 2119–2133. ACM, Portland (2020)
Google Scholar
Li, C., Wang, Y., Liu, C., Liang, S., Li, H., Li, X.: GLIST: towards in-storage graph learning. In: Annual Technical Conference, ATC, pp. 225–238. USENIX Association, Ho Chi Minh City (2021)
Google Scholar
Liang, S., Wang, Y., Lu, Y., Yang, Z., Li, H., Li, X.: Cognitive SSD: A deep learning engine for in-storage data retrieval. In: Annual Technical Conference, ATC, pp. 395–410. USENIX Association, Renton (2019)
Google Scholar
Lin, W., Ma, M., Pan, D., Wang, P.: Facgraph: Frequent anomaly correlation graph mining for root cause diagnose in micro-service architecture. In: International Performance Computing and Communications Conference, IPCCC, pp. 1–8. IEEE, Orlando (2018)
Google Scholar
Liu, J., Wang, R., Gao, X., Yang, X., Chen, G.: Anglecut: A ring-based hashing scheme for distributed metadata management. In: Database Systems for Advanced Applications, DASFAA, Lecture Notes in Computer Science, vol. 10177, pp. 71–86. Springer, Suzhou (2017)
Chapter Google Scholar
Liu, Y., Song, J., Zhou, K., Yan, L., Liu, L., Zou, F., Shao, L.: Deep self-taught hashing for image retrieval. IEEE Trans. Cybern. 49(6), 2229–2241 (2019)
Article Google Scholar
Liu, P., Chen, Y., Nie, X., Zhu, J., Zhang, S., Sui, K., Zhang, M., Pei, D.: Fluxrank: A widely-deployable framework to automatically localizing root cause machines for software service failure mitigation. In: International Symposium on Software Reliability Engineering, ISSRE, pp. 35–46. IEEE, Berlin (2019)
Google Scholar
Liu, Y., Jiang, H., Wang, Y., Zhou, K., Liu, Y., Liu, L.: Content sifting storage: Achieving fast read for large-scale image dataset analysis. In: Design Automation Conference, DAC, pp. 1–6. IEEE, San Francisco (2020)
Google Scholar
Liu, Y., Wang, Y., Song, J., Guo, C., Zhou, K., Xiao, Z.: Deep self-taught graph embedding hashing with pseudo labels for image retrieval. In: International Conference on Multimedia and Expo, ICME, pp. 1–6. IEEE, London (2020)
Google Scholar
Liu, W., Cui, J., Liu, J., Yang, L.T.: Mlcache: A space-efficient cache scheme based on reuse distance and machine learning for nvme ssds. In: International Conference On Computer Aided Design, ICCAD, pp. 58–1589. IEEE, San Diego (2020)
Google Scholar
Liu, P., Xu, H., Ouyang, Q., Jiao, R., Chen, Z., Zhang, S., Yang, J., Mo, L., Zeng, J., Xue, W., Pei, D.: Unsupervised detection of microservice trace anomalies through service-level deep bayesian networks. In: International Symposium on Software Reliability Engineering, ISSRE, pp. 48–58. IEEE, Coimbra (2020)
Google Scholar
Lu, S., Luo, B., Patel, T., Yao, Y., Tiwari, D., Shi, W.: Making disk failure predictions smarter! In: Conference on File and Storage Technologies, FAST, pp. 151–167. USENIX Association, Santa Clara, CA, USA (2020)
Luaces, D., Viqueira, J.R.R., Pena, T.F., Cotos, J.M.: Leveraging bitmap indexing for subgraph searching. In: International Conference on Extending Database Technology, EDBT, pp. 49–60. OpenProceedings.org, Lisbon, Portugal (2019)
Luo, C., Zhao, P., Qiao, B., Wu, Y., Zhang, H., Wu, W., Lu, W., Dang, Y., Rajmohan, S., Lin, Q., Zhang, D.: NTAM: neighborhood-temporal attention model for disk failure prediction in cloud platforms. In: The Web Conference, WWW, pp. 1181–1191. ACM / IW3C2, Virtual Event / Ljubljana, Slovenia (2021)
Luo, C., Lou, J., Lin, Q., Fu, Q., Ding, R., Zhang, D., Wang, Z.: Correlating events with time series for incident diagnosis. In: International Conference on Knowledge Discovery and Data Mining, KDD, pp. 1583–1592. ACM, New York (2014)
Google Scholar
Luo, Q., Fang, X., Sun, Y., Ai, J., Yang, C.: Self-learning hot data prediction: Where echo state network meets NAND flash memories. IEEE Trans. Circuits Syst. I Regul. Pap. 67(I(3)), 939–950 (2020)
Article Google Scholar
Lykouris, T., Vassilvitskii, S.: Competitive caching with machine learned advice. In: International Conference on Machine Learning, ICML, Proceedings of Machine Learning Research, vol. 80, pp. 3302–3311. PMLR, Stockholmsmässan (2018)
Google Scholar
Ma, M., Xu, J., Wang, Y., Chen, P., Zhang, Z., Wang, P.: Automap: Diagnose your microservice-based web applications automatically. In: The Web Conference, WWW, pp. 246–258. ACM / IW3C2, Taipei, Taiwan (2020)
Ma, M., Zhang, S., Chen, J., Xu, J., Li, H., Lin, Y., Nie, X., Zhou, B., Wang, Y., Pei, D.: Jump-starting multivariate time series anomaly detection for online service systems. In: Annual Technical Conference, ATC, pp. 413–426. USENIX Association, Virtual Event (2021)
Ma, M., Lin, W., Pan, D., Wang, P.: Ms-rank: Multi-metric and self-adaptive root cause diagnosis for microservice applications. In: International Conference on Web Services, ICWS, pp. 60–67. IEEE, Milan (2019)
Google Scholar
Maas, M., Andersen, D.G., Isard, M., Javanmard, M.M., McKinley, K.S., Raffel, C.: Learning-based memory allocation for C++ server workloads. In: Architectural Support for Programming Languages and Operating Systems, ASPLOS, pp. 541–556. ACM, Lausanne (2020)
Google Scholar
Mahdisoltani, F., Stefanovici, I.A., Schroeder, B.: Proactive error prediction to improve storage system reliability. In: Silva, D.D., Ford, B. (eds.) Annual Technical Conference, ATC, pp. 391–402. USENIX Association, Santa Clara (2017)
Google Scholar
Mailthody, V.S., Qureshi, Z., Liang, W., Feng, Z., Gonzalo, S.G.D., Li, Y., Franke, H., Xiong, J., Huang, J., Hwu, W.: Deepstore: In-storage acceleration for intelligent queries. In: International Symposium on Microarchitecture, MICRO, pp. 224–238. ACM, Columbus (2019)
Google Scholar
Marcus, R., Kipf, A., van Renen, A., Stoian, M., Misra, S., Kemper, A., Neumann, T., Kraska, T.: Benchmarking learned indexes. Proc. VLDB Endow. 14(1), 1–13 (2020)
Article Google Scholar
Meng, Y., Zhang, S., Sun, Y., Zhang, R., Hu, Z., Zhang, Y., Jia, C., Wang, Z., Pei, D.: Localizing failure root causes in a microservice through causality inference. In: International Symposium on Quality of Service, IWQoS, pp. 1–10. IEEE, Hangzhou (2020)
Google Scholar
Mishra, M., Singhal, R.: RUSLI: real-time updatable spline learned index. In: Bordawekar, R., Amsterdamer, Y., Shmueli, O., Tatbul, N. (eds.) Workshop in Exploiting AI Techniques for Data Management, aiDM, pp. 1–8. ACM, Virtual Event, China (2021)
Monjalet, F., Leibovici, T.: Predicting file lifetimes with machine learning. In: High Performance Computing - ISC High Performance 2019 International Workshops, Lecture Notes in Computer Science, vol. 11887, pp. 288–299. Springer, Frankfurt (2019)
Google Scholar
Mukhanov, L., Tovletoglou, K., Vandierendonck, H., Nikolopoulos, D.S., Karakonstantis, G.: Workload-aware DRAM error prediction using machine learning. In: International Symposium on Workload Characterization, IISWC, pp. 106–118. IEEE, Orlando (2019)
Google Scholar
Murray, J.F., Hughes, G.F., Kreutz-Delgado, K.: Hard drive failure prediction using non-parametric statistical methods. In: ICANN/ICONIP (2003)
Murray, J.F., Hughes, G.F., Kreutz-Delgado, K.: Machine learning methods for predicting failures in hard drives: a multiple-instance application. J. Mach. Learn. Res. 6, 783–816 (2005)
MathSciNet MATH Google Scholar
Narayanan, I., Wang, D., Jeon, M., Sharma, B., Caulfield, L., Sivasubramaniam, A., Cutler, B., Liu, J., Khessib, B.M., Vaid, K.: SSD failures in datacenters: What, when and why? In: SIGMETRICS, pp. 407–408. ACM, Antibes Juan-Les-Pins, France (2016)
Narayanan, A., Verma, S., Ramadan, E., Babaie, P., Zhang, Z.: Deepcache: A deep learning based framework for content caching. In: Workshop on Network Meets AI & ML, NetAI@SIGCOMM, pp. 48–53. ACM, Budapest (2018)
Google Scholar
Nathan, V., Ding, J., Alizadeh, M., Kraska, T.: Learning multi-dimensional indexes. In: International Conference on Management of Data, SIGMOD, pp. 985–1000. ACM, Portland (2020)
Neubert, R., Görlitz, O., Benn, W.: Towards content-related indexing in databases. In: Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), Informatik Aktuell, pp. 305–321. Springer, GI-Fachtagung (2001)
Ni, J., Cheng, W., Zhang, K., Song, D., Yan, T., Chen, H., Zhang, X.: Ranking causal anomalies by modeling local propagations on networked systems. In: International Conference on Data Mining, ICDM, pp. 1003–1008. IEEE Computer Society, New Orleans (2017)
Pang, S., Jia, Y., Stones, R.J., Wang, G., Liu, X.: A combined bayesian network method for predicting drive failure times from SMART attributes. In: International Joint Conference on Neural Networks, IJCNN, pp. 4850–4856. IEEE, Vancouver (2016)
Park, N., Ahmad, I., Lilja, D.J.: Romano: autonomous storage management using performance prediction in multi-tenant datacenters. In: Symposium on Cloud Computing, SOCC, p. 21. ACM, San Jose, CA, USA (2012)
Park, J.K., Kim, J.: A method for reducing garbage collection overhead of SSD using machine learning algorithms. In: International Conference on Information and Communication Technology Convergence, ICTC, pp. 775–777. IEEE, Jeju Island (2017)
Park, S., Kim, D., Bang, K., Lee, H., Yoo, S., Chung, E.: An adaptive idle-time exploiting method for low latency NAND flash-based storage devices. IEEE Trans. Comput. 63(5), 1085–1096 (2014)
Paschos, G.S., Destounis, A., Vigneri, L., Iosifidis, G.: Learning to cache with no regrets. In: Conference on Computer Communications, INFOCOM, pp. 235–243. IEEE, Paris (2019)
Peled, L., Mannor, S., Weiser, U.C., Etsion, Y.: Semantic locality and context-based prefetching using reinforcement learning. In: International Symposium on Computer Architecture, ISCA, pp. 285–297. ACM, Portland (2015)
Peled, L., Weiser, U.C., Etsion, Y.: A neural network prefetcher for arbitrary memory access patterns. ACM Trans. Archit. Code Optim. 16(4), 37:1–37:27 (2020)
Pereira, F.L.F., dos Santos Lima, F.D., de Moura Leite, L.G., Gomes, J.P.P., de Castro Machado, J.: Transfer learning for bayesian networks with application on hard disk drives failure prediction. In: Brazilian Conference on Intelligent Systems, BRACIS, pp. 228–233. IEEE Computer Society, Uberlândia, Brazil (2017)
Pereira, F., Teixeira, D., Gomes, J.P., Machado, J.C.: Evaluating one-class classifiers for fault detection in hard disk drives. In: Brazilian Conference on Intelligent Systems, BRACIS, pp. 586–591. IEEE, Salvador (2019)
Pham, C., Wang, L., Tak, B., Baset, S., Tang, C., Kalbarczyk, Z.T., Iyer, R.K.: Failure diagnosis for distributed systems using targeted fault injection. IEEE Trans. Parallel Distrib. Syst. 28(2), 503–516 (2017)
Pitakrat, T., van Hoorn, A., Grunske, L.: A comparison of machine learning algorithms for proactive hard disk drive failure detection. In: International ACM Sigsoft Symposium on Architecting Critical Systems, ISARCS, pp. 1–10. ACM, Vancouver (2013)
Poppe, O., Amuneke, T., Banda, D., De, A., Green, A., Knoertzer, M., Nosakhare, E., Rajendran, K., Shankargouda, D., Wang, M., Au, A., Curino, C., Guo, Q., Jindal, A., Kalhan, A., Oslake, M., Parchani, S., Ramani, V., Sellappan, R., Sen, S., Shrotri, S., Srinivasan, S., Xia, P., Xu, S., Yang, A., Zhu, Y.: Seagull: An infrastructure for load prediction and optimized resource allocation. Proc. VLDB Endow. 14(2), 154–162 (2020)
Prats, D.B., Portella, F.A., Costa, C.H.A., Berral, J.L.: You only run once: Spark auto-tuning from a single run. IEEE Trans. Netw. Serv. Manag. 17(4), 2039–2051 (2020)
Qiu, J., Du, Q., Yin, K., Zhang, S.-L., Qian, C.: A causality mining and knowledge graph based method of root cause diagnosis for performance anomaly in cloud applications. Appl. Sci. 10(6), 2166 (2020)
Queiroz, L.P., Gomes, J.P.P., Rodrigues, F.C.M., Brito, F.T., Chaves, I.C., de Moura Leite, L.G., Machado, J.C.: Fault detection in hard disk drives based on a semi parametric model and statistical estimators. New Gen. Comput. 36(1), 5–19 (2018)
Rahman, S., Burtscher, M., Zong, Z., Qasem, A.: Maximizing hardware prefetch effectiveness with machine learning. In: International Conference on High Performance Computing and Communications, HPCC, International Symposium on Cyberspace Safety and Security, CSS, International Conference on Embedded Software and Systems, ICESS, pp. 383–389. IEEE, New York (2015)
Ravandi, B., Papapanagiotou, I.: A self-organized resource provisioning for cloud block storage. Future Gen. Comput. Syst. 89, 765–776 (2018)
Ren, J., Chen, X., Liu, D., Tan, Y., Duan, M., Li, R., Liang, L.: A machine learning assisted data placement mechanism for hybrid storage systems. J. Syst. Archit. 120, 102295 (2021)
Rodriguez, L.V., Yusuf, F.B., Lyons, S., Paz, E., Rangaswami, R., Liu, J., Zhao, M., Narasimhan, G.: Learning cache replacement with CACHEUS. In: Conference on File and Storage Technologies, FAST, pp. 341–354. USENIX Association, Virtual Event (2021)
Sethumurugan, S., Yin, J., Sartori, J.: Designing a cost-effective cache replacement policy using machine learning. In: International Symposium on High-Performance Computer Architecture, HPCA, pp. 291–303. IEEE, Seoul (2021)
Shen, J., Wan, J., Lim, S., Yu, L.: Random-forest-based failure prediction for hard disk drives. Int. J. Distrib. Sens. Netw. 14(11) (2018)
Shi, H., Arumugam, R.V., Foh, C.H., Khaing, K.K.: Optimal disk storage allocation for multi-tier storage system. In: 2012 Digest APMRC, pp. 1–7 (2012)
Shi, W., Cheng, P., Zhu, C., Chen, Z.: An intelligent data placement strategy for hierarchical storage systems. In: International Conference on Computer and Communications (ICCC), pp. 2023–2027. IEEE (2020)
Shi, Z., Jain, A., Swersky, K., Hashemi, M., Ranganathan, P., Lin, C.: A hierarchical neural model of data prefetching. In: International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, pp. 861–873. ACM, Virtual Event, USA (2021)
Shi, Z., Huang, X., Jain, A., Lin, C.: Applying deep learning to the cache replacement problem. In: International Symposium on Microarchitecture, MICRO, pp. 413–425. ACM, Columbus (2019)
Song, Z., Berger, D.S., Li, K., Lloyd, W.: Learning relaxed belady for content distribution network caching. In: Symposium on Networked Systems Design and Implementation, NSDI, pp. 529–544. USENIX Association, Santa Clara (2020)
Spector, B., Kipf, A., Vaidya, K., Wang, C., Minhas, U.F., Kraska, T.: Bounding the last mile: Efficient learned string indexing. CoRR abs/2111.14905 (2021)
Srivastava, A., Lazaris, A., Brooks, B., Kannan, R., Prasanna, V.K.: Predicting memory accesses: the road to compact ml-driven prefetcher. In: International Symposium on Memory Systems, MEMSYS, pp. 461–470. ACM, Washington (2019)
Stoian, M., Kipf, A., Marcus, R., Kraska, T.: Plex: Towards practical learned indexing. (2021) arXiv preprint arXiv:2108.05117
Subedi, P., Davis, P.E., Duan, S., Klasky, S., Kolla, H., Parashar, M.: Stacker: an autonomic data movement engine for extreme-scale data staging-based in-situ workflows. In: International Conference for High Performance Computing, Networking, Storage, and Analysis, SC, pp. 73:1–73:11. IEEE / ACM, Dallas (2018)
Sun, X., Chakrabarty, K., Huang, R., Chen, Y., Zhao, B., Cao, H., Han, Y., Liang, X., Jiang, L.: System-level hardware failure prediction using deep learning. In: Design Automation Conference, DAC, p. 20. ACM, Las Vegas, NV, USA (2019)
Sun, Q., Jin, T., Romanus, M., Bui, H., Zhang, F., Yu, H., Kolla, H., Klasky, S., Chen, J., Parashar, M.: Adaptive data placement for staging-based coupled scientific workflows. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC, pp. 65:1–65:12. ACM, Austin (2015)
Sun, Y., Zhao, Y., Su, Y., Liu, D., Nie, X., Meng, Y., Cheng, S., Pei, D., Zhang, S., Qu, X., Guo, X.: Hotspot: Anomaly localization for additive kpis with multi-dimensional attributes. IEEE Access 6, 10909–10923 (2018)
Tang, C., Dong, Z., Wang, M., Wang, Z., Chen, H.: Learned indexes for dynamic workloads. CoRR abs/1902.00655 (2019)
Tang, C., Wang, Y., Dong, Z., Hu, G., Wang, Z., Wang, M., Chen, H.: Xindex: a scalable learned index for multicore data storage. In: Gupta, R., Shen, X. (eds.) SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP, pp. 308–320. ACM, San Diego (2020)
Teran, E., Wang, Z., Jiménez, D.A.: Perceptron learning for reuse prediction. In: International Symposium on Microarchitecture, MICRO, pp. 2:1–2:12. IEEE Computer Society, Taipei (2016)
Thomas, L., Gougeaud, S., Rubini, S., Deniel, P., Boukhobza, J.: Predicting file lifetimes for data placement in multi-tiered storage systems for HPC. ACM SIGOPS Oper. Syst. Rev. 55(1), 99–107 (2021)
Tomes, E., Rush, E.N., Altiparmak, N.: Towards adaptive parallel storage systems. IEEE Trans. Comput. 67(12), 1840–1848 (2018)
Tsai, L., Franke, H., Li, C., Liao, W.: Learning-based memory allocation optimization for delay-sensitive big data processing. IEEE Trans. Parallel Distrib. Syst. 29(6), 1332–1341 (2018)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Annual Conference on Neural Information Processing Systems, NIPS, Long Beach, CA, USA, pp. 5998–6008 (2017)
Vietri, G., Rodriguez, L.V., Martinez, W.A., Lyons, S., Liu, J., Rangaswami, R., Zhao, M., Narasimhan, G.: Driving cache replacement with ml-based lecar. In: Workshop on Hot Topics in Storage and File Systems, HotStorage. USENIX Association, Boston, MA, USA (2018)
Wang, H., He, H., Alizadeh, M., Mao, H.: Learning caching policies with subsampling. In: NeurIPS Machine Learning for Systems Workshop (2019)
Wang, X., Li, Y., Chen, Y., Wang, S., Du, Y., He, C., Zhang, Y., Chen, P., Li, X., Song, W., Xu, Q., Jiang, L.: On workload-aware DRAM failure prediction in large-scale data centers. In: VLSI Test Symposium, VTS, pp. 1–6. IEEE, San Diego, CA, USA (2021)
Wang, P., Xu, J., Ma, M., Lin, W., Pan, D., Wang, Y., Chen, P.: Cloudranger: Root cause identification for cloud native systems. In: International Symposium on Cluster, Cloud and Grid Computing, CCGRID, pp. 492–502. IEEE Computer Society, Washington (2018)
Wang, H., Yi, X., Huang, P., Cheng, B., Zhou, K.: Efficient SSD caching by avoiding unnecessary writes using machine learning. In: International Conference on Parallel Processing, ICPP, pp. 82:1–82:10. ACM, Eugene (2018)
Wang, H., Nguyen, P., Li, J., Köprü, S., Zhang, G., Katariya, S., Ben-Romdhane, S.: GRANO: interactive graph-based root cause analysis for cloud-native distributed data platform. Proc. VLDB Endow. 12(12), 1942–1945 (2019)
Wang, H., Yang, Y., Huang, P., Zhang, Y., Zhou, K., Tao, M., Cheng, B.: S-CDA: A smart cloud disk allocation approach in cloud block storage system. In: Design Automation Conference, DAC, pp. 1–6. IEEE, San Francisco (2020)
Wang, H., Zhang, J., Huang, P., Yi, X., Cheng, B., Zhou, K.: Cache what you need to cache: Reducing write traffic in cloud cache via “one-time-access-exclusion” policy. ACM Trans. Storage 16(3), 18:1–18:24 (2020)
Wang, Y., Tang, C., Wang, Z., Chen, H.: Sindex: a scalable learned index for string keys. In: SIGOPS Asia-Pacific Workshop on Systems, APSys, pp. 17–24. ACM, Tsukuba (2020)
Wang, Y., Song, J., Zhou, K., Liu, Y.: Unsupervised deep hashing with node representation for image retrieval. Pattern Recognit. 112, 107785 (2021)
Wei, X., Chen, R., Chen, H.: Fast rdma-based ordered key-value store using remote learned cache. In: Symposium on Operating Systems Design and Implementation, OSDI, pp. 117–135. USENIX Association, Virtual Event (2020)
Wei, X., Chen, R., Chen, H., Zang, B.: Xstore: Fast rdma-based ordered key-value store using remote learned cache. ACM Trans. Storage 17(3), 18:1–18:32 (2021)
Weng, J., Wang, J.H., Yang, J., Yang, Y.: Root cause analysis of anomalies of multitier services in public clouds. IEEE/ACM Trans. Netw. 26(4), 1646–1659 (2018)
Wilkening, M., Gupta, U., Hsia, S., Trippel, C., Wu, C., Brooks, D., Wei, G.: Recssd: near data processing for solid state drive based recommendation inference. In: International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, pp. 717–729. ACM, Virtual Event, USA (2021)
Wu, Z., Xu, H., Pang, G., Yu, F., Wang, Y., Jian, S., Wang, Y.: DRAM failure prediction in aiops: Empirical evaluation, challenges and opportunities. CoRR abs/2104.15052 (2021)
Wu, C., Ji, C., Xue, C.J.: Reinforcement learning based background segment cleaning for log-structured file system on mobile devices. In: International Conference on Embedded Software and Systems, ICESS, pp. 1–8. IEEE, Las Vegas (2019)
Wu, L., Tordsson, J., Elmroth, E., Kao, O.: Microrca: Root cause localization of performance issues in microservices. In: Network Operations and Management Symposium, NOMS, pp. 1–9. IEEE, Budapest (2020)
Wu, J., Zhang, Y., Chen, S., Chen, Y., Wang, J., Xing, C.: Updatable learned index with precise positions. Proc. VLDB Endow. 14(8), 1276–1288 (2021)
Xiao, J., Xiong, Z., Wu, S., Yi, Y., Jin, H., Hu, K.: Disk failure prediction in data centers via online learning. In: International Conference on Parallel Processing, ICPP, pp. 35:1–35:10. ACM, Eugene (2018)
Xie, D., Chandramouli, B., Li, Y., Kossmann, D.: Fishstore: Faster ingestion with subset hashing. In: International Conference on Management of Data, SIGMOD, pp. 1711–1728. ACM, Amsterdam (2019)
Xu, C., Wang, G., Liu, X., Guo, D., Liu, T.: Health status assessment and failure prediction for hard drives with recurrent neural networks. IEEE Trans. Comput. 65(11), 3502–3508 (2016)
Xu, Y., Sui, K., Yao, R., Zhang, H., Lin, Q., Dang, Y., Li, P., Jiang, K., Zhang, W., Lou, J., Chintalapati, M., Zhang, D.: Improving service availability of cloud systems by predicting disk error. In: Annual Technical Conference, ATC, pp. 481–494. USENIX Association, Boston (2018)
Xu, R., Jin, X., Tao, L., Guo, S., Xiang, Z., Tian, T.: An efficient resource-optimized learning prefetcher for solid state drives. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 273–276. IEEE, Dresden (2018)
Xu, F., Han, S., Lee, P.P.C., Liu, Y., He, C., Liu, J.: General feature selection for failure prediction in large-scale SSD deployment. In: International Conference on Dependable Systems and Networks, DSN, pp. 263–270. IEEE, Taipei (2021)
Yan, G., Li, J.: Rl-bélády: A unified learning framework for content caching. In: Chen, C.W., Cucchiara, R., Hua, X., Qi, G., Ricci, E., Zhang, Z., Zimmermann, R. (eds.) International Conference on Multimedia, MM, pp. 1009–1017. ACM, Virtual Event/Seattle (2020)
Yang, P., Xue, N., Zhang, Y., Zhou, Y., Sun, L., Chen, W., Chen, Z., Xia, W., Li, J., Kwon, K.: Reducing garbage collection overhead in SSD based on workload prediction. In: Workshop on Hot Topics in Storage and File Systems, HotStorage. USENIX Association, Renton, WA, USA (2019)
Yang, W., Hu, D., Liu, Y., Wang, S., Jiang, T.: Hard drive failure prediction using big data. In: Symposium on Reliable Distributed Systems Workshop, SRDS, pp. 13–18. IEEE Computer Society, Montreal (2015)
Yang, Y., Misra, V., Rubenstein, D.: On the optimality of greedy garbage collection for ssds. SIGMETRICS Perform. Eval. Rev. 43(2), 63–65 (2015)
Yang, L., Wang, F., Tan, Z., Feng, D., Qian, J., Tu, S.: ARS: reducing F2FS fragmentation for smartphones using decision trees. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 1061–1066. IEEE, Grenoble (2020)
Yang, L., Tan, Z., Wang, F., Tu, S., Shao, J.: M2H: optimizing F2FS via multi-log delayed writing and modified segment cleaning based on dynamically identified hotness. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 808–811. IEEE, Grenoble (2021)
Ye, J., Li, Z., Wang, Z., Zheng, Z., Hu, H., Zhu, W.: Joint cache size scaling and replacement adaptation for small content providers. In: Conference on Computer Communications, INFOCOM, pp. 1–10. IEEE, Vancouver (2021)
Yu, W., Luo, M., Zhou, P., Si, C., Zhou, Y., Wang, X., Feng, J., Yan, S.: Metaformer is actually what you need for vision. CoRR abs/2111.11418 (2021)
Yuan, D., Yang, Y., Liu, X., Chen, J.: A data placement strategy in scientific cloud workflows. Fut. Gen. Comput. Syst. 26(8), 1200–1214 (2010)
Zeng, Y., Guo, X.: Long short term memory based hardware prefetcher: a case study. In: International Symposium on Memory Systems, MEMSYS, pp. 305–311. ACM, Alexandria (2017)
Zhang, M., He, Y.: Zoom: Multi-view vector search for optimizing accuracy, latency and memory. Technical Report MSR-TR-2018-25 (August 2018). https://www.microsoft.com/en-us/research/publication/zoom-multi-view-vector-search-for-optimizing-accuracy-latency-and-memory/
Zhang, J., Huang, P., Zhou, K., Xie, M., Schelter, S.: Hddse: Enabling high-dimensional disk state embedding for generic failure detection system of heterogeneous disks in large data centers. In: Annual Technical Conference, ATC, pp. 111–126. USENIX Association, Virtual Event (2020)
Zhang, X., Wu, H., Chang, Z., Jin, S., Tan, J., Li, F., Zhang, T., Cui, B.: Restune: Resource oriented tuning boosted by meta-learning for cloud databases. In: International Conference on Management of Data, SIGMOD, pp. 2102–2114. ACM, Virtual Event, China (2021)
Zhang, J., Liu, Y., Zhou, K., Li, G., Xiao, Z., Cheng, B., Xing, J., Wang, Y., Cheng, T., Liu, L., Ran, M., Li, Z.: An end-to-end automatic cloud database tuning system using deep reinforcement learning. In: International Conference on Management of Data, SIGMOD, pp. 415–432. ACM, Amsterdam (2019)
Zhang, C., Song, D., Chen, Y., Feng, X., Lumezanu, C., Cheng, W., Ni, J., Zong, B., Chen, H., Chawla, N.V.: A deep neural network for unsupervised anomaly detection and diagnosis in multivariate time series data. In: Conference on Artificial Intelligence, AAAI, pp. 1409–1416. AAAI Press, Honolulu (2019)
Zhang, S., Roy, R., Rumancik, L., Wang, A.A.: The composite-file file system: decoupling one-to-one mapping of files and metadata for better performance. ACM Trans. Storage 16(1), 5:1–5:18 (2020)
Zhang, J., Zhou, K., Huang, P., He, X., Xie, M., Cheng, B., Ji, Y., Wang, Y.: Minority disk failure prediction based on transfer learning in large data centers of heterogeneous disk systems. IEEE Trans. Parallel Distrib. Syst. 31(9), 2155–2169 (2020)
Zhang, Y., Huang, P., Zhou, K., Wang, H., Hu, J., Ji, Y., Cheng, B.: OSCA: an online-model based cache allocation scheme in cloud block storage systems. In: Gavrilovska, A., Zadok, E. (eds.) Annual Technical Conference, ATC, pp. 785–798. USENIX Association, Virtual Event (2020)
Zhang, J., Wang, Y., Wang, Y., Zhou, K., Schelter, S., Huang, P., Cheng, B., Ji, Y.: Tier-scrubbing: An adaptive and tiered disk scrubbing scheme with improved MTTD and reduced cost. In: Design Automation Conference, DAC, pp. 1–6. IEEE, San Francisco (2020)
Zhang, Y., Zhou, K., Huang, P., Wang, H., Hu, J., Wang, Y., Ji, Y., Cheng, B.: A machine learning based write policy for SSD cache in cloud block storage. In: Design, Automation & Test in Europe Conference & Exhibition, DATE, pp. 1279–1282. IEEE, Grenoble (2020)
Zhao, Y., Liu, X., Gan, S., Zheng, W.: Predicting disk failures with HMM- and hsmm-based approaches. In: International Conference on Data Mining, ICDM, Lecture Notes in Computer Science, vol. 6171, pp. 390–404. Springer, Berlin (2010)
Zheng, Y., Guo, Q., Tung, A.K.H., Wu, S.: Lazylsh: Approximate nearest neighbor search for multiple distance functions with a single index. In: International Conference on Management of Data, SIGMOD, pp. 2023–2037. ACM, San Francisco (2016)
Zhou, K., Liu, Y., Song, J., Yan, L., Zou, F., Shen, F.: Deep self-taught hashing for image retrieval. In: Conference on Multimedia Conference, MM, pp. 1215–1218. ACM, Brisbane (2015)
Zhou, J., Guo, Q., Jagadish, H.V., Krcál, L., Liu, S., Luan, W., Tung, A.K.H., Yang, Y., Zheng, Y.: A generic inverted index framework for similarity search on the GPU. In: International Conference on Data Engineering, ICDE, pp. 893–904. IEEE Computer Society, Paris (2018)
Zhou, X., Peng, X., Xie, T., Sun, J., Ji, C., Liu, D., Xiang, Q., He, C.: Latent error prediction and fault localization for microservice applications by learning from system trace logs. In: Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE, pp. 683–694. ACM, Tallinn (2019)
Zhu, Y., Liu, J.: Classytune: A performance auto-tuner for systems in the cloud. CoRR abs/1910.05482 (2019)
Zhu, B., Wang, G., Liu, X., Hu, D., Lin, S., Ma, J.: Proactive drive failure prediction for large scale storage systems. In: Symposium on Mass Storage Systems and Technologies, MSST, pp. 1–5. IEEE Computer Society, Long Beach (2013)
Zhu, Y., Liu, J., Guo, M., Bao, Y., Ma, W., Liu, Z., Song, K., Yang, Y.: Bestconfig: tapping the performance potential of systems via automatic configuration tuning. In: Symposium on Cloud Computing, SoCC, pp. 338–350. ACM, Santa Clara (2017)
Züfle, M., Krupitzer, C., Erhard, F., Grohmann, J., Kounev, S.: To fail or not to fail: Predicting hard disk drive failure time windows. In: Measurement, Modelling and Evaluation of Computing Systems MMB, Lecture Notes in Computer Science, vol. 12040, pp. 19–36. Springer, Saarbrücken (2020)

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61902135 and No. 62172180) and the Joint Funds of ShanDong Natural Science Funds (Grant No. ZR2019LZH003). Thanks to Professor Hong Jiang for his advice on classification issues. Thanks to all students in the Intelligent Data Storage and Management Laboratory\(^{4}\).

Author information

Authors and Affiliations

Huazhong University of Science and Technology, Luoyu Road, Wuhan, 430074, Hubei, China
Yu Liu, Hua Wang, Ke Zhou, ChunHua Li & Rengeng Wu


Corresponding author

Correspondence to Hua Wang.

About this article

Cite this article

Liu, Y., Wang, H., Zhou, K. et al. A survey on AI for storage. CCF Trans. HPC 4, 233–264 (2022). https://doi.org/10.1007/s42514-022-00101-3


Received: 16 December 2021
Accepted: 08 April 2022
Published: 23 May 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s42514-022-00101-3
