ABSTRACT
Object cloud storage systems serve diverse applications with varying latency service level objectives (SLOs), which poses challenges for guaranteeing quality of service with limited storage resources. Existing methods provide prediction-based recommendations for dispatching application requests to storage devices, but their prediction accuracy can degrade under complex system topologies. To address this issue, Graph3PO combines storage device queue information with system topology information to form a temporal graph, from which it can accurately predict device queue states. Graph3PO also includes an urgency degree model and a cost model that measure the SLO violation risk and the penalty of scheduling a request on a storage device queue. When the urgency degree of a request exceeds a threshold, Graph3PO determines whether to schedule it in the queue or to initiate a hedge request to another storage device. Experimental results show that Graph3PO outperforms its competitors, achieving SLO violation rates 2.8 to 201.1 times lower.
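The threshold-based choice between queueing and hedging can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not define the urgency degree or the cost model, so the formulas below, along with the names `Device`, `urgency`, `dispatch`, `threshold`, and `hedge_cost`, are assumptions made purely for illustration. The field `predicted_wait_ms` stands in for the queue state that the temporal graph model would predict.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Device:
    name: str
    predicted_wait_ms: float  # stand-in for the predicted queue state (assumed)
    service_ms: float         # assumed per-request service time


def urgency(device: Device, slo_ms: float, elapsed_ms: float = 0.0) -> float:
    """Hypothetical urgency degree: predicted completion time divided by the SLO.

    Values above 1.0 mean the request is predicted to miss its SLO.
    """
    return (elapsed_ms + device.predicted_wait_ms + device.service_ms) / slo_ms


def dispatch(
    primary: Device,
    alternatives: List[Device],
    slo_ms: float,
    elapsed_ms: float = 0.0,
    threshold: float = 1.0,   # assumed urgency threshold
    hedge_cost: float = 0.1,  # assumed penalty for issuing a duplicate request
) -> Tuple[str, Device]:
    """Return ("queue", device) or ("hedge", device) for one request."""
    u_primary = urgency(primary, slo_ms, elapsed_ms)

    # Below the threshold (or with nowhere else to go), stay in the queue.
    if u_primary <= threshold or not alternatives:
        return "queue", primary

    # Otherwise consider the least urgent alternative device.
    best = min(alternatives, key=lambda d: urgency(d, slo_ms, elapsed_ms))

    # Hypothetical cost model: hedge only when the alternative's urgency plus
    # the hedging penalty is still lower than staying on the primary queue.
    if urgency(best, slo_ms, elapsed_ms) + hedge_cost < u_primary:
        return "hedge", best
    return "queue", primary


if __name__ == "__main__":
    busy = Device("dev-0", predicted_wait_ms=40.0, service_ms=5.0)
    idle = Device("dev-1", predicted_wait_ms=5.0, service_ms=5.0)
    action, target = dispatch(busy, [idle], slo_ms=30.0)
    print(action, target.name)
```

In this sketch a request is hedged only when the predicted gain outweighs the cost of the duplicate request, which mirrors the trade-off between SLO violation risk and scheduling penalty described above.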
Index Terms
- Graph3PO: A Temporal Graph Data Processing Method for Latency QoS Guarantee in Object Cloud Storage System