DOI: 10.1145/3337821.3337904

BPP: A Realtime Block Access Pattern Mining Scheme for I/O Prediction

Published: 05 August 2019

Abstract

Block access patterns are regularities in the sequence of accessed blocks and can be used to make block storage systems more intelligent. However, existing algorithms fail to uncover these patterns efficiently: they either incur high time and space overhead or focus only on the simplest patterns, such as sequential ones.
In this paper, we propose BPP, a realtime block access pattern mining scheme that mines block access patterns at run time with low time and space overhead to make efficient I/O predictions. To reduce mining overhead, BPP classifies block access patterns into simple and compound ones based on their mining costs and applies a different mining policy to each class. BPP also adopts a novel garbage cleaning policy, designed around the observed features of the mined patterns, that accurately detects valueless patterns and removes them as early as possible, further reducing the space overhead of managing and using the mined patterns. To demonstrate the effect of BPP, we conduct a series of experiments with real-world workloads. The experimental results show that BPP significantly outperforms state-of-the-art I/O prediction schemes.
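
The abstract does not describe BPP's data structures or policies in detail, so the following is only a toy sketch of the general idea, not the BPP algorithm. It assumes that "simple" patterns are sequential runs (tracked with a single counter), that "compound" patterns are block pairs seen close together (tracked in a bounded table), and that cleaning evicts pairs that never reached a support threshold. The class name `ToyPatternMiner` and the parameters `window`, `capacity`, and `min_support` are hypothetical.

```python
from collections import OrderedDict, deque


class ToyPatternMiner:
    """Toy illustration only; NOT the BPP algorithm from the paper."""

    def __init__(self, window=2, capacity=1000, min_support=2):
        self.recent = deque(maxlen=window)  # last few accessed blocks
        self.capacity = capacity            # max number of stored pairs
        self.min_support = min_support      # count needed before predicting
        self.pairs = OrderedDict()          # (a, b) -> count, oldest update first
        self.seq_run = 0                    # length of the current sequential run

    def observe(self, block):
        prev = self.recent[-1] if self.recent else None
        if prev is not None and block == prev + 1:
            # Assumed "simple" pattern: a counter, no per-block state.
            self.seq_run += 1
        else:
            self.seq_run = 0
            # Assumed "compound" pattern: count pairs with recent blocks.
            for a in self.recent:
                if a != block:
                    key = (a, block)
                    self.pairs[key] = self.pairs.get(key, 0) + 1
                    self.pairs.move_to_end(key)
            self._clean()
        self.recent.append(block)

    def _clean(self):
        # Evict the oldest pair that never reached min_support; if every
        # remaining pair is supported, fall back to evicting the oldest one.
        while len(self.pairs) > self.capacity:
            victim = next(
                (k for k, c in self.pairs.items() if c < self.min_support),
                None,
            )
            if victim is None:
                victim = next(iter(self.pairs))
            del self.pairs[victim]

    def predict(self, block):
        # Inside a sequential run, predict the next block number.
        if self.seq_run >= 1 and self.recent and self.recent[-1] == block:
            return [block + 1]
        # Otherwise return supported successors, most frequent first.
        cands = [
            (c, b)
            for (a, b), c in self.pairs.items()
            if a == block and c >= self.min_support
        ]
        return [b for c, b in sorted(cands, key=lambda t: (-t[0], t[1]))]


miner = ToyPatternMiner(window=2, capacity=100, min_support=2)
for blk in [10, 50, 7, 10, 50, 99, 100, 101]:
    miner.observe(blk)
print(miner.predict(101))  # sequential run 99, 100, 101
print(miner.predict(10))   # pair (10, 50) was seen twice
```

Keeping sequential runs out of the pair table is one plausible way to avoid paying per-block storage for the cheapest patterns; how BPP actually separates and cleans patterns is specified in the full paper, not here.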

Cited By

  • (2021) KBP: Mining Block Access Pattern for I/O Prediction with K-Truss. 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), 167-176. DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00035. Online publication date: Sep-2021

Published In

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019
1107 pages
ISBN: 9781450362955
DOI: 10.1145/3337821
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. I/O predicting
  2. block access pattern
  3. block storage

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Defense Preliminary Research Project
  • Hubei Province Technical Innovation Special Project
  • Wuhan Application Basic Research Project
  • Fundamental Research Funds for the Central Universities
  • NSFC

Conference

ICPP 2019

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 0

Reflects downloads up to 14 Feb 2025
