DOI: 10.1145/2597652.2597679
research-article

Supporting storage configuration for I/O intensive workflows

Published: 10 June 2014

ABSTRACT

System provisioning, resource allocation, and system configuration decisions for I/O-intensive workflow applications are complex even for expert users. Users face choices at multiple levels: allocating resources to individual sub-systems (e.g., the application layer, the storage layer) and configuring each of them optimally (e.g., replication level, chunk size, and caching policies in the case of storage), all of which have a large impact on overall application performance. This paper presents our progress on supporting these provisioning, allocation, and configuration decisions for workflow applications. To enable selecting a good configuration in a reasonable time, we propose an approach that accelerates the exploration of the configuration space using a low-cost performance predictor that estimates the total execution time of a workflow application in a given setup. Our evaluation shows that (i) the predictor is effective in identifying the desired system configuration, and (ii) it can scale to model a workflow application run on an entire cluster, while (iii) using over 2000x fewer resources (machines × time) than running the actual application.
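As a rough illustration of predictor-driven exploration, the sketch below scores every point in a small configuration space with a cheap cost function and keeps the one with the lowest predicted execution time. Everything in it is an assumption made for illustration: the parameter names and values, the cluster size, and especially `toy_predictor`, which is a made-up formula and not the predictor described in the paper.

```python
from itertools import product

# Hypothetical configuration space; names and values are illustrative only.
CONFIG_SPACE = {
    "storage_nodes": [2, 4, 8],
    "replication_level": [1, 2, 3],
    "chunk_size_kb": [256, 1024],
}
TOTAL_NODES = 10  # assumed cluster size; nodes not used for storage run tasks


def toy_predictor(cfg):
    """Placeholder cost model (NOT the paper's predictor).

    Returns an estimated execution time, in arbitrary units, for one
    configuration given as a dict.
    """
    app_nodes = TOTAL_NODES - cfg["storage_nodes"]
    compute = 100.0 / app_nodes                                   # fewer app nodes, slower compute
    io = 80.0 * cfg["replication_level"] / cfg["storage_nodes"]   # replicas add write cost
    overhead = 2048.0 / cfg["chunk_size_kb"]                      # small chunks add per-chunk cost
    return compute + io + overhead


def explore(space, predictor):
    """Score every configuration and return the one with the lowest prediction."""
    keys = list(space)
    candidates = (
        dict(zip(keys, values))
        for values in product(*(space[k] for k in keys))
    )
    return min(candidates, key=predictor)


best = explore(CONFIG_SPACE, toy_predictor)
print(best, round(toy_predictor(best), 2))
```

Exhaustive enumeration is only practical here because each prediction is cheap; with a larger space, the same `explore` interface could wrap a sampling or heuristic search instead.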


    • Published in

      ICS '14: Proceedings of the 28th ACM international conference on Supercomputing
      June 2014
      378 pages
      ISBN: 9781450326421
      DOI: 10.1145/2597652

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article

      Acceptance Rates

      ICS '14 Paper Acceptance Rate: 34 of 160 submissions, 21%
      Overall Acceptance Rate: 584 of 2,055 submissions, 28%
