DOI: 10.1145/3569951.3597552
short-paper

dug: A Tool for Illuminating Disk Usage in HPC Environments

Published: 10 September 2023

Abstract

Over the last decade, demand for big data analysis on traditional HPC systems has driven significant increases in storage capacity across many scientific disciplines. While storage solutions have kept up with this demand, the tools that help administrators control and understand group and user storage patterns have lagged behind. Growing, cumulative data analysis can conflict with the available storage space and its accompanying cost. Combined with the problem of dark data (data that has become untracked by the researcher), this means that managing storage requires better-defined information for research groups and administrators. We present our work on an open source Linux utility called Disk Usage by User or Group (dug), which summarizes the user or group owner composition of all files under a target directory. Our solution facilitates fast queries that identify invalid, imbalanced, and excessive use of group storage by individual users. Source code and documentation are available from https://github.com/cwru-rcci/dug.
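To make the idea of an "owner composition" concrete, the following is a minimal Python sketch of per-owner aggregation over a directory tree. It is an illustration only and does not reproduce dug's implementation, options, or output format; the function name `usage_by_owner` and its `by_group` parameter are hypothetical, and the sketch assumes that apparent file size (`st_size`) is an acceptable measure of usage.

```python
import os
from collections import defaultdict


def usage_by_owner(root, by_group=False):
    """Return {owner_id: total_bytes} for all non-directory entries under root.

    Hypothetical helper for illustration; not part of dug.
    Symbolic links are counted themselves and never followed.
    """
    totals = defaultdict(int)
    stack = [root]  # directories still to be scanned
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        info = entry.stat(follow_symlinks=False)
                        is_dir = entry.is_dir(follow_symlinks=False)
                    except OSError:
                        continue  # entry vanished or cannot be read
                    if is_dir:
                        stack.append(entry.path)
                        continue
                    # Assumption: apparent size, not allocated blocks.
                    owner = info.st_gid if by_group else info.st_uid
                    totals[owner] += info.st_size
        except OSError:
            continue  # unreadable directory; skip it
    return dict(totals)
```

Calling `usage_by_owner("/path/to/group/dir")` returns totals keyed by numeric user ID, and passing `by_group=True` keys them by group ID instead. A numeric ID that no longer resolves to an account (for example, when `pwd.getpwuid` raises `KeyError`) would be one candidate for the invalid ownership mentioned above.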

Published In

PEARC '23: Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good
July 2023
519 pages
ISBN: 9781450399852
DOI: 10.1145/3569951

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. group quota
  2. storage
  3. system administration

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

PEARC '23

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%
