Adaptive hybrid storage systems leveraging SSDs and HDDs in HPC cloud environments

Cluster Computing

Abstract

Cloud computing should inherently support many types of data-intensive workloads with different storage access patterns, which makes a high-performance storage system an important component of the cloud. Emerging flash devices such as solid-state drives (SSDs) are a viable choice for building high-performance computing (HPC) cloud storage systems that handle more fine-grained data access patterns. However, the price per bit of SSDs is still higher than that of hard disk drives (HDDs). This study proposes an optimized progressive file layout (PFL) method that leverages the advantages of SSDs in a parallel file system such as Lustre so that small-file I/O performance can be significantly improved. A PFL can dynamically adjust chunk sizes and stripe patterns according to the I/O traffic. Extensive experimental results show that this approach (i.e., building a hybrid storage system from a combination of SSDs and HDDs) achieves balanced throughput over mixed I/O workloads consisting of large- and small-file access patterns.
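
The idea of a layout whose chunk size and stripe pattern change with file offset can be illustrated with a short sketch. This is not the authors' implementation and does not use any Lustre API. The `Component` class, the `EXAMPLE_LAYOUT` values (a 1 MiB SSD extent with 64 KiB chunks, followed by an HDD extent striped 4 ways with 1 MiB chunks), and the helper names `locate` and `pools_touched` are all hypothetical, chosen only to show how small files could stay entirely on SSDs while large files spill onto HDD stripes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MIB = 1024 * 1024


@dataclass(frozen=True)
class Component:
    """One extent of a progressive layout (hypothetical model)."""
    start: int              # first byte offset covered (inclusive)
    end: Optional[int]      # end offset (exclusive); None means "to end of file"
    stripe_size: int        # chunk size in bytes within this extent
    stripe_count: int       # number of storage targets striped across
    pool: str               # device class backing this extent


# Assumed example values, not taken from the article.
EXAMPLE_LAYOUT: List[Component] = [
    Component(0, 1 * MIB, 64 * 1024, 1, "ssd"),
    Component(1 * MIB, None, 1 * MIB, 4, "hdd"),
]


def locate(layout: List[Component], offset: int) -> Tuple[str, int]:
    """Return (pool, target index within that pool's stripe set) for a byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for comp in layout:
        if offset >= comp.start and (comp.end is None or offset < comp.end):
            chunk_no = (offset - comp.start) // comp.stripe_size
            # Round-robin placement of chunks over the component's targets.
            return comp.pool, chunk_no % comp.stripe_count
    raise ValueError("offset not covered by layout")


def pools_touched(layout: List[Component], file_size: int) -> List[str]:
    """List the pools that a file of the given size stores data in."""
    return [comp.pool for comp in layout if file_size > comp.start]


if __name__ == "__main__":
    print(pools_touched(EXAMPLE_LAYOUT, 4096))      # small file
    print(pools_touched(EXAMPLE_LAYOUT, 10 * MIB))  # large file
    print(locate(EXAMPLE_LAYOUT, 6 * MIB))
```

Under these assumed values, a 4 KiB file resides only in the SSD extent, whereas a 10 MiB file uses the SSD extent for its first 1 MiB and the 4-way HDD stripe for the remainder.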

References

  1. High Performance Computing in the AWS Cloud. https://aws.amazon.com/hpc/

  2. Sun Oracle. Lustre Software Release 2.x Operation Manual. http://lustre.org/documentation. Accessed January 2011

  3. Paciucci, G., Paper, S., Meyers, I., Ballantyne, D.: Developing High-Performance, Scalable, cost effective storage solutions with Intel Cloud Edition Lustre and Amazon Web Services. Reference Architecture: Developing Storage Solutions with Intel Cloud Edition for Lustre and Amazon Web Services (2015)

  4. Raicu, I., Foster, I.T., Zhao, Y.: Many-task computing for grids and supercomputers 2008. In: Workshop on Many-Task Computing on Grids and Supercomputers (2008)

  5. The Apache Hadoop project: open-source software for reliable, scalable, distributed computing. http://hadoop.apache.org/

  6. Hammond, J.L.: Intel high performance data division. Progressive file layouts prototype. LUG (2015)

  7. Mohr, R., Brim, M., Oral, S., Dilger, A.: Evaluating progressive file layouts for Lustre. LUG (2016)

  8. Koo, D., Kim, J.-S., Hwang, S., Eom, H., Lee, J.: Utilizing Progressive File Layout Leveraging SSDs in HPC Cloud Environments. In: Proceedings of the IEEE International Workshops on Foundations and Applications of Self* Systems, September 2016

  9. Benchmarking Working Group, OpenSFS. I/O Characterization of Large-Scale HPC Centers. Reference Architecture: the Supercomputing Conference (2012)

  10. Lee, J., Koo, D., Park, K., Kim, J., Hwang, S.: Performance analysis of Lustre file system using high performance storage devices. KIISE Trans. Comput. Pract. 22(4), 163–169 (2016)

  11. Intel SSD-Based Lustre Cluster File System Evaluation. http://www.intel.com/content/www/us/en/software/lustre-cluster-file-system-performance-evaluation.html

  12. Prabhakar, R., Vazhkudai, S.S., Kim, Y., Butt, A.R., Li, M., Kandemir, M.: Provisioning a multi-tiered data staging area for extreme-scale machines. In: Proceedings of the 2011 31st International Conference on Distributed Computing Systems, ICDCS’11, pp. 1–12, IEEE Computer Society, Washington, DC, USA (2011)

  13. Layout Enhancement High Level Design. http://wiki.lustre.org/Layout_Enhancement_High_Level_Design

  14. Liu, N., Cope, J., Carns, P., Carothers, C., Ross, R., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: IEEE 28th Symposium on MSST/SNAPI. IEEE (2012)

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (No. NRF-2015R1C1A1A02036524), and by an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIP) (No. R0190-16-2012, High Performance Big Data Analytics Platform Performance Acceleration Technologies Development).

Author information

Corresponding author

Correspondence to Jaehwan Lee.

About this article

Cite this article

Koo, D., Kim, JS., Hwang, S. et al. Adaptive hybrid storage systems leveraging SSDs and HDDs in HPC cloud environments. Cluster Comput 20, 2119–2131 (2017). https://doi.org/10.1007/s10586-017-1002-5

  • DOI: https://doi.org/10.1007/s10586-017-1002-5
