A Dynamic Extension and Data Migration Method Based on PVFS

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9529)

Abstract

With the development of big data, traditional file systems can no longer meet the demands of high-performance computing and big data applications. Parallel file systems are becoming increasingly popular in high-performance computing. As a typical parallel file system, PVFS has been widely used in big data computing in recent years. However, as computing scale increases, there is a need to dynamically extend data nodes, which PVFS does not currently support. This paper puts forward a dynamic data node extension method, together with a subsequent data migration algorithm, based on PVFS. The algorithm first adds a new data node automatically and transparently. It then identifies the most heavily loaded data node in the original file system using a new load evaluation method and transfers data from that node to the newly added data node to mitigate the imbalance of the system. The experimental results show that our dynamic data node extension method can improve the performance of PVFS and effectively reduce the probability of hot spots.
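The two steps described in the abstract (extend, then migrate from the most loaded node) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the load evaluation method, so `load_score` below is a hypothetical weighted mix of space utilization and I/O activity, and the names `DataNode` and `extend_and_migrate` are assumptions made for illustration. The sketch also moves whole files, whereas PVFS stripes file data across servers.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    capacity: int                                # total bytes the node can hold
    files: dict = field(default_factory=dict)    # file name -> size in bytes
    io_rate: float = 0.0                         # normalized recent I/O activity in [0, 1]

    def used(self) -> int:
        return sum(self.files.values())


def load_score(node, w_space=0.5, w_io=0.5):
    """Hypothetical load metric (an assumption, not the paper's formula)."""
    return w_space * node.used() / node.capacity + w_io * node.io_rate


def extend_and_migrate(nodes, new_node):
    """Register new_node, then move files to it from the most loaded node."""
    source = max(nodes, key=load_score)          # most loaded existing node
    nodes.append(new_node)                       # the "dynamic extension" step
    moved = []
    # Largest files first; skip any file whose move would leave the new
    # node holding more bytes than the source node.
    for name, size in sorted(source.files.items(), key=lambda kv: -kv[1]):
        if new_node.used() + size > source.used() - size:
            continue
        new_node.files[name] = source.files.pop(name)
        moved.append(name)
    return source.name, moved
```

A real system would also have to update metadata and keep files readable during the transfer, which this sketch leaves out.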

Acknowledgments

We would like to thank the anonymous reviewers for helping us refine this paper. Their constructive comments and suggestions were very helpful. This work was partly funded by the National Science and Technology Major Project of the Ministry of Science and Technology of China under grant 2011ZX05035-004-004HZ.

Corresponding author

Correspondence to Jie Tang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, X., Tang, J., Gao, H., Wu, G. (2015). A Dynamic Extension and Data Migration Method Based on PVFS. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9529. Springer, Cham. https://doi.org/10.1007/978-3-319-27122-4_37

  • DOI: https://doi.org/10.1007/978-3-319-27122-4_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27121-7

  • Online ISBN: 978-3-319-27122-4

  • eBook Packages: Computer Science, Computer Science (R0)
