DOI: 10.1145/2286996.2287005
research-article

A study of lustre networking over a 100 gigabit wide area network with 50 milliseconds of latency

Published: 19 June 2012

ABSTRACT

As part of the SCinet Research Sandbox at the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Indiana University used a dedicated 100 Gbps wide area network (WAN) link spanning more than 3,500 km (2,175 mi) to demonstrate the capabilities of the Lustre high performance parallel file system in a high bandwidth, high latency WAN environment. This demonstration functioned as a proof of concept and provided an opportunity to study Lustre's performance over a 100 Gbps WAN. To characterize the performance of the network and file system, we ran a series of benchmarks and tests. These included low-level iperf network tests, Lustre networking (LNET) tests, file system tests with the IOR benchmark, and a suite of real-world applications reading from and writing to the file system. All of the benchmarks were run over the WAN link, which had a latency of 50.5 ms. In this article, we describe the configuration and constraints of the demonstration and focus on the key findings regarding the Lustre networking layer for this extremely high bandwidth, high latency connection. Of particular interest is the relationship between the peer_credits and max_rpcs_in_flight settings when considering LNET performance.
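To give a sense of why in-flight settings such as peer_credits and max_rpcs_in_flight matter on this link, the following is a minimal sketch of the bandwidth-delay product for the figures quoted above. It rests on two assumptions that the abstract does not state: that the 50.5 ms latency is a round-trip time, and that each bulk RPC carries 1 MiB. The function names are illustrative, and the sketch does not model how Lustre actually applies either setting.

```python
import math


def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bps * rtt_s / 8.0


def rpcs_needed(bdp_bytes: float, rpc_size_bytes: int) -> int:
    """Smallest whole number of RPCs of the given size that covers the BDP."""
    return math.ceil(bdp_bytes / rpc_size_bytes)


if __name__ == "__main__":
    LINK_BPS = 100e9         # 100 Gbps, from the abstract
    RTT_S = 0.0505           # 50.5 ms, ASSUMED here to be round-trip time
    RPC_SIZE = 1 << 20       # 1 MiB per bulk RPC, an ASSUMED size

    bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT_S)
    print(f"Bandwidth-delay product: {bdp / 1e6:.2f} MB")
    print(f"1 MiB RPCs in flight to fill the link: {rpcs_needed(bdp, RPC_SIZE)}")
```

Under these assumptions, roughly 631 MB must be outstanding across all clients and servers to saturate the link, which is why limits on concurrent messages and RPCs become a first-order concern at this latency.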


Published in

DIDC '12: Proceedings of the fifth international workshop on Data-Intensive Distributed Computing Date
June 2012
68 pages
ISBN: 9781450313414
DOI: 10.1145/2286996

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 7 of 12 submissions, 58%
