Phoebus: A system for high throughput data movement

https://doi.org/10.1016/j.jpdc.2010.08.011

Abstract

Phoebus is an infrastructure for improving end-to-end throughput in high-bandwidth, long-distance networks by using a “session layer” protocol and “gateways” in the network. Phoebus can dynamically allocate network resources, use segment-specific transport protocols between gateways, and apply other performance-improving techniques on behalf of the user. We have developed interfaces to Phoebus to allow its use in various real applications and data movement services. This paper extends our earlier work with tests of Phoebus-enabled applications on both real-world networks and configurable network testbeds that allow us to modify latency and loss rates. We demonstrate that Phoebus improves the performance of bulk data transfer in a variety of network configurations and conditions.

Section snippets

Introduction and motivation

Despite continuing advances in the link speeds of networks, data movement remains a key problem in parallel and distributed computing. Applications in both science and industry are becoming increasingly data intensive. The viability of many distributed computing paradigms depends on the ability to have data transfer speeds scale up as computing power increases. This paper describes and investigates the efficacy of a network middleware system for improving data transfer performance for

Background

For decades, the end-to-end argument [36] has provided the conceptual basis for transport protocols. The common interpretation of this argument states that the core of the network should remain simple, and that all protocol functionality, beyond merely forwarding packets, should be handled by the end hosts. This absolutist interpretation of the end-to-end argument forces all control and optimizations to the edge. This control mechanism needs to infer the state of the network and when a packet

Phoebus architecture

Phoebus [8] is a system that implements a new protocol and associated forwarding infrastructure for improving throughput in today’s networks. The current Internet model binds all end-to-end communication to a “Transport” layer protocol such as the Internet Protocol (IP) suite’s Transmission Control Protocol (TCP). The Phoebus model binds end-to-end communication to a “Session” protocol, which is a

Application use and integration

The introduction of a new protocol and network system requires a means of using them from existing applications. To that end, we have developed two methods that let applications use Phoebus without changes to existing code: a Phoebus wrapper library and transparent redirection. The wrapper library overrides the standard operating system socket calls so that any application linked against it can transparently use the Phoebus infrastructure. The application can be

Phoebus services

A key tenet in the Phoebus model is that an end-to-end connection, articulated via a series of Transport protocol adapting Session gateways, can often outperform a single end-to-end transport protocol. A session-layer connection such as this can also outperform parallel connections in many cases, though Phoebus itself can also make use of parallel connections.
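The intuition behind this claim can be illustrated with a back-of-the-envelope model. The sketch below is not the analysis used in this paper; it assumes the well-known Mathis et al. steady-state approximation for TCP throughput, rate ≈ (MSS / RTT) · (C / √p) with C = √(3/2), and the function names, segment values, and path layout are all hypothetical. It compares one TCP connection spanning the whole path (round-trip times add, loss compounds) with one connection per segment joined at gateways (the slowest segment sets the rate). Link capacity limits and gateway buffering are ignored.

```python
import math

# Constant from the Mathis et al. steady-state TCP model (an assumption of
# this sketch, not a parameter taken from the Phoebus paper).
MATHIS_C = math.sqrt(3.0 / 2.0)


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput in bits per second."""
    if rtt_s <= 0 or not 0 < loss_rate < 1:
        raise ValueError("rtt_s must be positive and loss_rate in (0, 1)")
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss_rate))


def end_to_end_throughput_bps(mss_bytes, segments):
    """One TCP connection across every segment: RTTs add, losses compound."""
    total_rtt = sum(rtt for rtt, _ in segments)
    survive = 1.0
    for _, loss in segments:
        survive *= 1.0 - loss
    return mathis_throughput_bps(mss_bytes, total_rtt, 1.0 - survive)


def chained_throughput_bps(mss_bytes, segments):
    """One TCP connection per segment, joined at gateways: slowest one wins."""
    return min(
        mathis_throughput_bps(mss_bytes, rtt, loss) for rtt, loss in segments
    )


if __name__ == "__main__":
    # Hypothetical path: two short, lossy edge segments around a long,
    # clean backbone segment. Each tuple is (RTT in seconds, loss rate).
    path = [(0.002, 1e-4), (0.070, 1e-6), (0.002, 1e-4)]
    e2e = end_to_end_throughput_bps(1460, path)
    chained = chained_throughput_bps(1460, path)
    print(f"end-to-end: {e2e / 1e6:.1f} Mbit/s")
    print(f"chained:    {chained / 1e6:.1f} Mbit/s")
```

Under this model the single connection pays for edge loss at the full path round-trip time, while the chained connections confine that loss to short segments where recovery is fast. Real gains depend on factors the model leaves out, so the numbers are indicative only.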

As mentioned in Section 2, an end-to-end transport protocol must behave conservatively as it may cross a wide variety of network

Real-world testing

This section presents two real-world experiments. The first demonstrates performance gains using Phoebus with a common network performance benchmarking tool between locations on the East and West coasts of the US. The second examines how Phoebus improves transatlantic data transfers between REDDnet storage sites at CERN, Switzerland and locations in the US. Both experiments use segments of Internet2’s IP backbone network and focus on TCP adaptation between Phoebus Gateways (PGs).

Testbed results

Our goal in this section is to test a real data transfer tool, GridFTP, using the Phoebus infrastructure in a variety of network conditions. Despite the availability of a prototype Phoebus infrastructure in Internet2 POPs and test deployments in various other networks, getting access to a wide variety of end-to-end network paths is challenging. Even then, we have been at the mercy of prevailing network conditions, making experiment repeatability difficult as we can attest to in gathering our

Conclusion

This paper presents a set of controlled experiments measuring the efficacy of the Phoebus system under a variety of network conditions. Specifically, we have shown how Phoebus with protocol adaptation can dramatically improve throughput in various network conditions and provide significantly improved single-stream application performance. This single stream performance is comparable with the best performance that parallel TCP can deliver when utilized over clean, low-latency paths. We developed

Acknowledgment

Phoebus was supported by the Department of Energy Office of Science under FG02-04Er25642.

References (45)

  • P. Rizk et al., Performance of a gridftp overlay network, Future Generation Computer Systems (2008)
  • W. Allcock, J. Bresnahan, R. Kettimuthu, M. Link, The globus striped gridftp framework and server, in: Proceedings of...
  • D. Andersen, H. Balakrishnan, M. Kaashoek, R. Morris, The case for resilient overlay networks, in: 8th Annual Workshop...
  • K. Argyraki et al., Can software routers scale?
  • A. Bakre et al., I-tcp: indirect tcp for mobile hosts
  • A. Bassi et al., Logistical networking: when institutions peer
  • R. Bolla et al., Pc-based software routers: high performance and application service support
  • J. Border, M. Kojo, J. Griner, G. Montenegro, Z. Shelby, Performance enhancing proxies intended to mitigate...
  • A. Brown, E. Kissel, M. Swany, G. Almes, Phoebus: a session protocol for dynamic and heterogeneous networks, UDCIS...
  • N. Egi et al., Towards high performance virtual routers on commodity hardware
  • S. Floyd, Connections with multiple congested gateways in packet-switched networks part 1: one-way traffic, Computer Communication Review (1991)
  • S. Floyd, HighSpeed TCP for large congestion windows, Internet Engineering Task Force, INTERNET-DRAFT,...
  • B. Ford, J. Iyengar, Breaking up the transport logjam, in: Proceedings of ACM HotNets,...
  • I. Foster et al., Globus: a metacomputing infrastructure toolkit, International Journal of Supercomputer Applications (1997)
  • GridFTP,...
  • Y. Gu, R. Grossman, UDT: UDP-based data transfer for high-speed wide area networks, Computer Networks 51...
  • C.P. Guok, D.W. Robertson, E. Chaniotakis, M.R. Thompson, W. Johnston, B. Tierney, A user driven dynamic circuit...
  • T.J. Hacker et al., The end-to-end performance effects of parallel TCP sockets on a lossy wide-area network
  • E. He, J. Leigh, O. Yu, T.A. DeFanti, Reliable blast UDP: predictable high performance bulk data transfer, in: Proc. of...
  • V. Jacobson, R. Braden, D. Borman, TCP extensions for high performance, Network Working Group, Internet Engineering...
  • K. Kaneko, J. Katto, TCP-fusion: A hybrid congestion control algorithm for high-speed networks, in: International...
  • G. Khanna et al., Using overlays for efficient data transfer over shared wide-area networks

    Ezra Kissel received the B.Sc. degree in Computer Science from the University of Delaware in 2003. After working in private industry for 2 years, he returned to the University of Delaware earning the M.Sc. degree in 2007 and is currently a Ph.D. candidate in the CIS department. His research interests include high-performance networking, network protocol design, grid computing, and network security.

    Martin Swany is an Associate Professor in the Department of Computer and Information Sciences at the University of Delaware. He received his B.A. and M.S. from the University of Tennessee in 1992 and 1998, respectively. He completed his Ph.D. at the University of California, Santa Barbara in 2003 and joined the faculty of the University of Delaware that year. Since 2005, Swany has been the Internet2 Faculty Fellow, working on network metrics and performance-enhancing middleware. His research interests include high-performance parallel and distributed computing and networking.

    Aaron Brown graduated with a Bachelor’s degree in Computer Science from Clark University in 2003. In 2006, he received a Master’s degree from the University of Delaware, where his research interests included high-performance networking and measurement. He is currently employed as a network software engineer at Internet2, working on the perfSONAR project as well as Internet2’s dynamic circuit networking initiative.
