Computer Networks

Volume 44, Issue 3, 20 February 2004, Pages 353-382

A hybrid architecture for cost-effective on-demand media streaming

https://doi.org/10.1016/j.comnet.2003.10.002

Abstract

We propose a new architecture for on-demand media streaming centered around the peer-to-peer (P2P) paradigm. The key idea of the architecture is that peers share some of their resources with the system. As peers contribute resources to the system, the overall system capacity increases and more clients can be served. The proposed architecture employs several novel techniques to: (1) use the often-underutilized peers’ resources, which makes the proposed architecture both deployable and cost-effective, (2) aggregate contributions from multiple peers to serve a requesting peer so that supplying peers are not overloaded, (3) make good use of peer heterogeneity by assigning relatively more work to the powerful peers, and (4) organize peers in a network-aware fashion, such that nearby peers are grouped into a logical entity called a cluster. The network-aware peer organization is validated by statistics collected and analyzed from real Internet data. The main benefit of the network-aware peer organization is that it allows the development of efficient searching (to locate nearby suppliers) and dispersion (to disseminate new files into the system) algorithms. We present network-aware searching and dispersion algorithms that result in: (i) fast dissemination of new media files, (ii) reduction of the load on the underlying network, and (iii) better streaming service.

We demonstrate the potential of the proposed architecture for a large-scale on-demand media streaming service through an extensive simulation study on large, Internet-like topologies. Starting with a limited streaming capacity (hence, low cost), the simulation shows that the capacity rapidly increases and many clients can be served. This occurs for all studied arrival patterns, including constant rate arrivals, flash crowd arrivals, and Poisson arrivals. Furthermore, the simulation shows that a reasonable client-side initial buffering of 10–20 s is sufficient to ensure full-quality playback even in the presence of peer failures.

Introduction

Streaming multimedia files to a large number of customers imposes a high load on the underlying network and the streaming server. The voluminous nature of the multimedia traffic along with its timing constraints make deploying a large-scale and cost-effective media streaming architecture over the current Internet a challenge. In this paper, we target on-demand streaming environments such as the one shown in Fig. 1. Examples of this environment include a university distance learning service and an enterprise streaming service. In this kind of environment, the media contents are streamed to many clients distributed over several campuses or branches in the Internet.

Before we proceed, we clarify the differences between a P2P file-sharing system and a P2P media streaming system [41]. In file-sharing systems such as Gnutella [26] and Kazaa [27], a client first downloads the entire file before using it. The shared files are typically small (a few Mbytes) and take a relatively short time to download. A file is stored entirely by one peer and hence, a requesting peer needs to establish only one connection to download the file. There are no timing constraints on downloading the fragments of the file. Rather, the total download time is more important. This means that the system can tolerate inter-packet delays. In media streaming systems, a client overlaps downloading with the consumption of the file. It uses one part while downloading another to be used in the immediate future. The files are large (on the order of Gbytes) and take a long time to stream. A large media file is expected to be stored by several peers, which requires the requesting peer to manage several connections concurrently. Finally, timing constraints are crucial to the streaming service, since a packet arriving after its scheduled playback time is useless.
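The timing constraint that distinguishes streaming from file sharing can be made concrete with a small sketch (the function names and the fixed-length-segment model are our illustrative assumptions, not the paper's protocol): once playback starts after an initial buffering delay, each segment has an absolute deadline, and data arriving later is useless.

```python
def playback_deadline(segment_index: int, start_time: float,
                      buffering_delay: float, segment_duration: float) -> float:
    """Absolute time by which segment i must fully arrive to be playable."""
    return start_time + buffering_delay + segment_index * segment_duration

def is_useful(arrival_time: float, deadline: float) -> bool:
    # A packet arriving after its scheduled playback time is useless.
    return arrival_time <= deadline

# Request issued at t = 0, playback starts after 10 s of buffering, 1-s segments.
d0 = playback_deadline(0, start_time=0.0, buffering_delay=10.0, segment_duration=1.0)
print(is_useful(9.5, d0), is_useful(10.5, d0))  # True False
```

Note how the buffering delay pushes every deadline back uniformly, which is exactly why the initial buffering studied later absorbs transient peer failures.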

There are several approaches that can be used to stream media to the clients in the target environment. We start by briefly describing the current approaches in the literature. The objective is to highlight the key ideas and limitations of each approach and to position our proposed approach in the appropriate context within the global picture. We can roughly categorize the current approaches into two categories: unicast-based and multicast-based.

Unicast-based approaches. In these approaches, a unicast stream is established for every client. Roughly, there are three approaches that use unicast for on-demand streaming: centralized, proxy, and content distribution networks (CDNs).

Centralized. The straightforward centralized approach (Fig. 1) is to deploy a powerful server with a high-bandwidth connection to the Internet. This approach is easy to deploy and manage. However, the scalability and reliability concerns are obvious. The reliability concern arises from the fact that only one entity feeds all clients; i.e., there is a single point of failure. The scalability is not on a par with the requirements of a media distribution service that spans a large population of potential users, since adding more users requires adding a commensurate amount of resources to the supplying server. There are two other critical, but less obvious, disadvantages of the centralized approach: high cost and load on the backbone network. To appreciate the cost issue, consider, for instance, a streaming server connected to the Internet through a T3 link (∼45 Mb/s), which is a decent yet expensive link. This server would be able to support at most 45 concurrent users requesting media files recorded at 1 Mb/s, assuming that the CPU and I/O can keep up. Since all clients have to go to the server for all requests, much traffic has to travel through the wide-area network. This adds to the cost of streaming and increases the load on the backbone network. In addition, when the traffic travels through many network hops, it is susceptible to higher delay variations and packet losses due to possible congestion in the Internet.
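The back-of-envelope capacity arithmetic above can be expressed as a one-line calculation (the function name is ours; link bandwidth is assumed to be the only bottleneck, ignoring CPU and I/O):

```python
def max_concurrent_streams(link_mbps: float, stream_mbps: float) -> int:
    """Number of full-rate streams a link can carry concurrently,
    when the link is the only bottleneck."""
    return int(link_mbps // stream_mbps)

# A T3 link (~45 Mb/s) serving media files recorded at 1 Mb/s:
print(max_concurrent_streams(45, 1))  # 45 concurrent clients at most
```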

Proxy. In the proxy approach [13], [17], [36], [39], proxy servers are deployed near the client domains (Fig. 2). Since movies are large in size, the proxy may be able to cache only a few movies in their entirety. A number of caching techniques have been proposed to enable the proxy to cache a fraction of each movie, so that more movies can be cached. In prefix caching [36], the proxy stores the first few frames of the movie, allowing for short startup delays. In staging caching [39], the proxy stores the bursty portions of the frames and leaves the smoother parts on the central server. This alleviates the stringent bandwidth requirements on the WAN links. A non-contiguous selection of intermediate frames can also be cached [17], which facilitates control functions such as fast forward and rewind. The proxy approach and its variations save WAN bandwidth and are expected to yield short startup delay and small jitter. On the negative side, this approach requires deploying and managing proxies at many locations. While deploying proxies increases the overall system capacity, it multiplies the cost. The capacity is still limited by the aggregate resources of the proxies. This shifts the bottleneck from one central point to a “few” distributed points, but does not eliminate it.

Content distribution network. The third unicast approach employs a third party for delivering the contents to the clients. This third party is known as a content delivery network (CDN). Content delivery networks, such as Akamai and Digital Island, deploy thousands of servers at the edge of the Internet (see Fig. 3). Akamai, for instance, deploys more than 10,000 servers [1]. These servers (also called caches) are installed at many points of presence (POPs) of major ISPs such as AT&T and Sprint. The idea is to keep the contents close to the clients, so that traffic traverses fewer network hops. This reduces the load on the backbone network and results in a better service in terms of shorter delay and smaller loss rate. The CDN caches the contents at many servers and redirects a client to the most suitable server. Proprietary protocols are used to distribute contents over servers, monitor the current traffic situation over the Internet, and direct clients to servers. Cost-effectiveness is a major concern in this approach, especially for distributing large files such as movies: the CDN operator charges the content provider for every megabyte served. This delivery cost might be acceptable for relatively small files such as web pages with some images. However, it would make the streaming service costly for the target environment, because media files are typically large.

The multicast approaches achieve better resource utilization by serving multiple clients using the same stream. The basic idea is to establish a multicast session to which clients subscribe. This is done by creating multicast distribution trees. Multicast approaches are more natural for live streaming, in which clients are synchronized: all clients receive the same portions of the stream at the same time. To cope with the asynchronous nature of the on-demand service, several techniques have been proposed. One of the key ideas in adapting multicast to on-demand service is patching and its variations [12], [35]. A good comparison is given in [16]. In patching (also known as tapping), a new client arriving within a threshold is allowed to join an on-going multicast session. In addition, the client establishes a unicast connection with the server to “patch”, i.e., get the missed part of the file. The two streams run at the full play rate. The patch stream terminates when the client gets the missed part. Patching techniques may require the client to tune in to multiple streams during the patching period. This means that the client has to have an inbound bandwidth of at least double the streaming rate. This is quite a stringent requirement for the limited-capacity peers in the target environment.
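The bandwidth implication of patching can be sketched as follows (the function name is hypothetical; it assumes, as described above, that the patch stream and the multicast stream both run at the full play rate):

```python
def patching_requirements(arrival_offset: float, stream_rate_mbps: float):
    """For a client joining `arrival_offset` seconds after a multicast
    session started: the length of the unicast patch stream and the peak
    inbound bandwidth the client must sustain while patching."""
    patch_duration = arrival_offset       # missed prefix, fetched at play rate
    peak_inbound = 2 * stream_rate_mbps   # multicast + patch run concurrently
    return patch_duration, peak_inbound

# A client joining 30 s late into a 1 Mb/s session:
print(patching_requirements(30.0, 1.0))  # (30.0, 2.0)
```

The doubled inbound requirement during the patch window is exactly what makes patching hard for the limited-capacity peers in the target environment.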

Multicast distribution trees are either created at the network level (Fig. 4) or at the application level (Fig. 5). The network-level multicast establishes a tree over the internal routers with the clients as the leaves of the tree. While network-level multicast is efficient, it is not widely deployed. For the target environment (e.g., distance learning), some of the intermediate network domains may not support multicast. Therefore, currently, network-level multicast is not a feasible solution for delivering contents to all the clients.

Application-level multicast techniques, such as NICE [2], Narada [7], and Zigzag [38], among others, construct the distribution trees over the end systems. The algorithms used to construct the distribution tree (or mesh, in the case of Narada) differ from one technique to another. Building the tree over the end systems achieves deployability in the current Internet. However, it introduces another problem: it may overload some end systems beyond their capacities. An end system in the tree may become a parent of several other end systems. For example, in Fig. 5, peer P1 is the parent of peers P2 and P3. Hence, P1 should be able to provide the stream to both of them. This assumes that P1 can (and is willing to) support multiples of the streaming rate. In the target environment, nodes typically have limited capacity, especially in upstream bandwidth. In many cases, nodes cannot even provide the full stream rate to another node.

We propose a new peer-to-peer (P2P) media distribution architecture that can support a large number of clients at a low overall system cost. The key idea of the architecture is that end systems (called peers hereafter) share some of their resources with the system. As peers contribute resources to the system, the overall system capacity increases and more clients can be served. As shown in Fig. 6, most of the requesting peers will be served using resources contributed by other peers. The proposed architecture employs several novel techniques to avoid the limitations of the current approaches. Specifically, it has techniques to:

  • (i) Use the often-underutilized peer resources, which makes the proposed architecture both deployable and cost-effective. It is deployable because it does not need any support from the underlying network: all work is done at the peers. Since the architecture neither needs new hardware to be deployed nor requires powerful servers, it is highly cost-effective.

  • (ii) Aggregate contributions from multiple peers to serve a requesting peer. This means that a single supplying peer serves only a fraction of the full request, so supplying peers are not overloaded. Moreover, the requesting peer is not required to have extra inbound bandwidth to get full-quality streaming.

  • (iii) Organize peers in a network-aware fashion, in which nearby peers are grouped into a logical entity called a cluster. This organization of peers is validated by statistics collected and analyzed from real Internet data. The main benefit of the network-aware peer organization is that it allows for developing efficient searching (to locate nearby suppliers) and dispersion (to disseminate new files into the system) algorithms. Network-aware searching and dispersion result in two desirable effects: (1) reduction of the load on the underlying network, since the traffic traverses fewer hops, and (2) better streaming service, because the delay is shorter and less variable.

  • (iv) Make good use of peer heterogeneity [34]. The architecture assigns relatively more work to the powerful peers. Specifically, powerful peers help in the searching and the dispersion algorithms. This special assignment makes the proposed architecture not purely P2P. Therefore, in the rest of the paper we call it the hybrid architecture.
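As an illustration of point (ii), splitting a full streaming rate across limited-capacity suppliers might look like the following greedy sketch (the function name and the greedy policy are our illustrative assumptions; the paper's actual assignment also accounts for peer heterogeneity):

```python
def assign_rates(full_rate: float, offered: list[float]) -> list[float]:
    """Split a full streaming rate across supplying peers, never asking a
    peer for more than it offers. Returns per-peer rates summing to
    full_rate, or raises if the aggregate capacity is insufficient."""
    if sum(offered) < full_rate:
        raise ValueError("not enough aggregate capacity")
    rates, remaining = [], full_rate
    for capacity in offered:
        rate = min(capacity, remaining)  # cap each peer at what it offers
        rates.append(rate)
        remaining -= rate
    return rates

# Three limited-capacity peers jointly supply a 1.0 Mb/s stream:
print(assign_rates(1.0, [0.5, 0.25, 0.5]))  # [0.5, 0.25, 0.25]
```

No single supplier carries the whole stream, and the requester still receives exactly the full rate over its ordinary inbound link.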


The P2P architecture has the potential to provide the desired large-scale media distribution service, especially for the environments considered in this paper (Fig. 1). These environments assume that peers can be asked/configured to share resources. For example, a university distance learning service may configure peers at remote campuses to share storage and bandwidth. This is not a concern, since these peers are owned by the university. The architecture may also be extended to provide a commercial service. However, in this case peer rationality or self-interest should be considered, since peers may not voluntarily contribute resources to the system. This requires developing economic incentive mechanisms to properly motivate peers. One such mechanism is the revenue sharing proposed in [11]. In the revenue-sharing mechanism, the provider shares part of the revenue from serving clients with the peers who helped in serving those clients.

The rest of this paper is organized as follows. Section 2 provides an overview of the proposed architecture. The network-aware organization of peers is detailed in Section 3. The searching and the dispersion algorithms are presented in Section 4. The simulation study is presented in Section 5. Section 6 summarizes related research effort. Section 7 concludes and proposes future extensions for this research.

Section snippets

The hybrid architecture for media streaming

This section provides an overview of the proposed architecture. It first identifies all entities in the system. Then, it explains how the streaming sessions are established and the effect of peer failures on the quality of playback.

Organization of peers

Most of the current P2P file-sharing systems do not consider network locality in their protocols. For example, in Gnutella [26], a joining peer connects itself to a few of the currently active peers in the network. These active peers are obtained from one of the known Gnutella hosts such as gnutellahosts.com. The joining peer and its neighbors can be many network hops away from one another. As another example, in Kazaa [27], a joining peer is assigned a randomly chosen super node. In both
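The contrast with such locality-oblivious systems can be illustrated with a prefix-based clustering sketch (the prefixes, function, and data structure are hypothetical illustrations: the idea is that peers whose addresses fall under the same routing prefix form one network cluster, instead of being paired with random, distant neighbors):

```python
import ipaddress

# Hypothetical table of known routing prefixes, each defining one cluster.
PREFIXES = [ipaddress.ip_network(p) for p in ("128.10.0.0/16", "192.0.2.0/24")]

def cluster_of(peer_ip: str):
    """Return the longest matching known prefix (the peer's network
    cluster), or None if the peer matches no known prefix."""
    addr = ipaddress.ip_address(peer_ip)
    matches = [net for net in PREFIXES if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)

print(cluster_of("128.10.3.7"))   # 128.10.0.0/16
print(cluster_of("203.0.113.9"))  # None (no known prefix)
```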

Searching

As discussed in Section 3, peers in the system are organized in a network-aware fashion. This organization enables the searching algorithm to locate nearby peers who have segments of the requested media file. The proposed cluster-based searching algorithm can be summarized in the following steps (consider Fig. 13):

  • 1. The requesting peer sends a lookup request to its own network-cluster super peer (NCSP). The NCSP has an index of the files available in the network cluster and their locations. The
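Step 1 can be sketched as a simple index kept at the super peer (the class and its data structures are our illustrative assumptions; the paper does not prescribe a concrete representation):

```python
class NCSP:
    """Network-cluster super peer: indexes which peers in the cluster
    hold which segments of each media file."""

    def __init__(self):
        # file id -> (peer id -> set of segment numbers held by that peer)
        self.index: dict[str, dict[str, set[int]]] = {}

    def register(self, file_id: str, peer: str, segments: set[int]) -> None:
        self.index.setdefault(file_id, {}).setdefault(peer, set()).update(segments)

    def lookup(self, file_id: str) -> dict[str, set[int]]:
        """Answer a peer's lookup request with in-cluster suppliers."""
        return self.index.get(file_id, {})

ncsp = NCSP()
ncsp.register("lecture-01", "peerA", {0, 1, 2})
ncsp.register("lecture-01", "peerB", {2, 3})
print(sorted(ncsp.lookup("lecture-01")))  # ['peerA', 'peerB']
```

Because the NCSP only indexes its own cluster, a successful lookup yields nearby suppliers, which is the point of the network-aware organization.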

Evaluation

We evaluate the performance of the proposed architecture through extensive simulation experiments. We use the Network Simulator ns-2 [20] in the simulation. Three sets of experiments are presented. The first set of experiments (Section 5.1) addresses the system-wide performance parameters such as capacity and waiting time as well as how peers’ levels of cooperation affect these parameters. The second set of experiments (Section 5.2) focuses on the client-side performance parameters such as the

Related work

We summarize related work in both areas of P2P and media streaming systems.

Conclusions and future work

We presented a hybrid architecture for on-demand media streaming that can serve many clients in a cost effective manner. We presented the details of the architecture and showed how it can be deployed over the current Internet. Specifically, we presented the streaming protocol used by a participating peer to request a media file from the system; a cluster-based dispersion algorithm, which efficiently disseminates the media files into the system; and a cluster-based searching algorithm to locate

Acknowledgements

We are grateful to Stefan Saroiu of the University of Washington for sharing the Gnutella data with us. We would like to thank the editor and the anonymous reviewers for their valuable suggestions and detailed comments. We thank Ahsan Habib and Leszek Lilien for their valuable comments. This research is sponsored in part by the National Science Foundation grants ANI-0219110, CCR-0001788, and CCR-9895742.

Mohamed M. Hefeeda is a Ph.D. candidate with the Department of Computer Sciences at Purdue University, West Lafayette. He received the M.S. degree in computer science and engineering from University of Connecticut, Storrs in 2001, and the B.Sc. degree in Electronics Engineering from ElMansoura University, Egypt in 1994. His research interests include computer networks, peer-to-peer systems, multimedia networking, network economics, and network security. He is a student member of the IEEE Computer Society and the ACM SIGCOMM.

References (43)

  • Available from Akamai home page...
  • S. Banerjee, B. Bhattacharjee, C. Kommareddy, G. Varghese, Scalable application layer multicast, in: Proceedings of ACM...
  • P. Barford, J. Cai, J. Gast, Cache placement methods based on client demand clustering, in: Proceedings of IEEE...
  • A. Bestavros, S. Mehrotra, DNS-based Internet client clustering and characterization, in: Proceedings of the 4th IEEE...
  • K. Calvert et al., Modeling Internet topology, IEEE Communications Magazine (1997)
  • E. Cohen, S. Shenker, Replication strategies in unstructured peer-to-peer networks, in: Proceedings of ACM SIGCOMM’02,...
  • Y. Chu et al., A case for end system multicast, IEEE Journal on Selected Areas in Communications (2002)
  • D.E. Comer, Internetworking with TCP/IP: Principles, Protocols, and Architectures (2000)
  • F. Dabek, M. Kaashoek, D. Karger, R. Morris, I. Stoica, H. Balakrishnan, Building peer-to-peer systems with Chord, a...
  • H. Deshpande, M. Bawa, H. Garcia-Molina, Streaming live media over peer-to-peer network, Technical Report, Stanford...
  • M. Hefeeda, A. Habib, B. Bhargava, Cost-profit analysis of a peer-to-peer media streaming architecture, CERIAS TR...
  • K. Hua, Y. Cai, S. Sheu, Patching: A multicast technique for true video on-demand services, in: Proceedings of ACM...
  • S. Jin, A. Bestavros, A. Iyengar, Accelerating Internet streaming media delivery using network-aware partial caching,...
  • B. Krishnamurthy, J. Wang, On network-aware clustering of web clients, in: Proceedings of ACM SIGCOMM’00, Stockholm,...
  • J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer,...
  • A. Mahanti et al., Scalable on-demand media streaming with packet loss recovery, IEEE/ACM Transactions on Networking (2003)
  • Z. Miao, A. Ortega, Proxy caching for efficient video services over the Internet, in: Proceedings of 9th International...
  • T. Nguyen, A. Zakhor, Distributed video streaming over Internet, in: Proceedings of Multimedia Computing and Networking...
  • S. Nilsson, G. Karlsson, Fast address lookup for Internet routers, in: Proceedings of Algorithms and Experiments...
  • The network simulator ns-2, Available from...
  • University of Oregon Route Views Project, Available from...

Bharat K. Bhargava has been a professor in the Department of Computer Sciences and the Department of Electrical & Computer Engineering at Purdue University since 1984. Professor Bhargava conducts research on security issues in mobile and ad hoc networks. This involves host authentication and key management, secure routing and dealing with malicious hosts, adaptability to attacks, and experimental studies. Related research is in formalizing evidence, trust, and fraud. Applications in e-commerce and transportation security are being tested in a prototype system. He has proposed schemes to identify vulnerabilities in systems and networks, and to assess threats to large organizations. He has developed techniques to avoid threats that can lead to operational failures. The research has direct impact on nuclear waste transport, bio-security, disaster management, and homeland security. These ideas and scientific principles are being applied to the building of peer-to-peer systems, cellular-assisted mobile ad hoc networks, and the monitoring of QoS-enabled network domains. He serves on six editorial boards of international journals (Transactions on Mobile Computing, Wireless Communications and Mobile Computing, International Journal of Computers and Applications, Multimedia Tools and Applications, International Journal of Cooperative Information Systems, Journal of System Integration).

His research group consists of nine Ph.D. students and two postdoctoral researchers. He has six NSF-funded projects. In addition, DARPA, IBM, Motorola, and Cisco provide contracts and gift funds.

Professor Bhargava was the chairman of the IEEE Symposium on Reliable and Distributed Systems held at Purdue in October 1998. At the 1988 IEEE Data Engineering Conference, he and John Riedl received the best paper award for their work on “A Model for Adaptable Systems for Transaction Processing.” Professor Bhargava is a Fellow of the Institute of Electrical and Electronics Engineers and of the Institute of Electronics and Telecommunication Engineers. He has been awarded the charter Gold Core Member distinction by the IEEE Computer Society for his distinguished service. He received Outstanding Instructor Awards from the Purdue chapter of the ACM in 1996 and 1998. In 1999, he received the IEEE Technical Achievement Award for the major impact of his decade-long contributions to the foundations of adaptability in communication and distributed systems. In 2003, he was inducted into Purdue’s Book of Great Teachers.

    David K.Y. Yau received the B.Sc. (first class honors) degree from the Chinese University of Hong Kong, and the M.S. and Ph.D. degrees from the University of Texas at Austin, all in computer sciences. From 1989 to 1990, he was with the Systems and Technology group of Citibank, NA. He was the recipient of an IBM graduate fellowship, and is currently an Associate Professor of Computer Sciences at Purdue University, West Lafayette, IN. He received an NSF CAREER award in 1999, for research on network and operating system architectures and algorithms for quality of service provisioning. His other research interests are in network security, value-added services routers, and mobile wireless networking.
