Elsevier

Parallel Computing

Volume 27, Issue 6, May 2001, Pages 823-846
A scalable off-line MPEG-2 video encoding scheme using a multiprocessor system

https://doi.org/10.1016/S0167-8191(00)00099-5

Abstract

Video compression plays a central role in a vast number of multimedia applications, but its computational requirements overwhelm the capabilities of any present single-processor system. In this paper, we explore the use of parallel machines like the Intel Paragon to compress MPEG-2 video sequences. The motivation is to build a production-based compression facility by exploiting the potential power of the available machine. Given a video sequence or a set of sequences, the aim of the parallel encoder is to achieve the maximum possible encoding rate. A collective scheduling scheme for the processors, I/O nodes, and disks is proposed that provides fast I/O, minimizes the idle times of processors, and enables the system to work in a highly balanced fashion. An efficient data layout scheme for storing video frames is also proposed in order for the I/O to sustain the desired data transfer rates. Using a small percentage of processors as the I/O nodes results in an efficient utilization of the system resources. As shown by experimental and analytical results, the encoding scheme is scalable and higher performance can be achieved with larger machines. The performance of the proposed scheme can be many times the real-time encoding rates with Standard Interface Format (SIF) and CCIR-601 video sequences. The experimental results indicate about a two-fold gain in performance compared to previous studies. Such a system is useful for the conversion of analog videos to compressed digital form in large studios, digital libraries, and other multimedia database environments. The proposed scheme partitions the system into groups of compute nodes and I/O nodes, and can be easily extended to other MIMD machines or a set of networked workstations.

Introduction

The realm of high-performance parallel and distributed computing is expanding beyond the traditional scientific community because the current revolution of information technology has created a vast number of commercial applications that require massive computing power [5]. The need for high-performance computing in commercial applications is being recognized by researchers from the areas of both parallel processing and information technology. Large-scale databases, video servers, visualization tools, content-based search and retrieval, graphics rendering, etc., are typical applications that require large computing power.

Video encoding (also called compression) is another application which requires enormous computing power and thus can benefit from high-performance computing. Digitized video, which is a fundamental component of common multimedia applications, typically requires a massive amount of data to represent the audio-visual information. A few minutes of a video sequence requires gigabytes of data storage. Similarly, the amount of data for transmitting a video sequence over a network can easily overwhelm the network channels. For example, transmitting an uncompressed high-quality digital television signal (CCIR-601) would require 126 Mbps. Even for a lower-resolution signal suitable for video conferencing applications (Common Intermediate Format), the uncompressed bit rate is 36.5 Mbps [8]. These amounts become out of reach for high-definition television (HDTV) [9], which will use digital video at a much higher resolution of 1920×1152.
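The bit rates quoted above can be checked with simple arithmetic. The following is a minimal sketch, not taken from the paper: it assumes 4:2:0 chroma subsampling with 8-bit samples (12 bits per pixel on average), 30 frames/s, a 352×288 frame for the Common Intermediate Format, and 720×486 active pixels for CCIR-601. The function name `uncompressed_bit_rate` is hypothetical.

```python
def uncompressed_bit_rate(width, height, fps, bits_per_pixel=12):
    """Return the raw video bit rate in bits per second.

    bits_per_pixel=12 assumes 4:2:0 sampling with 8-bit samples
    (an assumption; the text does not state the sampling format).
    """
    return width * height * bits_per_pixel * fps


# Assumed frame sizes and frame rate; only the resulting rates appear in the text.
cif = uncompressed_bit_rate(352, 288, 30)
ccir601 = uncompressed_bit_rate(720, 486, 30)

print(f"CIF:      {cif / 1e6:.1f} Mbps")      # CIF:      36.5 Mbps
print(f"CCIR-601: {ccir601 / 1e6:.1f} Mbps")  # CCIR-601: 126.0 Mbps
```

Under the same assumptions, a 720×480 frame gives about 124.4 Mbps, so the quoted 126 Mbps is consistent with 486 active lines rather than 480.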

Storage and transmission of a huge amount of data inevitably calls for compression and decompression of digital video. Fortunately, digital video contains ample redundancies in both the spatial and temporal domains, enabling encoding algorithms to achieve a high degree of compression with little degradation in quality. The entire compression/decompression process requires a codec consisting of an encoder and a decoder. The encoder compresses the data at the transmission or storage end, while the decoder decompresses the data to reproduce the video for the viewer. Video coding algorithms are standardized to guarantee that compressed video data can be exchanged between different systems, so that a particular encoded bitstream decompresses to a unique result regardless of the decoder configuration. Two recent international standards, known as MPEG-1 and MPEG-2, have been developed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO). These standards specify the syntax (representation) for the decoding of video and accompanying audio data.

Compression is considerably more complex than decompression (which can possibly be done using a single-processor machine). The objectives of an efficient video compression technique include (assuming a defined bit rate) good visual quality evaluated through subjective and objective assessment criteria, a high compression ratio, and low complexity of the compression algorithm itself; the emphasis on any of these objectives can, of course, vary according to the target application. Compression can be aimed at real-time or non-real-time encoding environments. In the former case, compression must be achieved on-line, for example in a system in which a video stream is being generated from a source such as a video camera; a real-time encoding rate is about 30 frames/s. In the latter case, compression can be done off-line without a strict real-time rate requirement. Non-real-time compression is required in applications like digital libraries or production systems, which encode a video sequence and store it on a CD-ROM or digital versatile disk (DVD).

There are two approaches to performing video compression: hardware-based [1], [23] and software-based [2], [3], [4], [7], [10], [11], [19], [20], [22], [24]. Both approaches have their own advantages and disadvantages. A hardware approach uses a special-purpose architecture, and its advantages include the ease of use and high compression speed. However, dedicated hardware is less flexible and can become obsolete. Furthermore, hardware is often optimized for a particular coding algorithm, and cannot be used for exploring other present and future video compression standards. A software solution using general-purpose computing platforms, on the other hand, is more flexible, and thus allows algorithmic improvements. In addition, for non-real-time applications, a software implementation can produce better quality video compression by tuning various parameters and by allowing multiple passes for optimization. However, the very high computation requirements of video applications can often overwhelm a single-processor sequential computer [2], [21]. Therefore, it is natural to exploit the potentially enormous computing power offered by parallel computing systems.

In this paper, we explore the use of parallel machines like the Intel Paragon to compress MPEG-2 video sequences. The motivation is to build a production-based compression facility by exploiting the potential power of the available machine. In addition, parallel machines which are normally available for scientific computing can be exploited for new multimedia applications. The compression facility is intended to compress multiple large video sequences. The environment is not real-time, but the aim is to achieve the maximum possible encoding rate, beyond real-time speed. Such a facility is useful for the conversion of analog videos to compressed digital form in large studios, digital libraries, and other multimedia database environments. We propose schemes for data layout, efficient I/O, and load-balanced data distribution. These schemes provide fast data retrieval as well as efficient scheduling and matching of I/O and computation rates such that the entire machine operates in a highly balanced fashion without any bottlenecks. Using a very small percentage of processors as the I/O nodes results in an efficient utilization of the system. More importantly, our scheme is scalable, that is, an increase in the number of processors will result in a proportional increase in the encoding rate. As a result, larger machines will yield higher encoding rates. Specifically, given any MIMD machine configuration (that is, the number of processors, I/O nodes, and disks), our proposed method will logically configure the machine for the best possible utilization and match the I/O and encoding rates to reach the ideal performance level.

The rest of this paper is organized as follows. Section 2 provides an overview of MPEG-2, followed by Section 3, which briefly discusses the related work and gives a motivation for pursuing this research. Section 4 gives an overview of the Intel Paragon and its logical partitioning used in our encoding scheme. Section 5 includes a discussion of the proposed parallel encoder and various related issues. Section 6 presents the experimental results. Section 7 discusses the scalability of the encoder and Section 8 provides some concluding remarks.

Section snippets

Overview of MPEG-2

Video coding standards provide a common format and enable the sharing of technology among various industries. Some of the recent standards are JPEG (to compress still images for both storage and transmission applications) [13], H.261 [17] and H.263 (for video telephony and video conferencing applications at a low bit rate), and MPEG-1 (for applications requiring up to 1.5 Mbps bit rate) [6], [14]. The Moving Picture Experts Group of ISO has standardized the second international standard (MPEG-2

Related work

The problem of software-based video encoding using parallel processing is non-trivial, and cannot be solved by simply replicating multiple sequential encoders on different processors, because the local memory of a single processor is usually not large enough to hold more than a few frames and thus an efficient I/O methodology is required to bring the data in and take the compressed data out of processors. Since the video signal can be viewed as a three-dimensional (3D) signal, that is, two

A logical partitioning of the Intel Paragon

The parallel machine we have used in our study is the Intel Paragon XP/S parallel computer. The architecture of the Paragon, which is a distributed-memory machine, has been documented well in various publications [12]. Here, we describe some details of its architecture that are relevant to our work. We also propose a logical partitioning of its processors that is the basis of our scalable MPEG-2 video encoding scheme.

The Paragon consists of compute, I/O, and service partitions (see Fig. 1),

The parallel encoder

The objectives of our parallel encoder are:

  • To achieve the maximum possible encoding rate, given any machine configuration (that is, the number of processors, I/O nodes, and disks).

  • To achieve complete scalability, in that the encoding rate should increase linearly with the number of processors without reaching a saturation point in performance.

The objectives can be met if all of the processors are kept busy in reading the data, performing the encoding, and writing the coded bit

Experimental results

The SIF (360×240) video sequences flower garden, table tennis and football were used as test sequences. Two CCIR-601 (720×480) sequences, susie and football, were also used for our experiments. The sequences were repeated to obtain about 4000 frames for SIF and 3000 frames for CCIR-601. The size of the GOP was 12 frames with an I–P frame distance of 3. For motion estimation, the 2D-logarithmic search was used, with search windows of ±11 for P frames and ±10 for B frames. The sequential encoder
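The GOP configuration described above (12 frames, I–P distance of 3) can be illustrated with a short sketch. This is not the authors' code: the function name `gop_frame_types` is hypothetical, and the sketch assumes frames are listed in display order with each GOP starting on an I frame, which the excerpt does not state explicitly.

```python
def gop_frame_types(gop_size=12, ip_distance=3):
    """Return the frame-type pattern of one GOP in display order.

    Assumption: the GOP starts with an I frame, every ip_distance-th
    frame after it is a P frame, and the frames in between are B frames.
    """
    types = []
    for i in range(gop_size):
        if i == 0:
            types.append("I")
        elif i % ip_distance == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


pattern = gop_frame_types()
print(pattern)  # IBBPBBPBBPBB
```

With these parameters, each GOP holds one I frame, three P frames, and eight B frames.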

Scalability of the scheme

First, we will determine the largest possible group size or how many compute nodes can be supported by a single disk without any significant waiting time on compute nodes. Assuming the buffer size b to be equal to the size of 3 frames, Eq. (2) can be written in terms of reading and encoding times as:

$$T_{\mathrm{open}} + m \times (T_{\mathrm{read}} + T_{\mathrm{send}}) \leqslant T_{\mathrm{enc}} + T_{\mathrm{recv}},$$

where $T_{\mathrm{read}}$, $T_{\mathrm{send}}$, $T_{\mathrm{recv}}$ and $T_{\mathrm{enc}}$ are the times to read, send, receive and encode the data of size b bytes, respectively. The overhead of receiving the request
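The inequality above can be rearranged to bound the group size. The sketch below is an illustration only: it reads m as the number of compute nodes served by one disk, which the excerpt suggests but does not define, and all timing values are hypothetical placeholders rather than measurements from the paper. The function name `max_group_size` is also an assumption.

```python
import math


def max_group_size(t_open, t_read, t_send, t_enc, t_recv):
    """Largest integer m with t_open + m*(t_read + t_send) <= t_enc + t_recv.

    All times must share one unit (for example, milliseconds per buffer
    of b bytes). Returns 0 if even one node cannot be served in time.
    """
    per_node = t_read + t_send
    if per_node <= 0:
        raise ValueError("t_read + t_send must be positive")
    budget = t_enc + t_recv - t_open
    if budget < 0:
        return 0
    return math.floor(budget / per_node)


# Hypothetical timings in milliseconds, chosen only to show the calculation.
m = max_group_size(t_open=50, t_read=150, t_send=50, t_enc=2100, t_recv=50)
print(m)  # 10
```

Because encoding time dominates in this rearrangement, a slower encoder (larger encoding time per buffer) lets one disk feed more compute nodes, while slower reads or sends shrink the group.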

Conclusions

We have proposed a parallel MPEG-2 encoder that has been implemented on the Intel Paragon. The proposed scheme partitions the system into groups of compute nodes and I/O nodes, and can be easily extended to other MIMD machines or a set of networked workstations. The proposed encoder optimizes the system performance by balancing the computation, I/O, and the disk usage. The proposed encoder has achieved the highest level of performance reported for such a problem, with a frame rate of 71

References (24)

  • S.M. Akramullah et al.

    A data-parallel approach for real-time MPEG-2 video encoding

    J. Parallel Distrib. Comput.

    (1995)
  • P. Moulin et al.

    Video signal processing and coding on data-parallel computers

    Digital Signal Process.

    (1995)
  • T. Akiyama

    MPEG2 video codec using image compression DSP

    IEEE Trans. Consumer Electron.

    (1994)
  • S.M. Akramullah et al.

    Performance of software-based MPEG-2 video encoder on parallel and distributed systems

    IEEE Trans. Circuits Syst. Video Tech.

    (1997)
  • A.C. Downton

    Generalized approach to parallelising image sequence coding algorithms

    IEE Proc. Vis. Image Signal Process.

    (1994)
  • B. Furht

    Multimedia Tools and Applications

    (1996)
  • D.J. Le Gall

    MPEG: A video compression standard for multimedia applications

    Commun. ACM

    (1991)
  • K.L. Gong, L.A. Rowe, Parallel MPEG-1 video encoding, in: Proceedings of 1994 Picture Coding Symposium, Sacramento, CA,...
  • B.G. Haskell et al.

    Digital video: An introduction to MPEG-2 digital multimedia standards series

    (1997)
  • R. Hopkins

    Digital terrestrial HDTV for North America: the grand alliance HDTV system

    IEEE Trans. Consumer Electron.

    (1994)
  • Z. Huang et al.

    Distributed load balancing schemes for parallel video encoding system

    IEICE Trans. Fundam.

    (1994)
  • H.-C. Huang et al.

    New generation of real-time software-based video codec: Popular video coder II (PVC-II)

    Proc. SPIE

    (1995)

    This research was supported by HKUST 6030/97E and HKUST6228/99E.
