Scalable Architecture for SoC Video Encoders

Kangas, Tero; Hämäläinen, Timo D.; Kuusilinna, Kimmo

doi:10.1007/s11265-006-5918-x

Tero Kangas¹,
Timo D. Hämäläinen¹ &
Kimmo Kuusilinna²

Abstract

Evolving video coding standards demand functional flexibility from implementations, not only at design time but also after fabrication. This paper presents a System-on-Chip design approach for a video encoder that offers a feasible combination of performance, scalability, programmability, area efficiency, and design-time effort. The encoder is based on a homogeneous master-slave processor architecture. Each slave encodes a part of the frame in the Single Program Multiple Data (SPMD) data-parallel model. Both shared and distributed memory architectures are presented. Design effort is reduced by identical program codes, automated assembly of software and hardware modules independent of the number and type of processors, and our flexible on-chip communication network called Heterogeneous IP Block Interconnection (HIBI). A case study implementation with two to ten simple ARM7 processors, a 32-bit HIBI bus, and non-optimized processor-independent software achieves 6 to 53 fps for QCIF. The whole encoder area ranges from 173 to 770 kgates, excluding the memories. The relation scales reasonably well to systems with more powerful processors and optimized code. The optimization of the communication network shows that with more than six slaves even a serial HIBI connection with 100 MHz speed is feasible. HIBI and the parallelization approach allow exploration and optimization of the communication at both the application and architecture layers.
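
As a rough illustration of the SPMD data-parallel model mentioned in the abstract, the sketch below divides the macroblocks of one QCIF frame (176×144 pixels, that is 11×9 macroblocks of 16×16 pixels) among a number of slaves, each of which would then run the same encoding program on its own range. The function name `partition_macroblocks` and the even, contiguous split are assumptions made for illustration only; the abstract does not state how the frame is actually divided.

```python
# Hypothetical sketch: the even, contiguous split used here is an
# assumption, not the partitioning scheme described in the paper.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # QCIF luma resolution in pixels
MB_SIZE = 16                        # macroblock edge length in pixels


def partition_macroblocks(num_slaves, width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Return one (first_mb, end_mb_exclusive) range per slave.

    Macroblocks are numbered in raster order. Any remainder is spread
    over the first slaves so that range sizes differ by at most one.
    """
    if num_slaves < 1:
        raise ValueError("need at least one slave")
    total = (width // MB_SIZE) * (height // MB_SIZE)
    base, extra = divmod(total, num_slaves)
    ranges = []
    start = 0
    for slave in range(num_slaves):
        count = base + (1 if slave < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


if __name__ == "__main__":
    for n in (1, 4, 9):
        print(n, partition_macroblocks(n))
```

Under these assumptions every slave receives nearly the same number of macroblocks, so the per-slave workload stays balanced as the slave count changes, which is the property a scalable SPMD encoder depends on.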

Author information

Authors and Affiliations

Institute of Digital and Computer Systems, Tampere University of Technology, P.O. Box 553, FI-33101, Tampere, Finland
Tero Kangas & Timo D. Hämäläinen
Nokia Research Center, Tampere, Finland
Kimmo Kuusilinna

Corresponding author

Correspondence to Tero Kangas.

Additional information

Tero Kangas, MSc ’01, Tampere University of Technology (TUT). Since 1999 he has been working as a research scientist in the Institute of Digital and Computer Systems (DCS) at TUT. Currently he is working towards his PhD degree and his main research topics are system architectures and SoC design methodologies in multimedia applications.

Kimmo Kuusilinna, PhD ’01, TUT. His main research interests include system-level design and verification, interconnection networks, and parallel memories. Currently he is working as a senior research engineer at the Nokia Research Center.

Timo D. Hämäläinen, MSc ’93, PhD ’97, TUT. He worked as a senior research scientist and project manager at TUT from 1997 to 2001 and was appointed full professor at TUT/Institute of Digital and Computer Systems in 2001. He heads the DACI research group, which focuses on three main lines: wireless local area networking and wireless sensor networks, high-performance DSP/HW based video encoding, and interconnection networks with design flow tools for heterogeneous SoC platforms.

About this article

Cite this article

Kangas, T., Hämäläinen, T.D. & Kuusilinna, K. Scalable Architecture for SoC Video Encoders. J VLSI Sign Process Syst Sign Image Video Technol 44, 79–95 (2006). https://doi.org/10.1007/s11265-006-5918-x

Published: 27 May 2006
Issue Date: August 2006
DOI: https://doi.org/10.1007/s11265-006-5918-x
