Skip to main content
Log in

Architecture for parallel marker-free variable length streams decoding

  • Original Research Paper
  • Published:
Journal of Real-Time Image Processing Aims and scope Submit manuscript

Abstract

Due to throughput requirements above 1 gigapixel/sec for the real-time compression of modern image and video data streams, parallelism for encoding and decoding is inevitable. To achieve parallel decoding, a well-established technique is to insert markers into the variable length code (VLC) stream. By locating markers, it is then possible to extract the sub-streams that are, in turn, decoded in parallel. The use of markers adversely affects compression especially when a high degree of parallelism is required. In this paper, we propose an architecture of a marker-free parallel decoding approach of VLC streams. Instead of multiple local entropy decoders, the proposed architecture is based on using a single parallel entropy decoder in conjunction with a novel format to construct the VLC stream. The approach runs at high clock rates supporting parallelism to a high number of decoders. A synthesized clock frequency well above 110 MHz is achieved for up to 20 decoders on a medium-sized FPGA.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22

Similar content being viewed by others

References

  1. Recommendation ITU-R BT.2020-2: Parameter values for ultra-high definition television systems for production and international programme exchange (2015)

  2. ITU-T Recommendation H.264 : Advanced video coding for generic audiovisual services. http://www.itu.int/rec/T-REC-H.264-200711-I/en (2007)

  3. ITU-T Recommendation ITU-T H.265: High efficiency video coding. http://handle.itu.int/11.1002/1000/11885 (2013)

  4. Meenderinck, C., Azevedo, A., Juurlink, B., Alvarez Mesa, M., Ramirez, A.: Parallel scalability of video decoders. J. Signal Process. Syst. 57(2), 173–194 (2009). doi:10.1007/s11265-008-0256-9

    Article  Google Scholar 

  5. Wu, N., Wen, M., Ren, H.S.J., Zhang, C.: A parallel H.264 encoder with CUDA: mapping and evaluation. In: 2012 IEEE 18th International Conference on Parallel and Distributed Systems (ICPADS), pp. 276–283 (2012). doi:10.1109/ICPADS.2012.46

  6. Lu, Y., Zhang, Q., Wei, B.: Real-time CPU based H.265/HEVC encoding solution with x86 platform technology. In: 2015 International Conference on Computing, Networking and Communications (ICNC), pp. 418–421 (2015). doi:10.1109/ICCNC.2015.7069380

  7. Saponara, S., Martina, M., Casula, M., Fanucci, L., Masera, G.: Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding. Microprocess. Microsyst. Embed. Hardw. Des. 34(7–8), 316–328 (2010). doi:10.1016/j.micpro.2010.06.003

    Article  Google Scholar 

  8. Mei-Hua, X., Yu-Lan, C., Feng, R., Zhang-Jin, C.: Optimizing design and FPGA implementation for CABAC decoder. In: 2007 International Symposium on High Density packaging and Microsystem Integration, pp. 1–5 (2007). doi:10.1109/HDP.2007.4283645

  9. Nunez, J.L., Chouliaras, V.A.: High-performance arithmetic coding VLSI macro for the H264 video compression standard. IEEE Trans. Consum. Electron. 51(1), 144–151 (2005). doi:10.1109/TCE.2005.1405712

    Article  Google Scholar 

  10. Yang, Y.C., Guo, J.I.: High-throughput H.264/AVC high-profile CABAC decoder for HDTV applications. IEEE Trans. Circuits Syst. Video Technol. 19(9), 1395–1399 (2009). doi:10.1109/TCSVT.2009.2020340

    Article  Google Scholar 

  11. Sze, V., Chandrakasan, A.P.: Joint algorithm-architecture optimization of CABAC. J. Signal Process. Syst. 69(3), 239–252 (2012). doi:10.1007/s11265-012-0678-2

    Article  Google Scholar 

  12. Liao, T.T., Shen, C.A., Tseng, Y.H.: The algorithm and VLSI architecture of a high efficient motion estimation with adaptive search range for HEVC systems. J. Real-Time Image Process. (2017). doi:10.1007/s11554-017-0697-0

    Article  Google Scholar 

  13. Lung, C.Y., Shen, C.A.: Design and implementation of a highly efficient fractional motion estimation for the HEVC encoder. J. Real-Time Image Process. (2016). doi:10.1007/s11554-016-0663-2

    Article  Google Scholar 

  14. Varma, K.C.R.C., Kumar, M.V.P., Mahapatra, S.: Search range reduction for uni-prediction and bi-prediction in HEVC. J. Real-Time Image Process. (2016). doi:10.1007/s11554-016-0636-5

    Article  Google Scholar 

  15. Sze, V., Budagavi, M.: Parallelization of cabac transform coefficient coding for hevc. In: 2012 Picture Coding Symposium, pp. 509–512 (2012). doi:10.1109/PCS.2012.6213266

  16. Ono, F., Rucklidge, W., Arps, R., Constantinescu, C.: JBIG2—the ultimate bi-level image coding standard. In: ICIP, pp. 140–143 (2000). http://dblp.uni-trier.de/db/conf/icip/icip2000.html#OnoRAC00

  17. Wallace, G.K.: The JPEG still picture compression standard. Commun. ACM 34(4), 30–44 (1991)

    Article  Google Scholar 

  18. Weinberger, M.J., Seroussi, G., Sapiro, G.: The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS. IEEE Trans. Image Process. 9(8), 1309–1324 (2000)

    Article  Google Scholar 

  19. Singh, S., Bhasin, A., Saha, K.: Parallelization of variable length decoding. http://www.google.com/patents/US8520958 (2013). US Patent 8,520,958

  20. Korodi, G., He, D., Yang, E., Martin-Cocher, G.: Methods and devices for load balancing in parallel entropy coding and decoding. http://www.google.com/patents/US8730071 (2014). US Patent 8,730,071

  21. Ebrahimi, T., Horne, C.: MPEG-4 natural video coding—an overview. In: Signal Processing: Image Communication, vol. 14. Elsevier, Amsterdam, Netherlands, pp. 365–385 (2000)

  22. ITU: ISO/IEC 10918-1: 1993(E) CCIT Recommendation T.81. http://www.w3.org/Graphics/JPEG/itu-t81.pdf (1993)

  23. Moussalli, R., Najjar, W.A., Luo, X., Khan, A.: A high throughput no-stall Golomb-rice hardware decoder. In: 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2013, Seattle, WA, USA, 28–30 April 2013, pp. 65–72. IEEE Computer Society (2013). doi:10.1109/FCCM.2013.9

  24. Altera: White paper: video and image processing design using fpgas systems. Tech. Rep. WP-VIDEO0306-1.1, Altera Corporation (2007)

  25. Bailey, D.: Design for Embedded Image Processing on FPGAs. Wiley, New York (2011). https://books.google.de/books?id=ynSYGQdsgIAC

    Book  Google Scholar 

  26. Baroud, Y., Lê, N., Wang, Z., Kieß, S., Najmabadi, S.M., Simon, S.: A parallel codec architecture for marker-free variable length code streams. In: Proceedings of the 10th HiPEAC Workshop on Reconfigurable Computing (WRC) (2016)

  27. Baroud, Y., Velarde, J.M.M., Simon, S.: Architecture for parallelizing decoding of marker-free variable length code streams. In: 2016 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), pp. 270–275 (2016). doi:10.1109/SPA.2016.7763626

  28. Fimoff, M., Laud, T., Lee, R.: Method of processing variable size blocks of data by storing numbers representing size of data blocks in a fifo. http://www.google.tl/patents/USRE41569 (2010). US Patent RE41,569

  29. Kwon, O.: Apparatus for parallel encoding/decoding of digital video signals. http://www.google.co.ug/patents/EP0720372A1?cl=en (1996). EP Patent App. EP19,940,120,951

  30. Lei, S., Sun, M.T.: An entropy coding system for digital hdtv applications. IEEE Trans. Circuits Syst. Video Technol. 1(1), 147–155 (1991). http://dblp.uni-trier.de/db/journals/tcsv/tcsv1.html#LeiS91

    Article  Google Scholar 

  31. Boliek, M., Allen, J.D., Schwarz, E.L., Gormish, M.J.: Very high speed entropy coding. In: ICIP, vol. 3 (1994)

  32. Lin, H.D., Messerschmitt, D.: Designing a high-throughput VLC decoder. I. Parallel decoding methods. IEEE Trans. Circuits Syst. Video Technol. 2(2), 197–206 (1992). doi:10.1109/76.143419

    Article  Google Scholar 

  33. Sevcenco, A.M., Lu, W.S.: Adaptive down-scaling techniques for JPEG-based low bit-rate image coding. In: 2006 IEEE International Symposium on Signal Processing and Information Technology, pp. 349–354 (2006). doi:10.1109/ISSPIT.2006.270824

  34. Lin, W., Dong, L.: Adaptive downsampling to improve image compression at low bit rates. IEEE Trans. Image Process. 15(9), 2513–2521 (2006). doi:10.1109/TIP.2006.877415

    Article  Google Scholar 

  35. Ahangar, A.I., Agarwal, R., Lakhotia, K.: Real time low complexity VLSI decoder for prefix coded images. In: 2016 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1694–1697 (2016). doi:10.1109/ISCAS.2016.7538893

  36. Lee, E.S., Lee, K.C., Son, K.J., Moon, S.P., Chang, T.G.: Multi-symbol accessing Huffman decoding method for MPEG-2 AAC. J. Electr. Eng. Technol. 4(4) (2014). doi:10.5370/JEET.2014.9.4.1411

    Article  Google Scholar 

  37. Nikara, J., Vassiliadis, S., Takala, J., Sima, M., Liuha, P.: Parallel multiple-symbol variable-length decoding. In: Werner, B. (ed.) IEEE International Conference on Computer Design, pp. 126–131. IEEE Computer Society Press, 10662 Los Vaqueros Circle, P.O. Box 3014, Los Alamitos, CA 90720-1314, Freiburg, Germany (2002). ISBN: 0-7695-1700-5

  38. Howard, P.G., Vitter, J.S.: Fast and efficient lossless image compression. In: Proceedings of the 1993 Data Compression Conference, (Snowbird), pp. 351–360 (1993)

Download references

Acknowledgements

This work is part of the project Intelligenter Optischer Sensor zur 2D/3D Objekt-Erfassung und dimensionellen Messtechnik (IOS23) which is financed by the Baden-Württemberg-Stiftung gGmbH.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yousef Baroud.

Appendices

Appendix A: bit stream construction unit functionality in C#

figure c

Appendix B: symbol distribution unit functionality in C#

figure d

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Baroud, Y., Mariños Velarde, J.M., Wang, Z. et al. Architecture for parallel marker-free variable length streams decoding. J Real-Time Image Proc 16, 2127–2146 (2019). https://doi.org/10.1007/s11554-017-0715-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11554-017-0715-2

Keywords

Navigation