A Partitioned Memory Architecture with Prefetching for Efficient Video Encoders

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2022)

Abstract

A hardware video encoder based on recent video coding standards such as HEVC and VVC must efficiently handle a massive number of memory accesses during motion vector search. This paper first conducts a preliminary evaluation of the memory access behavior of a hardware video encoding pipeline. The evaluation suggests that the early stages of the pipeline, which access wide areas of the reference frames for the rough search, behave quite differently from the subsequent stages, which access only small areas of them for the precise search. Therefore, this paper proposes a partitioned memory architecture for the hardware video encoding pipeline. The architecture adopts a split cache structure that consists of a front-end cache and a back-end cache. The front-end cache stores shrunk reference frames and provides them to the rough search in the early stages. Normal reference frames for the precise search are provided only to the subsequent stages through the back-end cache. As a result, this structure can reduce the memory bandwidth requirement. On the other hand, the split cache structure cannot reuse the data loaded by the early stages, which increases cache misses in the subsequent stages and may violate the memory access deadlines for real-time encoding. To solve this problem, this paper also designs and implements a coding tree unit (CTU) prefetcher for the back-end cache. The CTU prefetcher loads the data used by the subsequent stages without waiting for the results of the early stages. The evaluation results show that the proposed memory system successfully reduces the cache miss rate and the deadline miss rate in the subsequent stages. As a result, the proposed memory architecture can contribute to satisfying the demands of real-time encoding while reducing energy consumption.
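The effect of the CTU prefetcher on the back-end cache can be illustrated with a toy model. The sketch below is not the paper's design or evaluation setup: the LRU replacement policy, the cache capacity, the block counts, and the assumption that the precise search for a CTU touches only one fixed group of reference blocks are all hypothetical simplifications chosen for illustration. It only shows the general idea that loading the next CTU's reference data ahead of time, without waiting for the early stages, turns demand misses into hits.

```python
from collections import OrderedDict


class LRUCache:
    """Toy back-end cache with LRU replacement (an assumption, not from the paper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _fill(self, key):
        # Insert the block as most recently used and evict the oldest if full.
        self.lines[key] = True
        self.lines.move_to_end(key)
        while len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def access(self, key):
        # Demand access issued by the precise-search stages.
        if key in self.lines:
            self.hits += 1
            self.lines.move_to_end(key)
        else:
            self.misses += 1
            self._fill(key)

    def prefetch(self, key):
        # Prefetches fill the cache but are not counted as hits or misses.
        if key not in self.lines:
            self._fill(key)


def ctu_blocks(ctu_index, blocks_per_ctu):
    # Hypothetical mapping: each CTU uses one contiguous group of reference blocks.
    base = ctu_index * blocks_per_ctu
    return range(base, base + blocks_per_ctu)


def simulate(num_ctus, blocks_per_ctu, capacity, use_prefetch):
    """Return the demand miss rate of the toy back-end cache."""
    cache = LRUCache(capacity)
    for ctu in range(num_ctus):
        if use_prefetch and ctu + 1 < num_ctus:
            # Load the next CTU's reference blocks without waiting for
            # any result from the early (rough-search) stages.
            for block in ctu_blocks(ctu + 1, blocks_per_ctu):
                cache.prefetch(block)
        for block in ctu_blocks(ctu, blocks_per_ctu):
            cache.access(block)
    return cache.misses / (cache.hits + cache.misses)


if __name__ == "__main__":
    print("no prefetch:", simulate(8, 16, 64, use_prefetch=False))
    print("prefetch:   ", simulate(8, 16, 64, use_prefetch=True))
```

In this toy setting every demand access misses without prefetching, while with prefetching only the first CTU's blocks miss. A real encoder's precise search depends on the motion vectors found by the rough search, so actual miss rates would differ from this idealized case.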

Acknowledgements

This work was partially supported by Grant-in-Aid for Scientific Research (B) No. 22H03571 and the joint research between Tohoku University and NTT Device Innovation Center, NTT Corporation.

Author information

Corresponding author

Correspondence to Masayuki Sato.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sato, M. et al. (2023). A Partitioned Memory Architecture with Prefetching for Efficient Video Encoders. In: Takizawa, H., Shen, H., Hanawa, T., Hyuk Park, J., Tian, H., Egawa, R. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2022. Lecture Notes in Computer Science, vol 13798. Springer, Cham. https://doi.org/10.1007/978-3-031-29927-8_23

  • DOI: https://doi.org/10.1007/978-3-031-29927-8_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29926-1

  • Online ISBN: 978-3-031-29927-8

  • eBook Packages: Computer Science, Computer Science (R0)
