research-article

ExtraNet: real-time extrapolated rendering for low-latency temporal supersampling

Published: 10 December 2021

Abstract

Both frame rate and latency are crucial to the performance of real-time rendering applications such as video games. Spatial supersampling methods, such as Deep Learning SuperSampling (DLSS), have proven successful at decreasing the rendering time of each frame by rendering at a lower resolution. However, temporal supersampling methods that directly aim at producing more frames on the fly are still not practically available. This is mainly due to both their own computational cost and the latency introduced by interpolating frames from the future. In this paper, we present ExtraNet, an efficient neural network that predicts accurate shading results on an extrapolated frame, to minimize both the performance overhead and the latency. With the help of the rendered auxiliary geometry buffers of the extrapolated frame, and the temporally reliable motion vectors, we train ExtraNet to perform two tasks simultaneously: irradiance in-painting for regions that cannot find historical correspondences, and accurate ghosting-free shading prediction for regions where temporal information is available. We present a robust hole-marking strategy to automate the classification of these tasks, as well as the data generation from a series of high-quality production-ready scenes. Finally, we use lightweight gated convolutions to enable fast inference. As a result, ExtraNet is able to produce plausibly extrapolated frames without easily noticeable artifacts, delivering a 1.5× to nearly 2× increase in frame rates with minimized latency in practice.
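
The split between the two tasks described in the abstract (in-painting where no history exists, shading prediction where it does) can be illustrated with a minimal sketch. This is not the paper's hole-marking strategy, which the abstract does not specify. It assumes a simplified setting: integer per-pixel motion vectors that point back into the previous frame, and a pixel counted as a hole when its history lies off-screen or when the depth found there differs by more than a threshold. The names `warp_and_mark_holes` and `depth_tol` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]                    # H x W grid of scalar values
MotionField = List[List[Tuple[int, int]]]   # per-pixel (dx, dy) into the previous frame


def warp_and_mark_holes(
    prev_color: Grid,
    prev_depth: Grid,
    cur_depth: Grid,
    motion: MotionField,
    depth_tol: float = 0.05,                # assumed threshold, not from the paper
) -> Tuple[Grid, List[List[bool]]]:
    """Backward-warp the previous frame and flag pixels without usable history.

    hole_mask[y][x] is True where a network would have to in-paint;
    elsewhere warped[y][x] holds the reprojected value from the previous frame.
    """
    h, w = len(cur_depth), len(cur_depth[0])
    warped = [[0.0] * w for _ in range(h)]
    hole_mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = motion[y][x]
            sx, sy = x + dx, y + dy
            # Case 1: the history sample falls outside the previous frame.
            if not (0 <= sx < w and 0 <= sy < h):
                hole_mask[y][x] = True
                continue
            # Case 2: depths disagree, so a different surface was visible there
            # (a simplified disocclusion test, assumed for this sketch).
            if abs(prev_depth[sy][sx] - cur_depth[y][x]) > depth_tol:
                hole_mask[y][x] = True
                continue
            warped[y][x] = prev_color[sy][sx]
    return warped, hole_mask


# A 1 x 4 frame whose content shifted one pixel to the left.
prev_color = [[0.1, 0.2, 0.3, 0.4]]
prev_depth = [[1.0, 1.0, 1.0, 5.0]]
cur_depth = [[1.0, 1.0, 1.0, 1.0]]
motion = [[(1, 0), (1, 0), (1, 0), (1, 0)]]

warped, holes = warp_and_mark_holes(prev_color, prev_depth, cur_depth, motion)
print(warped)  # [[0.2, 0.3, 0.0, 0.0]]
print(holes)   # [[False, False, True, True]]
```

In this toy case the third pixel is flagged because the depth at its history location disagrees, and the fourth because its history lies off-screen. The mask is what would route those pixels to in-painting rather than to reuse of warped history.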


Supplemental Material

a278-guo.mp4 (MP4, 263.8 MB)



Published in

ACM Transactions on Graphics, Volume 40, Issue 6
December 2021
1351 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3478513

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
