research-article

Differentiable Compound Optics and Processing Pipeline Optimization for End-to-end Camera Design

Published: 21 June 2021

Abstract

Most modern commodity imaging systems, whether we use them directly for photography or rely on them indirectly for downstream applications, employ optical systems of multiple lenses that must balance deviations from perfect optics, manufacturing constraints, tolerances, cost, and footprint. Although optical designs often have complex interactions with downstream image processing or analysis tasks, today’s compound optics are designed in isolation from these interactions. Existing optical design tools aim to minimize optical aberrations, such as deviations from Gauss’ linear model of optics, instead of application-specific losses, precluding joint optimization with hardware image signal processing (ISP) and highly parameterized neural network processing. In this article, we propose an optimization method for compound optics that lifts these limitations. We optimize entire lens systems jointly with hardware and software image processing pipelines, downstream neural network processing, and application-specific end-to-end losses. To this end, we propose a learned, differentiable forward model for compound optics and an alternating proximal optimization method that handles function compositions with highly varying parameter dimensions for optics, hardware ISP, and neural nets. Our method integrates seamlessly atop existing optical design tools, such as Zemax. We can thus assess our method across many camera system designs and end-to-end applications. We validate our approach in an automotive camera optics setting, together with hardware ISP post-processing and detection, where it outperforms classical optics designs for automotive object detection and traffic light state detection. For human viewing tasks, we optimize optics and processing pipelines for dynamic outdoor scenarios and dynamic low-light imaging. We outperform existing compartmentalized design or fine-tuning methods qualitatively and quantitatively, across all domain-specific applications tested.
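
The abstract does not spell out the alternating proximal scheme, so the following is only a toy sketch of the general idea, not the authors' algorithm. It assumes a bilinear least-squares problem in which a small parameter block `a` (standing in for a handful of optics parameters) and a larger block `w` (standing in for processing parameters) are updated in turn, each by an exact proximal step that penalizes moving away from the previous iterate. The matrices `M0` and `M1`, the target `t`, the step parameter `tau`, and all function names are hypothetical and chosen purely for illustration.

```python
import numpy as np


def loss(a, w, M0, M1, t):
    """Toy end-to-end loss: 0.5 * ||(a0*M0 + a1*M1) w - t||^2."""
    r = (a[0] * M0 + a[1] * M1) @ w - t
    return 0.5 * float(r @ r)


def prox_least_squares(A, t, prev, tau):
    """Solve argmin_z 0.5*||A z - t||^2 + 1/(2*tau) * ||z - prev||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + np.eye(n) / tau, A.T @ t + prev / tau)


def alternating_proximal(M0, M1, t, a, w, tau=1.0, iters=50):
    """Alternate proximal updates over the small block a and the large block w."""
    history = [loss(a, w, M0, M1, t)]
    for _ in range(iters):
        # With w fixed, the model is linear in a (columns M0 @ w and M1 @ w).
        A_a = np.stack([M0 @ w, M1 @ w], axis=1)
        a = prox_least_squares(A_a, t, a, tau)
        # With a fixed, the model is linear in w.
        A_w = a[0] * M0 + a[1] * M1
        w = prox_least_squares(A_w, t, w, tau)
        history.append(loss(a, w, M0, M1, t))
    return a, w, history


# Hypothetical data: 20 measurements, 2 "optics" and 8 "processing" parameters.
rng = np.random.default_rng(0)
M0 = rng.normal(size=(20, 8))
M1 = rng.normal(size=(20, 8))
t = rng.normal(size=20)

a_opt, w_opt, history = alternating_proximal(
    M0, M1, t, a=np.array([1.0, 0.0]), w=np.zeros(8)
)
print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.4f}")
```

Because each block update exactly minimizes the loss plus a proximity penalty, the loss cannot increase from one iteration to the next in this toy setting. The problem described in the abstract is far less benign: its blocks are a learned lens model, a hardware ISP, and neural networks, so the closed-form solves used here would not apply.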


    • Published in

      ACM Transactions on Graphics, Volume 40, Issue 2
      April 2021
      174 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/3454118
      Copyright © 2021 Association for Computing Machinery.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 June 2021
      • Accepted: 1 January 2021
      • Revised: 1 December 2020
      • Received: 1 August 2020
      Qualifiers

      • research-article
      • Refereed
