Abstract
Most modern commodity imaging systems, whether we use them directly for photography or rely on them indirectly for downstream applications, employ optical systems of multiple lenses that must balance deviations from perfect optics, manufacturing constraints, tolerances, cost, and footprint. Although optical designs often have complex interactions with downstream image processing or analysis tasks, today’s compound optics are designed in isolation from these interactions. Existing optical design tools aim to minimize optical aberrations, such as deviations from Gauss’ linear model of optics, instead of application-specific losses, which precludes joint optimization with hardware image signal processing (ISP) and highly parameterized neural network processing. In this article, we propose an optimization method for compound optics that lifts these limitations. We optimize entire lens systems jointly with hardware and software image processing pipelines, downstream neural network processing, and application-specific end-to-end losses. To this end, we propose a learned, differentiable forward model for compound optics and an alternating proximal optimization method that handles function compositions with highly varying parameter dimensions for optics, hardware ISP, and neural nets. Our method integrates seamlessly atop existing optical design tools, such as Zemax. We can thus assess our method across many camera system designs and end-to-end applications. We validate our approach in an automotive camera optics setting, together with hardware ISP post-processing and detection, where it outperforms classical optics designs for automotive object detection and traffic light state detection. For human viewing tasks, we optimize optics and processing pipelines for dynamic outdoor scenarios and dynamic low-light imaging. We outperform existing compartmentalized design or fine-tuning methods qualitatively and quantitatively, across all domain-specific applications tested.
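To give a rough feel for what an alternating proximal scheme over parameter blocks of very different sizes can look like, the following is a minimal sketch on a toy least-squares problem. It is not the method of this article: the small block `x` merely stands in for a handful of optics parameters and the larger block `y` for a more heavily parameterized processing stage, and all names (`prox_block_update`, `alternating_proximal`, `tau_x`, `tau_y`) are hypothetical choices made for illustration. Each block is updated in turn by an exact proximal step while the other block is held fixed.

```python
import numpy as np

def prox_block_update(M, r, prev, tau):
    """Closed-form solution of
    argmin_z 0.5*||M z - r||^2 + (1/(2*tau))*||z - prev||^2.
    The quadratic penalty keeps the new block close to its previous value."""
    n = M.shape[1]
    lhs = M.T @ M + np.eye(n) / tau
    rhs = M.T @ r + prev / tau
    return np.linalg.solve(lhs, rhs)

def alternating_proximal(A, B, t, tau_x=1.0, tau_y=1.0, iters=200):
    """Minimize 0.5*||A x + B y - t||^2 by alternating proximal block updates.
    Toy stand-in only: x is a small block, y a larger one (assumed roles)."""
    x = np.zeros(A.shape[1])
    y = np.zeros(B.shape[1])
    losses = []
    for _ in range(iters):
        # Update the small block with the large block held fixed.
        x = prox_block_update(A, t - B @ y, x, tau_x)
        # Update the large block with the small block held fixed.
        y = prox_block_update(B, t - A @ x, y, tau_y)
        losses.append(0.5 * np.sum((A @ x + B @ y - t) ** 2))
    return x, y, losses

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
A = rng.normal(size=(60, 3))    # small parameter block
B = rng.normal(size=(60, 20))   # larger parameter block
t = rng.normal(size=60)
x, y, losses = alternating_proximal(A, B, t)
```

Because each block update minimizes the loss plus a proximity penalty, the recorded loss cannot increase from one sweep to the next. The per-block step sizes `tau_x` and `tau_y` are one simple way to treat blocks of different dimension differently.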
Differentiable Compound Optics and Processing Pipeline Optimization for End-to-end Camera Design