Abstract
Although deep-learning-based object detection algorithms are widely used, they struggle under degraded imaging conditions such as low light. A conventional solution applies image enhancement as a separate pre-processing module to improve the quality of the degraded image. However, this two-step approach makes it difficult to unify the goals of enhancement and detection; that is, low-light enhancement operations do not always help subsequent object detection. Recent works try to integrate enhancement and detection in an end-to-end network, but they still suffer from complex network structures, training convergence problems, and the need for reference images. To address these problems, this paper proposes a plug-and-play low-light image enhancement (LLIE) model that can be easily embedded into off-the-shelf object detection methods in an end-to-end manner. LLIE is composed of a parameter estimation module and an image processing module. The former learns to regress lighting enhancement parameters from the feedback of the detection network, and the latter adaptively enhances the degraded image to support the subsequent detection model under low-light conditions. Extensive object detection experiments on several low-light image data sets show that detector performance improves significantly when LLIE is integrated.
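The two-module structure described in the abstract can be illustrated with a minimal forward-pass sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the abstract does not name the enhancement operations, so gamma correction stands in for the image processing module, and a single sigmoid unit over the mean brightness stands in for the parameter estimation module. The function names and the weight, bias, and range values are hypothetical.

```python
import math


def estimate_gamma(image, weight=-4.0, bias=2.0, lo=0.3, hi=1.0):
    """Hypothetical parameter estimation module.

    Maps the mean brightness of an image (pixel values in [0, 1]) to a
    gamma value in [lo, hi]. In an end-to-end setup, weight and bias
    would be learned from the detector's loss; here they are fixed
    placeholder values chosen for illustration only.
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    # Sigmoid of a linear response: darker images give a larger value.
    s = 1.0 / (1.0 + math.exp(-(weight * mean + bias)))
    # Larger s means smaller gamma, which means stronger brightening.
    return hi - (hi - lo) * s


def apply_gamma(image, gamma):
    """Stand-in image processing module: per-pixel gamma correction."""
    return [[p ** gamma for p in row] for row in image]


def enhance(image):
    """Run both modules: estimate the parameter, then process the image."""
    gamma = estimate_gamma(image)
    return apply_gamma(image, gamma), gamma


if __name__ == "__main__":
    dark = [[0.02, 0.05], [0.08, 0.05]]
    enhanced, gamma = enhance(dark)
    print("gamma:", gamma)
    print("enhanced:", enhanced)
```

In a trainable version, the estimator would presumably be a small neural network and the processing step a differentiable operation in an autodiff framework, so that the detection loss can propagate back to the estimated parameters. This sketch only shows how the two modules connect.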






Data availability
The data sets used in this article are cited with references at the places where they are used.
Acknowledgements
This research is supported by the National Natural Science Foundation of China under Grant Nos. U21B2038, U19B2039, U1811463 and 62172023, the Beijing Natural Science Foundation (4222021) and the R&D Program of Beijing Municipal Education Commission (KZ202210005008). The authors would like to thank the anonymous reviewers and editors for their constructive comments and suggestions.
Author information
Contributions
Yuan wrote the main manuscript and conducted the related experiments; Hu provided ideas, guidance on writing and experiments, and funding support; Sun, Wang and Yin provided experimental guidance and funding support.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by B. Bao.
About this article
Cite this article
Yuan, J., Hu, Y., Sun, Y. et al. A plug-and-play image enhancement model for end-to-end object detection in low-light condition. Multimedia Systems 30, 27 (2024). https://doi.org/10.1007/s00530-023-01228-1
DOI: https://doi.org/10.1007/s00530-023-01228-1