Abstract
Despite significant progress in object detection algorithms, their ability to operate effectively in low-light environments remains underexplored. Recent methods perform dark object detection on the entire representation of dark images; however, they do not account for the potential entanglement between dark disturbance and discriminative information in dark images, so the learned representation may be sub-optimal. To address this issue, we propose supervised contrastive detection (SCDet), a novel unified framework that learns the underlying composition of dark images and decouples the discriminative component to facilitate dark object detection. Specifically, we introduce a dense decoupling contrastive (DDC) pretext task that enforces feature consistency under a dark transformation, encouraging the learned representation to be independent of the potential entanglement and thereby realize decoupling. Moreover, to drive the decoupled representation to be discriminative for dark object detection rather than a collapsed solution, we incorporate the supervised detection task as an additional optimization objective, yielding a joint optimization scheme. The two tasks are complementary: the DDC task regularizes detection to learn a more decoupling-friendly representation, while the supervised detection task guides the discriminative representation decoupling. As a result, SCDet achieves dark object detection by decoding the decoupled discriminative representation of dark images. Extensive experiments on four datasets demonstrate the effectiveness of our method in both synthetic and real-world scenarios. Code is available at https://github.com/TxLin7/SCDet.
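The abstract describes two coupled objectives: a dense contrastive consistency loss between features of a normal image and its darkened counterpart, combined with a supervised detection loss. The following is a minimal illustrative sketch of that joint-loss structure, not the authors' implementation: the dark transformation (here a simple gamma curve with additive noise), the identity "encoder", and the weighting `lam` are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def dark_transform(img, gamma=3.0, noise_std=0.02):
    """Hypothetical dark transformation: gamma curve plus additive noise."""
    dark = np.clip(img, 0.0, 1.0) ** gamma
    dark = dark + rng.normal(0.0, noise_std, img.shape)
    return np.clip(dark, 0.0, 1.0)

def dense_consistency_loss(feat_a, feat_b):
    """Mean per-location cosine distance between two dense maps of shape (C, H, W)."""
    a = feat_a.reshape(feat_a.shape[0], -1)
    b = feat_b.reshape(feat_b.shape[0], -1)
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=0, keepdims=True) + 1e-8)
    cos = (a * b).sum(axis=0)          # cosine similarity at each spatial location
    return float((1.0 - cos).mean())   # 0 when the two maps agree everywhere

def total_loss(l_det, l_ddc, lam=1.0):
    """Joint objective: supervised detection loss plus weighted consistency loss."""
    return l_det + lam * l_ddc

# Toy example: the "encoder" is the identity, so features are just pixels.
img = rng.random((3, 8, 8))
feat_normal = img
feat_dark = dark_transform(img)

l_ddc = dense_consistency_loss(feat_normal, feat_dark)
l_total = total_loss(l_det=0.5, l_ddc=l_ddc)
print(l_ddc, l_total)
```

In the paper's setting the two feature maps would come from a shared detection backbone, so minimizing the consistency term pushes the backbone toward representations invariant to the dark disturbance, while the detection loss prevents the trivial collapsed solution.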
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported in part by the Key Areas Research and Development Program of Guangzhou under Grant 2023B01J0029, the Science and Technology Research in Key Areas of Foshan under Grant 2020001006832, the Key-Area Research and Development Program of Guangdong Province under Grants 2018B010109007 and 2019B010153002, the Science and Technology Projects of Guangzhou under Grant 202007040006, the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069, the Guangdong Basic and Applied Basic Research Foundation under Grant 2023A1515012534, and the National Statistical Science Research Project of China (No. 2022LY096).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, T., Huang, G., Yuan, X. et al. SCDet: decoupling discriminative representation for dark object detection via supervised contrastive learning. Vis Comput 40, 3357–3369 (2024). https://doi.org/10.1007/s00371-023-03039-x