
SCDet: decoupling discriminative representation for dark object detection via supervised contrastive learning

  • Original article, published in The Visual Computer

Abstract

Despite the significant progress made in object detection, the potential of detectors to operate effectively in low-light environments remains under-explored. Recent methods perform dark object detection on the entire representation of dark images; however, they do not consider the potential entanglement between the dark disturbance and the discriminative information in dark images, and thus the learned representation may be sub-optimal. To address this issue, we propose supervised contrastive detection (SCDet), a novel unified framework that learns the underlying composition of dark images and decouples the discriminative component to facilitate dark object detection. Specifically, we introduce the dense decoupling contrastive (DDC) pretext task, which enforces feature consistency under a dark transformation, allowing the learned representation to become independent of the entanglement and thereby realize decoupling. Moreover, to drive the decoupled representation to be discriminative rather than a collapsed solution, we incorporate the supervised detection task as an extra optimization objective, resulting in a joint optimization pattern. The two tasks are complementary: the DDC task regularizes detection toward a more decoupling-friendly representation, while the supervised detection task guides the discriminative representation decoupling. As a result, SCDet achieves dark object detection by decoding the decoupled discriminative representation of dark images. Extensive experiments on four datasets demonstrate the effectiveness of our method in both synthetic and real-world scenarios. Code is available at https://github.com/TxLin7/SCDet.
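The abstract describes two coupled objectives: a DDC pretext task that enforces dense feature consistency between an image and its darkened counterpart, and a supervised detection loss that keeps the decoupled representation discriminative. The paper's exact formulation is not reproduced on this page, so the following is a minimal PyTorch-style sketch of that joint pattern; the dark transformation (gamma darkening plus Gaussian noise), the cosine-similarity dense consistency loss, and all names (`dark_transform`, `ddc_loss`, `joint_loss`, `lam`) are illustrative assumptions, not SCDet's actual implementation.

```python
import torch
import torch.nn.functional as F

def dark_transform(img, gamma_range=(2.0, 3.5), noise_std=0.05):
    """Synthesize a dark view of a normalized RGB image in [0, 1].

    Illustrative degradation only: gamma darkening plus Gaussian noise.
    The paper may use a different (e.g., physics-based) noise model.
    """
    gamma = torch.empty(1).uniform_(*gamma_range).item()
    dark = img.clamp(0.0, 1.0) ** gamma                 # suppress brightness
    dark = dark + noise_std * torch.randn_like(dark)    # add sensor-like noise
    return dark.clamp(0.0, 1.0)

def ddc_loss(feat_clean, feat_dark):
    """Dense consistency between per-location backbone features (B, C, H, W)
    of the clean and dark views: 1 - mean cosine similarity. Pulling the two
    views together encourages the representation to drop the dark disturbance.
    """
    f1 = F.normalize(feat_clean.flatten(2), dim=1)      # (B, C, H*W)
    f2 = F.normalize(feat_dark.flatten(2), dim=1)
    return 1.0 - (f1 * f2).sum(dim=1).mean()

def joint_loss(backbone, det_head, det_criterion, img, targets, lam=0.1):
    """Joint pattern: supervised detection on the dark view plus the DDC
    regularizer on a shared backbone; `lam` is a hypothetical trade-off weight."""
    dark = dark_transform(img)
    feat_clean = backbone(img)
    feat_dark = backbone(dark)
    det = det_criterion(det_head(feat_dark), targets)   # keeps features discriminative
    return det + lam * ddc_loss(feat_clean, feat_dark)  # keeps them decoupled
```

Note the complementarity the abstract points to: minimizing the consistency term alone admits a collapsed constant feature, while the supervised detection term rules out that trivial solution.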


Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported in part by the Key Areas Research and Development Program of Guangzhou under Grant 2023B01J0029, the Science and Technology Research in Key Areas in Foshan under Grant 2020001006832, the Key-Area Research and Development Program of Guangdong Province under Grants 2018B010109007 and 2019B010153002, the Science and Technology Projects of Guangzhou under Grant 202007040006, the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069, the Guangdong Basic and Applied Basic Research Foundation under Grant 2023A1515012534, and the National Statistical Science Research Project of China under Grant 2022LY096.

Author information


Corresponding author

Correspondence to Guoheng Huang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Human and animal rights

This article does not include any research conducted by the authors on human participants.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lin, T., Huang, G., Yuan, X. et al. SCDet: decoupling discriminative representation for dark object detection via supervised contrastive learning. Vis Comput 40, 3357–3369 (2024). https://doi.org/10.1007/s00371-023-03039-x

