Abstract
Efficiently detecting objects against complex backgrounds at night, under low illumination, remains a challenging image-processing problem. To address it, this paper proposes LiteCortexNet, a lightweight deep learning object detection model inspired by the visual cortex. The model performs intrinsic image decomposition end-to-end to obtain the illumination-independent reflectance component, fuses it with the output of a depth-wise separable convolutional encoder, and sends the fused features to a lightweight detection head for object classification and localization. Leveraging a channel-wise attention mechanism, the model is further optimized for small and occluded objects. To evaluate the method, an image dataset of railway maintenance tools was constructed. Experimental results show that the proposed model achieves 90.56% mAP at 66 FPS on this dataset, outperforming state-of-the-art object detectors such as YoloV4 (Bochkovskiy et al., arXiv:2004.10934), which reaches 82.34% mAP at 45 FPS.
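The intrinsic decomposition the abstract refers to follows the classic Retinex model (Land, 1964), under which an image I factors into reflectance R and illumination L as I = R ⊙ L, so that R is largely illumination-independent. The sketch below illustrates that idea with a single-scale, local-mean illumination estimate in NumPy; it is an illustrative stand-in under the Retinex assumption, not the paper's learned end-to-end decomposition network, and the function name and window size are chosen here for the example only:

```python
import numpy as np

def retinex_decompose(image, ksize=3, eps=1e-6):
    """Single-scale Retinex-style decomposition (illustrative sketch).

    Assumes the classic model I = R * L: estimate the illumination L
    with a local mean filter, then recover the reflectance R = I / L,
    which is approximately invariant to smooth illumination changes.
    """
    image = image.astype(np.float64)
    pad = ksize // 2
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    # Local-mean illumination estimate via a sliding-window average.
    illum = np.empty_like(image)
    for i in range(h):
        for j in range(w):
            illum[i, j] = padded[i:i + ksize, j:j + ksize].mean()
    reflect = image / (illum + eps)  # illumination-independent component
    return reflect, illum
```

For a flat texture lit by a smooth gradient, the recovered reflectance is nearly constant away from the borders, which is the property the detection pipeline exploits at night: the detector sees R rather than the dim raw pixels.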
Data Availability Statement
The data used to support the findings of this study are available from the corresponding author upon request.
References
Bochkovskiy, A., Wang, C.-Y., Liao, H.-Y.M.: Yolov4: optimal speed and accuracy of object detection. arXiv:2004.10934 (2020)
Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z., Zou, X.: Selective refinement network for high performance face detection. Proc. AAAI Conf. Artif. Intell. 33, 8231–8238 (2019)
Liu, W., Salzmann, M., Fua, P.: Context-aware crowd counting. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Masood, A., Sheng, B., Yang, P., Li, P., Li, H., Kim, J., Feng, D.D.: Automated decision support system for lung cancer detection and classification via enhanced rfcn with multilayer fusion rpn. IEEE Trans. Ind. Inform. 16(12), 7791–7801 (2020)
Nazir, A., Cheema, M.N., Sheng, B., Li, H., Li, P., Yang, P., Jung, Y., Qin, J., Kim, J., Feng, D.D.: Off-enet: an optimally fused fully end-to-end network for automatic dense volumetric 3d intracranial blood vessels segmentation. IEEE Trans. Image Process. 29, 7192–7202 (2020)
Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds) Advances in Neural Information Processing Systems 28, pp. 91–99. Curran Associates, Inc. (2015)
Redmon, J., Farhadi, A.: Yolov3: an incremental improvement. arXiv:1804.02767 (2018)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Berg, A.C.: Ssd: single shot multibox detector. In: European Conference on Computer Vision (2016)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, pp. 886–893. IEEE (2005)
Girshick, R., Felzenszwalb, P., McAllester, D.: Object detection with grammar models. In: Advances in neural information processing systems, 24 (2011)
Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., Hu, Q.: Eca-net: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11534–11542 (2020)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, F.F.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009, Miami, Florida, USA (2009)
Loh, Y.P., Chan, C.S.: Getting to know low-light images with the exclusively dark dataset. Comput. Vis. Image Understand. 178, 30–42 (2019)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp. 2961–2969 (2017)
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: The IEEE international conference on computer vision (ICCV) (2017)
Shrivastava, A., Gupta, A., Girshick, R.: Training region-based object detectors with online hard example mining. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 761–769 (2016)
Madawi, K.E., Rashed, H., Sallab, A.E., Nasr, O., Yogamani, S.: Rgb and lidar fusion based 3d semantic segmentation for autonomous driving. In: 2019 IEEE intelligent transportation systems conference—ITSC (2019)
Andraši, P., Radišić, T., Muštra, M., Ivošević, J.: Night-time detection of uavs using thermal infrared camera. Transp. Res. Procedia 28, 183–190 (2017)
Land, E.H.: The retinex. Am. Sci. 52(2), 247–264 (1964)
Wei, B., Shu, S., Yang, R., Zhang, Y., Miao, J.: McCann's retinex enhancement method based on yuv for space station image. In: 2019 international conference on communications, information system and computer engineering (CISCE), pp. 249–253. IEEE (2019)
Zhou, J., Zhang, D., Zou, P., Zhang, W., Zhang, W.: Retinex-based laplacian pyramid method for image defogging. IEEE Access 7, 122459–122472 (2019)
Yuan, Y., Sheng, B., Li, P., Bi, L., Kim, J., Wu, E.: Deep intrinsic image decomposition using joint parallel learning. In: Computer graphics international conference, pp. 336–341. Springer (2019)
Qu, H., Yuan, T., Sheng, Z., Zhang, Y.: A pedestrian detection method based on yolov3 model and image enhanced by retinex. In: 2018 11th international congress on image and signal processing, biomedical engineering and informatics (CISP-BMEI), pp. 1–5. IEEE (2018)
He, Y., Zhang, X., Sun, J.: Channel pruning for accelerating very deep neural networks. In: The IEEE international conference on computer vision (ICCV) (2017)
Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., Kalenichenko, D.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: The IEEE conference on computer vision and pattern recognition (CVPR) (2018)
Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: Fitnets: hints for thin deep nets. arXiv:1412.6550 (2014)
Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and less than 0.5MB model size. arXiv:1602.07360 (2016)
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 (2017)
Bradski, G., Kaehler, A.: Opencv. Dr. Dobb’s J. Softw. Tools 3, 2 (2000)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4510–4520 (2018)
Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1251–1258 (2017)
Ding, X., Guo, Y., Ding, G., Han, J.: Acnet: strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks. In: Proceedings of the IEEE international conference on computer vision, pp. 1911–1920 (2019)
Zhou, X., Wang, D., Krähenbühl, P.: Objects as points. arXiv:1904.07850 (2019)
Hu, S.-M., Liang, D., Yang, G.-Y., Yang, G.-W., Zhou, W.-Y.: Jittor: a novel deep learning framework with meta-operators and unified graph execution. Sci. China Inf. Sci. 63(222103), 1–21 (2020)
Loshchilov, I., Hutter, F.: Fixing weight decay regularization in Adam. arXiv:1711.05101 (2018)
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: common objects in context. In: European conference on computer vision, pp. 740–755. Springer (2014)
Lu, B., Chen, J.-C., Chellappa, R.: Uid-gan: unsupervised image deblurring via disentangled representations. IEEE Trans. Biometrics Behav. Ident. Sci. 2(1), 26–39 (2019)
Funding
This work was supported in part by (1) the National Natural Science Foundation of China (No. 62171328) and (2) the Education Sciences Planning of Hubei Province of China (No. 2019GA090).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Wang, S., Yang, J., Chen, D. et al. LiteCortexNet: toward efficient object detection at night. Vis Comput 38, 3073–3085 (2022). https://doi.org/10.1007/s00371-022-02560-9