
LIST: low illumination scene text detector with automatic feature enhancement

  • Original article
The Visual Computer

Abstract

Low illumination, under which discriminative cues are buried in the captured images, is an under-investigated but noteworthy problem in scene text detection in the wild. Existing deep learning approaches suffer from scarce training data and illumination-sensitive feature representations. To address these issues, we propose a Low Illumination Scene Text (LIST) detector that is trained on realistic synthetic data and integrates dedicated feature enhancement modules. Specifically, we adopt a lightweight, non-reference low-light scene text image synthesis network that produces adequate training data by adjusting the dynamic range curve of normal-light images pixel by pixel. Moreover, an illumination-invariant feature representation is learned through a dual-path feature extraction stem with intensity-adjusted inputs and a feature fusion branch with an automatically designed fusion cell. Finally, the enhanced features are fed into a segmentation-based layer to localize text instances of arbitrary shape. We construct a labeled real-world scene text image dataset called “DarkText” and conduct extensive experiments to validate the advantages of our proposed framework over state-of-the-art competitors.
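
The pixel-wise curve adjustment mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the curve itself, so the code below assumes a quadratic curve of the form x + αx(1 − x), a shape commonly used in curve-based low-light enhancement, and uses a hand-written per-pixel α map in place of the values a synthesis network would predict. The function names `apply_curve` and `synthesize_low_light` are hypothetical and are not taken from the paper.

```python
def apply_curve(pixel, alpha, iterations=4):
    """Iterate x <- x + alpha * x * (1 - x) on one intensity in [0, 1].

    ASSUMPTION: the quadratic curve and the iteration count are illustrative
    choices; the abstract does not specify them.
    """
    if not 0.0 <= pixel <= 1.0:
        raise ValueError("pixel intensity must lie in [0, 1]")
    if not -1.0 <= alpha <= 0.0:
        raise ValueError("alpha must lie in [-1, 0] so the curve darkens")
    x = pixel
    for _ in range(iterations):
        # With alpha <= 0 the correction term is never positive, so
        # intensities only decrease; 0 and 1 are fixed points of the curve.
        x = x + alpha * x * (1.0 - x)
    return x


def synthesize_low_light(image, alpha_map, iterations=4):
    """Darken a grayscale image (rows of floats) with a per-pixel alpha map."""
    if len(image) != len(alpha_map) or any(
        len(row) != len(alpha_row) for row, alpha_row in zip(image, alpha_map)
    ):
        raise ValueError("image and alpha_map must have the same shape")
    return [
        [apply_curve(p, a, iterations) for p, a in zip(row, alpha_row)]
        for row, alpha_row in zip(image, alpha_map)
    ]


# Hypothetical 2x3 normal-light image and a hand-picked alpha map.
normal = [[0.2, 0.5, 0.8], [1.0, 0.6, 0.0]]
alpha_map = [[-0.9, -0.6, -0.3], [-0.9, -0.6, -0.3]]
dark = synthesize_low_light(normal, alpha_map)
```

Because α varies per pixel, different regions of the same image can be darkened by different amounts, which is the property a pixel-wise adjustment offers over a single global gamma or gain.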



Acknowledgements

This work was supported in part by the National Key R&D Program of China (No. 2019YFB2204104), the National Natural Science Foundation of China (Nos. 62172415, 62102414, 52175493), and the Alibaba Group through the Alibaba Innovative Research Program.

Author information

Corresponding author

Correspondence to Tong Wang.


About this article


Cite this article

Liu, H., Yuan, M., Wang, T. et al. LIST: low illumination scene text detector with automatic feature enhancement. Vis Comput 38, 3231–3242 (2022). https://doi.org/10.1007/s00371-022-02570-7


  • DOI: https://doi.org/10.1007/s00371-022-02570-7
