Abstract
Low illumination, under which discriminative cues are buried in the captured images, is an under-investigated but important problem in scene text detection in the wild. Existing deep learning approaches suffer from a scarcity of training data and from illumination-sensitive feature representations. To address these issues, we propose a Low Illumination Scene Text (LIST) detector that is trained with realistic synthetic data and integrates dedicated feature enhancement modules. Specifically, we adopt a lightweight, non-reference low-light scene text image synthesis network that produces adequate training data by adjusting the dynamic range curve of normal-light images pixel by pixel. Moreover, an illumination-invariant feature representation is learned through a dual-path feature extraction stem with intensity-adjusted inputs and a feature fusion branch with an automatically designed fusion cell. Finally, the enhanced feature is fed into a segmentation-based layer to localize text instances of arbitrary shape. We construct a labeled real-world scene text image dataset called “DarkText” and conduct extensive experiments to validate the advantages of the proposed framework over state-of-the-art competitors.
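To make the idea of pixel-wise dynamic range curve adjustment concrete, the following is a minimal sketch and not the synthesis network described above. It assumes a quadratic curve x ← x + αx(1 − x) with a per-pixel coefficient α in [−1, 0], applied a fixed number of times; the function names, the hand-written `alpha_map`, and the iteration count are illustrative assumptions. In the proposed approach such per-pixel parameters would be predicted by a network rather than set by hand.

```python
def darken_curve(pixel, alpha, iterations=4):
    """Darken one intensity in [0, 1] with an assumed quadratic curve.

    Each step applies x <- x + alpha * x * (1 - x). For alpha in [-1, 0]
    the result stays in [0, 1] and never exceeds the input; 0 and 1 are
    fixed points, so pure black and pure white are left unchanged.
    """
    if not 0.0 <= pixel <= 1.0:
        raise ValueError("pixel must lie in [0, 1]")
    if not -1.0 <= alpha <= 0.0:
        raise ValueError("alpha must lie in [-1, 0] to darken")
    x = pixel
    for _ in range(iterations):
        x = x + alpha * x * (1.0 - x)
    return x


def synthesize_low_light(image, alpha_map, iterations=4):
    """Apply the curve pixel by pixel to a grayscale image (list of rows).

    alpha_map has the same shape as image and holds one coefficient per
    pixel (hypothetical values here, standing in for network outputs).
    """
    return [
        [darken_curve(p, a, iterations) for p, a in zip(row, alpha_row)]
        for row, alpha_row in zip(image, alpha_map)
    ]


# Toy 2x2 normal-light image with a stronger darkening on the top row.
image = [[0.2, 0.5], [0.8, 1.0]]
alpha_map = [[-0.6, -0.6], [-0.3, -0.3]]
dark = synthesize_low_light(image, alpha_map)
```

Because the curve is monotone on [0, 1] for these coefficients, the relative ordering of intensities within a region that shares the same α is preserved, which is one reason a curve of this kind is a plausible choice for simulating low light without destroying text structure.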
Acknowledgements
This work was supported in part by the National Key R&D Program of China (No. 2019YFB2204104), the National Natural Science Foundation of China (Nos. 62172415, 62102414, 52175493), and the Alibaba Group through the Alibaba Innovative Research Program.
Cite this article
Liu, H., Yuan, M., Wang, T. et al. LIST: low illumination scene text detector with automatic feature enhancement. Vis Comput 38, 3231–3242 (2022). https://doi.org/10.1007/s00371-022-02570-7
DOI: https://doi.org/10.1007/s00371-022-02570-7