Abstract
Text detection in underwater images is an open challenge because of distortions caused by refraction, light absorption, and suspended particles, and because of variations in depth, color, and the nature of the water. Unlike existing methods, which target text detection in natural scene images, this paper proposes a novel method for text detection in underwater images based on a new enhancement model. Based on the observation that the fine details of text in an image are associated with high energy, spatial resolution, and brightness, we use the Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and Fast Fourier Transform (FFT) for image enhancement to highlight text features. The enhanced image is fed to a modified Character Region Awareness for Text Detection (CRAFT) model to detect text in underwater images. To explore enhancement methods, we evaluate six combinations of these image enhancement techniques, namely DCT-DWT-FFT, DCT-FFT-DWT, DWT-DCT-FFT, DWT-FFT-DCT, FFT-DCT-DWT, and FFT-DWT-DCT. Experimental results on our dataset of underwater images and on benchmark datasets for natural scene text detection, namely MSRA-TD500, ICDAR 2019 MLT, ICDAR 2019 ArT, Total-Text, CTW1500, and COCO Text, show that the proposed method performs well on both underwater and natural scene images and outperforms existing methods on all the datasets.
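The six combinations named in the abstract are exactly the orderings of the three transforms. As a minimal sketch of how such a list could be enumerated for an experiment loop, the following Python snippet generates them; the function name `enhancement_orderings` and the constant `TRANSFORMS` are hypothetical and are not taken from the paper, and the snippet only produces the labels, not the enhancement itself.

```python
from itertools import permutations

# Transform labels as they appear in the abstract (names only, no image processing).
TRANSFORMS = ("DCT", "DWT", "FFT")


def enhancement_orderings(transforms=TRANSFORMS):
    """Return every ordering of the given transform names, joined by hyphens."""
    return ["-".join(order) for order in permutations(transforms)]


if __name__ == "__main__":
    # Print one candidate enhancement ordering per line.
    for name in enhancement_orderings():
        print(name)
```

Each label could then be mapped to a chain of transform functions and evaluated with the same detector, which keeps the comparison between orderings uniform.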
Acknowledgements
The work of Cheng-Lin Liu was supported by the National Key Research and Development Program under Grant No. 2018AAA0100400 and the National Natural Science Foundation of China under Grant No. 61721004. This work was also partially supported by TIH, ISI, Kolkata.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Banerjee, A., Shivakumara, P., Pal, S., Pal, U., Liu, C.-L. (2022). DCT-DWT-FFT Based Method for Text Detection in Underwater Images. In: Wallraven, C., Liu, Q., Nagahara, H. (eds) Pattern Recognition. ACPR 2021. Lecture Notes in Computer Science, vol 13189. Springer, Cham. https://doi.org/10.1007/978-3-031-02444-3_16
DOI: https://doi.org/10.1007/978-3-031-02444-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-02443-6
Online ISBN: 978-3-031-02444-3
eBook Packages: Computer Science, Computer Science (R0)