Adversarial Rectification Network for Scene Text Regularization

Li, Jing; Wang, Qiu-Feng; Zhang, Rui; Huang, Kaizhu

doi:10.1007/978-3-030-63833-7_13

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12533)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Scene text recognition with irregular layouts is a challenging yet important problem in computer vision. One widely used method is to employ a rectification network before the recognition stage. However, most previous rectification methods either did not consider recognition information or were integrated into end-to-end recognition models without considering rectification explicitly. To overcome this issue, we propose an adversarial learning-based rectification network that integrates transformation (from irregular texts to regular texts) with recognition information into a unified framework. In this framework, we optimize the rectification network with an extended Generative Adversarial Network that competes between rectifier and discriminator, together with the results of a recognizer. To evaluate the rectification performance, we generated a regular-irregular pair set from the benchmark datasets, and experimental results show that the proposed method can achieve significant improvement on the rectification performance with comparable recognition performance. Specifically, the PSNR and SSIM are improved by 0.81 and 0.051, respectively, which demonstrates its effectiveness.
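
PSNR, one of the two rectification metrics quoted above, is a standard image-quality measure derived from the mean squared error between a reference image and a test image. The following is a minimal sketch of that computation, assuming 8-bit grayscale images given as lists of pixel rows; the function name `psnr` and the `max_value` parameter are illustrative and are not taken from the paper, whose own evaluation code is not shown here.

```python
import math


def psnr(reference, rectified, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized grayscale images.

    Both images are lists of rows of pixel intensities (an assumed input format).
    """
    # Flatten both images into one sequence of pixels each.
    ref = [p for row in reference for p in row]
    out = [p for row in rectified for p in row]
    if not ref or len(ref) != len(out):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    regular = [[10, 20], [30, 40]]
    rectified = [[10, 22], [29, 40]]
    print(f"PSNR: {psnr(regular, rectified):.2f} dB")
```

A higher PSNR means the rectified image is closer, pixel by pixel, to its regular counterpart. SSIM additionally compares local structure and is not covered by this sketch.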

Acknowledgements

This study was funded by the National Natural Science Foundation of China under grant nos. 61876154 and 61876155; the Natural Science Foundation of Jiangsu Province under grant nos. BK20181189 and BK20181190; the Key Program Special Fund in XJTLU under grant nos. KSF-A-10, KSF-A-01, KSF-P-02, KSF-E-26 and KSF-T-06; and the XJTLU Research Development Fund under grant nos. RDF-16-02-49 and RDF-16-01-57.

Author information

Authors and Affiliations

Department of Intelligent Science, School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, China
Jing Li, Qiu-Feng Wang & Kaizhu Huang
Department of Foundational Mathematics, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou, China
Rui Zhang

Corresponding author

Correspondence to Qiu-Feng Wang.

Editor information

Editors and Affiliations

Department of AI, Ping An Life, Shenzhen, China
Haiqin Yang
Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand
Kitsuchart Pasupa
City University of Hong Kong, Kowloon, China
Andrew Chi-Sing Leung
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
James T. Kwok
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Jonathan H. Chan
The Chinese University of Hong Kong, New Territories, Hong Kong
Irwin King

About this paper

Cite this paper

Li, J., Wang, QF., Zhang, R., Huang, K. (2020). Adversarial Rectification Network for Scene Text Regularization. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12533. Springer, Cham. https://doi.org/10.1007/978-3-030-63833-7_13

DOI: https://doi.org/10.1007/978-3-030-63833-7_13
Published: 20 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63832-0
Online ISBN: 978-3-030-63833-7
eBook Packages: Computer Science, Computer Science (R0)
