Abstract
Style transfer has attracted much interest owing to its many applications. Compared with English character or general artistic style transfer, Chinese character style transfer remains challenging owing to the large vocabulary (70224 characters in GB18030-2005) and the complexity of character structure. Recently, several GAN-based methods have been proposed for style transfer; however, they treat Chinese characters as a whole, ignoring the structures and radicals from which characters are composed. In this paper, a novel radical decomposition-and-rendering-based GAN (RD-GAN) is proposed that exploits the radical-level composition of Chinese characters to achieve few-shot/zero-shot Chinese character style transfer. The RD-GAN consists of three components: a radical extraction module (REM), a radical rendering module (RRM), and a multi-level discriminator (MLD). Experiments demonstrate that our method has a powerful few-shot/zero-shot generalization ability thanks to its use of radical-level compositions of Chinese characters.
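The core intuition behind radical decomposition can be sketched in a few lines: if characters are mapped to radical sequences, an unseen character becomes renderable from radicals already observed during training. The tiny decomposition table below is a hypothetical illustration, not the paper's actual data or method.

```python
# Toy illustration of the radical-decomposition idea: an unseen character
# can be handled zero-shot if all of its radicals appeared in training.
# The table below is a small hypothetical sample for illustration only.
DECOMPOSITION = {
    "好": ["女", "子"],        # "good"   = woman + child
    "妈": ["女", "马"],        # "mother" = woman + horse
    "骂": ["口", "口", "马"],  # "scold"  = mouth + mouth + horse
}

def seen_radicals(train_chars):
    """Collect the set of radicals covered by the training characters."""
    rads = set()
    for ch in train_chars:
        rads.update(DECOMPOSITION[ch])
    return rads

def zero_shot_renderable(char, train_chars):
    """A character is renderable zero-shot if all its radicals were seen."""
    return set(DECOMPOSITION[char]) <= seen_radicals(train_chars)

train = ["好", "骂"]                       # training covers 女, 子, 口, 马
print(zero_shot_renderable("妈", train))   # 妈 = 女 + 马, both seen → True
```

This is why the vocabulary problem shrinks: the tens of thousands of characters in GB18030 decompose into a far smaller set of radicals.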
Acknowledgement
This research was supported in part by NSFC (Grant No. 61936003), GD-NSF (No. 2017A030312006), the Alibaba Innovative Research Foundation (No. D8200510), and the Fundamental Research Funds for the Central Universities (No. D2190570).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Huang, Y., He, M., Jin, L., Wang, Y. (2020). RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12351. Springer, Cham. https://doi.org/10.1007/978-3-030-58539-6_10
DOI: https://doi.org/10.1007/978-3-030-58539-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58538-9
Online ISBN: 978-3-030-58539-6
eBook Packages: Computer Science (R0)