ABSTRACT
In this paper, we propose a CLIP-based model for extracting Chinese character content and font style features. The model uses an embedding layer to encode the Chinese character and font style, a residual network to extract features from images, and a contrastive loss to train the model. Experiments on a large-scale font dataset show that the features extracted by the model effectively characterize different Chinese character contents and font styles, which provides a good foundation for subsequent font generation tasks.
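The training objective described above can be illustrated with a minimal sketch of a CLIP-style symmetric contrastive loss. This is not the authors' implementation: the function names, the plain-list feature representation, and the temperature of 0.07 are assumptions made for illustration. Here `image_feats` stands in for the residual network's output and `label_feats` for the embedding layer's output, with row i of each forming a matching pair.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_contrastive_loss(image_feats, label_feats, temperature=0.07):
    """Symmetric contrastive loss over a batch of (image, label) feature pairs.

    image_feats: hypothetical residual-network outputs, one vector per image.
    label_feats: hypothetical embedding-layer outputs for the matching
                 character or font style, one vector per image.
    temperature: assumed value; the abstract does not state one.
    """
    img = [l2_normalize(v) for v in image_feats]
    lab = [l2_normalize(v) for v in label_feats]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(u, w)) / temperature for w in lab]
              for u in img]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the image-to-label and label-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    labels = [[1.0, 0.0], [0.0, 1.0]]
    print(clip_contrastive_loss(images, labels))        # matched pairs: small loss
    print(clip_contrastive_loss(images, labels[::-1]))  # mismatched pairs: large loss
```

The loss is small when each image feature is closest to its own label feature and large otherwise. The sketch treats every other item in the batch as a negative, so it assumes the labels within a batch are distinct; how the paper handles repeated characters or styles in a batch is not stated in the abstract.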
Index Terms
- CLIP-based Pre-Training of Chinese Font Contents and Styles