CLIP-based Pre-Training of Chinese Font Contents and Styles

Authors:
Shenglan Peng, School of Information and Engineering, Jingdezhen Ceramic University, China (ORCID: 0000-0001-8872-5417)
Wei Feng, School of Information and Engineering, Jingdezhen Ceramic University, China (ORCID: 0009-0001-6889-742X)
Donghong Yang, School of Information and Engineering, Jingdezhen Ceramic University, China (ORCID: 0009-0002-4518-8994)

BDE '23: Proceedings of the 2023 5th International Conference on Big Data Engineering, November 2023, Pages 61–66
https://doi.org/10.1145/3640872.3640882

Published: 20 February 2024

ABSTRACT

In this paper, we propose a CLIP-based model for extracting features of Chinese character content and font style. The model uses an embedding layer to encode the Chinese character and the font style, a residual network to extract features from glyph images, and a contrastive loss for training. Experiments on a large-scale font dataset show that the features extracted by the model effectively characterize different Chinese character contents and font styles, which provides a good foundation for subsequent font generation tasks.
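As an illustration of the training objective named in the abstract, the following is a minimal NumPy sketch of a symmetric CLIP-style contrastive loss. It is not the authors' implementation. The abstract does not say how the character and style embeddings are combined, so summing them is an assumption made here, as are the temperature value, the table sizes, and all function and variable names. Random vectors stand in for the residual network's image features.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(image_features, label_features, temperature=0.07):
    # Row i of each input describes the same glyph: matching pairs lie on
    # the diagonal of the similarity matrix, all other entries are negatives.
    img = l2_normalize(image_features)
    lab = l2_normalize(label_features)
    logits = img @ lab.T / temperature
    diag = np.arange(logits.shape[0])
    loss_image_to_label = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_label_to_image = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_image_to_label + loss_label_to_image)

# Hypothetical sizes: 100 characters, 10 font styles, 32-dim features, batch of 8.
rng = np.random.default_rng(0)
num_chars, num_fonts, dim, batch = 100, 10, 32, 8
char_table = rng.normal(size=(num_chars, dim))   # stand-in character embedding table
font_table = rng.normal(size=(num_fonts, dim))   # stand-in font style embedding table

char_ids = rng.choice(num_chars, size=batch, replace=False)
font_ids = rng.integers(0, num_fonts, size=batch)

# Assumption: the two embeddings are summed to form one label-side vector.
label_features = char_table[char_ids] + font_table[font_ids]

# Stand-in for residual-network outputs on the corresponding glyph images.
image_features = rng.normal(size=(batch, dim))

random_loss = clip_contrastive_loss(image_features, label_features)
aligned_loss = clip_contrastive_loss(label_features, label_features)
```

In this sketch the loss is lower when the image features align with their own label embeddings (`aligned_loss`) than when they are unrelated (`random_loss`), which is the behavior the contrastive objective rewards during training.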

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Learning paradigms

Index terms have been assigned to the content through auto-classification.

Published in

BDE '23: Proceedings of the 2023 5th International Conference on Big Data Engineering
November 2023
80 pages
ISBN:9798400708695
DOI:10.1145/3640872

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CLIP model
content and style
contrastive learning
deep learning
Qualifiers
- research-article
- Research
- Refereed limited