DOI:10.1145/3640872.3640882
research-article

CLIP-based Pre-Training of Chinese Font Contents and Styles

Published: 20 February 2024

ABSTRACT

In this paper, we propose a CLIP-based model for extracting Chinese character content and font style features. The model uses an embedding layer to encode Chinese characters and font styles, a residual network to extract features from images, and a contrastive loss for training. Experiments on a large-scale font dataset show that the features extracted by the model effectively characterize different Chinese character contents and font styles, providing a good foundation for subsequent font generation tasks.
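
The training objective described above can be illustrated with a minimal sketch of a CLIP-style symmetric contrastive loss. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function name `clip_contrastive_loss`, the temperature value, and the toy feature vectors are hypothetical. `image_feats` stands in for the residual network outputs, and `label_feats` for the embedding-layer outputs of the matching characters or font styles. The sketch also assumes each batch row has exactly one matching label.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_feats, label_feats, temperature=0.07):
    """Symmetric cross-entropy over an image-to-label similarity matrix.

    Row i of image_feats is assumed to match row i of label_feats;
    every other row in the batch acts as a negative.
    """
    images = [l2_normalize(v) for v in image_feats]
    labels = [l2_normalize(v) for v in label_feats]
    n = len(images)

    # logits[i][j]: scaled cosine similarity between image i and label j.
    logits = [
        [sum(a * b for a, b in zip(images[i], labels[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    # Image-to-label direction: each row should peak on its diagonal entry.
    loss_i2l = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Label-to-image direction: same computation on the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_l2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (loss_i2l + loss_l2i)


# Toy batch of two samples (hypothetical 2-D features).
image_feats = [[1.0, 0.0], [0.0, 1.0]]
matched = clip_contrastive_loss(image_feats, [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss(image_feats, [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

In this toy batch, correctly paired features give a much lower loss than swapped ones, which is the signal that pulls matching image and label features together during training. A practical version would compute the same quantity on batched tensors in a deep learning framework.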

      • Published in

        cover image ACM Other conferences
        BDE '23: Proceedings of the 2023 5th International Conference on Big Data Engineering
        November 2023
        80 pages
        ISBN:9798400708695
        DOI:10.1145/3640872

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited