Abstract
Hand-drawn doodles present a difficult class of textures to model and synthesize. Unlike the natural images most often used in texture-synthesis studies, the doodles examined here are characterized by sharp, irregular, imperfectly scribbled patterns, imprecise strokes, haphazardly connected edges, and randomly or spatially shifting themes. The nearly binary nature of these doodles makes common mistakes, such as discontinuities, difficult to hide. Further, there is no color or shading to mask flaws and repetition; any process that relies on region copying, even stochastic copying, is readily discernible. To synthesize these textures, we model the doodle’s underlying generation process, taking into account potential unseen, but related, expansion contexts. We demonstrate how to generate infinitely long textures, extending the texture far beyond a single image’s source material. This is accomplished with a novel learning mechanism that is taught to condition the generation process on its own generated context, that is, on what was generated in previous steps, not just on the original image.
Notes
Note that more of the context image is filled in than the mask indicates. Because we are training with chains, much of the current input was already synthesized by G in previous steps. Since those areas were synthesized, they are not considered part of the original.
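The chained setup described in this note can be illustrated with a minimal sketch. Here, `toy_generator` is a hypothetical stand-in for the learned network G (the paper's actual model is a neural network, not this rule), and the sliding-window bookkeeping shows how, after a few steps, the conditioning context consists entirely of self-generated content rather than the original image:

```python
def toy_generator(context):
    """Placeholder for G: extends the context by one 'stroke' value."""
    return context[-1] + 1  # trivial rule standing in for learned synthesis


def synthesize_chain(seed, steps, window=4):
    """Generate an arbitrarily long strip one step at a time.

    After the first `window` steps, every value in the conditioning
    window was produced by the generator itself, so it is no longer
    part of the original seed, as the note describes.
    """
    strip = list(seed)
    for _ in range(steps):
        context = strip[-window:]             # sliding conditioning window
        strip.append(toy_generator(context))  # G conditions on its own output
    return strip


out = synthesize_chain(seed=[0, 1, 2, 3], steps=6)
```

The names and the one-dimensional setting are illustrative assumptions; the point is only the feedback loop in which generated output re-enters the input.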
A mini-batch is standard neural-network terminology for a group of training examples presented to the network together, over which the error is aggregated before a parameter update.
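As a small illustration of the aggregation this note refers to, the sketch below averages a per-example squared error over one mini-batch (the function name and the use of mean squared error are assumptions for illustration, not the paper's loss):

```python
def batch_loss(predictions, targets):
    """Aggregate (average) the squared error over one mini-batch."""
    per_example = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(per_example) / len(per_example)


# One mini-batch of three examples; the three errors are averaged
# into a single value used for the weight update.
loss = batch_loss([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```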
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
There are no potential conflicts of interest to disclose. This research was funded by the author’s institution, Google, Inc.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Baluja, S. The infinite doodler: expanding textures within tightly constrained manifolds. Vis Comput 39, 3271–3283 (2023). https://doi.org/10.1007/s00371-023-03025-3