
The infinite doodler: expanding textures within tightly constrained manifolds

  • Original article
  • Published in The Visual Computer

Abstract

Hand-drawn doodles present a difficult set of textures to model and synthesize. Unlike the typical natural images most often used in texture-synthesis studies, the doodles examined here are characterized by sharp, irregular, and imperfectly scribbled patterns, frequent imprecise strokes, haphazardly connected edges, and randomly or spatially shifting themes. The almost binary nature of the doodles examined makes it difficult to hide common mistakes such as discontinuities. Further, there is no color or shading to mask flaws and repetition; any process that relies on region copying, even stochastic copying, is readily discernible. To tackle the problem of synthesizing these textures, we model the underlying generation process of the doodle, taking into account potential unseen, but related, expansion contexts. We demonstrate how to generate infinitely long textures, such that the texture can be extended far beyond a single image's source material. This is accomplished by creating a novel learning mechanism that is taught to condition the generation process not only on the original source material but also on its own generated context, i.e., what was produced in previous steps.
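
To make the chained-conditioning idea concrete, the sketch below is a minimal illustration, not the paper's implementation; the names generator, CONTEXT_W, and STEP_W, and the placeholder random strokes are assumptions. It extends a binary doodle strip by repeatedly synthesizing new columns from a fixed-width context window; after the first few steps that window consists entirely of the model's own earlier output, which is exactly the setting the generation process must be trained to handle.

    import numpy as np

    CONTEXT_W = 128   # width of the conditioning window fed to the generator
    STEP_W = 64       # number of new columns synthesized per step

    def generator(context):
        """Stand-in for a trained model G: maps an (H, CONTEXT_W) context to
        (H, STEP_W) new columns. Here it just draws sparse random strokes."""
        h = context.shape[0]
        return (np.random.rand(h, STEP_W) > 0.9).astype(np.float32)

    def expand_doodle(seed, n_steps):
        """Extend the seed rightward; each step conditions on the most recent
        CONTEXT_W columns, which soon consist only of generated content."""
        canvas = seed.copy()
        for _ in range(n_steps):
            context = canvas[:, -CONTEXT_W:]      # condition on own prior output
            new_cols = generator(context)         # synthesize the next strip
            canvas = np.concatenate([canvas, new_cols], axis=1)
        return canvas

    seed = (np.random.rand(128, CONTEXT_W) > 0.9).astype(np.float32)
    print(expand_doodle(seed, n_steps=10).shape)  # (128, 768)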


Notes

  1. Note that more of the context image is filled in than the mask indicates. Because we train with chains, much of the current input was already synthesized by G in previous steps; since those areas were synthesized, they are not considered part of the original (see the sketch after these notes).

  2. A mini-batch is standard neural-network terminology for a group of training examples presented to the network together; the error is aggregated over the group before the weights are updated.
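
As a rough illustration of the bookkeeping described in Note 1 (all names here, such as canvas, is_original, and fake_G, are hypothetical and not taken from the paper), the following sketch slides a fixed-width context window along a growing doodle and tracks which columns were synthesized by the generator rather than copied from the original source.

    import numpy as np

    H, W, STEP = 64, 128, 32

    canvas = (np.random.rand(H, W) > 0.9).astype(np.float32)  # stands in for the source doodle
    is_original = np.ones((H, W), dtype=bool)                  # True where pixels come from the source

    def fake_G(context):
        """Placeholder generator: returns STEP newly synthesized columns."""
        return (np.random.rand(H, STEP) > 0.9).astype(np.float32)

    for _ in range(3):  # a short synthesis chain
        new_strip = fake_G(canvas)
        # Slide the window: drop the oldest STEP columns, append the new ones.
        canvas = np.concatenate([canvas[:, STEP:], new_strip], axis=1)
        # The appended columns were produced by G, so they are not marked original.
        is_original = np.concatenate(
            [is_original[:, STEP:], np.zeros((H, STEP), dtype=bool)], axis=1)

    print(f"fraction of the context still original: {is_original.mean():.2f}")  # 0.25 after 3 steps

After enough steps this fraction reaches zero, which is why the synthesized regions cannot be counted as part of the original when constructing the training context.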



Author information

Corresponding author

Correspondence to Shumeet Baluja.

Ethics declarations

Conflict of interest

There are no potential conflicts of interest to disclose. This research was funded by the author's institution, Google, Inc.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 20,104 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Baluja, S. The infinite doodler: expanding textures within tightly constrained manifolds. Vis Comput 39, 3271–3283 (2023). https://doi.org/10.1007/s00371-023-03025-3

