DOI: 10.1145/3503161.3547810
research-article

Cloud2Sketch: Augmenting Clouds with Imaginary Sketches

Published: 10 October 2022

Abstract

Have you ever looked up at the sky and imagined what the clouds look like? In this work, we present a task that augments clouds in the sky with imagined sketches. Unlike generic image-to-sketch translation, this task poses unique challenges: real-world clouds resemble recognizable objects to varying degrees; generating a sketch without retrieval can produce something unrecognizable; a sketch retrieved from a dataset cannot be used directly because its shape does not match the cloud; and the optimal imagined sketch is subjective. We propose Cloud2Sketch, a novel self-supervised pipeline to tackle these challenges. First, we pre-process cloud images with a cloud detector and a thresholding algorithm to obtain cloud contours. Then, the cloud contours are passed to a retrieval module that retrieves sketches with similar geometric shapes. Finally, we adopt a novel sketch translation model with built-in free-form deformation to align the sketches to the cloud contours. To facilitate training, we propose Sketchy Zoo, an icon-based sketch collection. Extensive experiments validate the effectiveness of our method both qualitatively and quantitatively.
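The first two stages described above (thresholding to isolate a cloud region, then retrieving a sketch with a similar geometric shape) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which threshold or shape descriptor is used, so the fixed brightness threshold, the use of the first two Hu moment invariants, and the function names `threshold_mask`, `hu_first_two`, and `retrieve` are all assumptions chosen for illustration. The cloud detector and the free-form deformation stage are not covered.

```python
from math import hypot


def threshold_mask(gray, t):
    """Binary mask: 1 where brightness exceeds t (assumed to mean 'cloud')."""
    return [[1 if v > t else 0 for v in row] for row in gray]


def hu_first_two(mask):
    """First two Hu moment invariants of a binary mask.

    They are unchanged by translation, uniform scaling, and rotation,
    which makes them a simple stand-in for a shape descriptor.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = len(pts)
    if m00 == 0:
        raise ValueError("mask contains no foreground pixels")
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    # Normalized central moments of order 2: divide by m00 ** 2.
    n20, n02, n11 = mu20 / m00 ** 2, mu02 / m00 ** 2, mu11 / m00 ** 2
    return (n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2)


def retrieve(query_mask, gallery):
    """Return the gallery key whose descriptor is closest to the query's."""
    q = hu_first_two(query_mask)

    def distance(name):
        g = hu_first_two(gallery[name])
        return hypot(q[0] - g[0], q[1] - g[1])

    return min(gallery, key=distance)


# Hypothetical toy data: a bright vertical streak on a dark sky.
gray = [[50, 200, 50] for _ in range(6)]
cloud = threshold_mask(gray, 128)
gallery = {
    "bar": [[1] * 6],                      # horizontal bar sketch
    "square": [[1] * 3 for _ in range(3)],  # square sketch
}
print(retrieve(cloud, gallery))  # prints "bar"
```

The vertical streak matches the horizontal bar because the descriptor is rotation invariant. A retrieved sketch would still need to be deformed to fit the cloud contour, which is the role the abstract assigns to the translation model.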

Supplementary Material

M4V File (MM22-fp297.m4v)
Presentation video

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. cloud augmentation
2. image-to-sketch generation
3. shape alignment
4. sketch synthesis

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
