Abstract
Sketch-based image retrieval (SBIR) is attracting growing interest in the computer vision community due to its relevance to the visual perception system and its potential application in a wide range of industries. The literature reports significant advances when models are evaluated on public datasets. However, when they are assessed in real environments, performance drops drastically. The main problem is that state-of-the-art (SOTA) SBIR models follow a supervised regime that depends strongly on a considerable number of labeled sketch-photo pairs, which is infeasible in real contexts. Therefore, we propose SBIR-BYOL, an extension of the well-known BYOL, to work in a bimodal scenario for sketch-based image retrieval. To this end, we also propose a two-stage self-supervised training methodology that exploits existing sketch-photo pairs and contour-photo pairs generated from photographs of a target catalog. We demonstrate the benefits of our model in eCommerce environments, where search is a critical component. Here, our self-supervised SBIR model shows an increase of over \(60\%\) in mAP.
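To make the idea of a bimodal BYOL more concrete, the following is a minimal sketch of the standard BYOL objective applied to a sketch-photo (or contour-photo) pair. It is not the authors' implementation: the function names, the symmetrized pairing of sketch and photo branches, and the value of `tau` are illustrative assumptions, and the encoders, projectors, and predictor that would produce these vectors are omitted.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (small epsilon avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in v)) + 1e-12
    return [x / norm for x in v]

def byol_loss(online_pred, target_proj):
    """Standard BYOL regression loss for one pair: 2 - 2 * cosine similarity.

    Ranges from 0 (same direction) to 4 (opposite directions).
    In a training framework, no gradient flows through the target branch.
    """
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))

def symmetric_bimodal_loss(pred_sketch, proj_photo, pred_photo, proj_sketch):
    """Assumed symmetrization: the sketch branch predicts the photo target
    and the photo branch predicts the sketch target."""
    return byol_loss(pred_sketch, proj_photo) + byol_loss(pred_photo, proj_sketch)

def ema_update(target_params, online_params, tau=0.99):
    """Target weights track the online weights as an exponential moving average.
    The value of tau is a placeholder, not taken from the paper."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]

# Hypothetical 2-D embeddings of a sketch and its matching photo.
loss = symmetric_bimodal_loss([1.0, 0.1], [0.9, 0.2], [0.8, 0.3], [1.0, 0.0])
```

Under this reading, the two modalities play the role that two augmented views of one image play in the original BYOL, so no negative pairs or class labels are needed; only the pairing between a sketch (or generated contour) and its photo is used.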
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose. The authors have no competing interests to declare that are relevant to the content of this article. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.
About this article
Cite this article
Saavedra, J.M., Morales, J. & Murrugarra-Llerena, N. SBIR-BYOL: a self-supervised sketch-based image retrieval model. Neural Comput & Applic 35, 5395–5408 (2023). https://doi.org/10.1007/s00521-022-07978-9
DOI: https://doi.org/10.1007/s00521-022-07978-9