SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches

Liu, Fang; Zou, Changqing; Deng, Xiaoming; Zuo, Ran; Lai, Yu-Kun; Ma, Cuixia; Liu, Yong-Jin; Wang, Hongan

doi:10.1007/978-3-030-58529-7_42

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12364)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Sketch-based image retrieval (SBIR) has been a popular research topic in recent years. Existing works concentrate on mapping the visual information of sketches and images to a semantic space at the object level. In this paper, for the first time, we study the fine-grained scene-level SBIR problem which aims at retrieving scene images satisfying the user’s specific requirements via a freehand scene sketch. We propose a graph embedding based method to learn the similarity measurement between images and scene sketches, which models the multi-modal information, including the size and appearance of objects as well as their layout information, in an effective manner. To evaluate our approach, we collect a dataset based on SketchyCOCO and extend the dataset using Coco-stuff. Comprehensive experiments demonstrate the significant potential of the proposed approach on the application of fine-grained scene-level image retrieval.
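To make the idea of comparing scenes by object size and layout concrete, here is a minimal sketch in Python. It is not the learned graph embedding described in the abstract: it is a hand-crafted stand-in that treats objects as graph nodes and pairwise center distances as graph edges, and it leaves out appearance features entirely. The names `SceneObject`, `scene_distance`, `rank_images`, the `missing_penalty` parameter, and the toy gallery are all hypothetical, and the code assumes each category appears at most once per scene.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """One object (graph node): a category plus a normalized bounding box."""
    category: str
    x: float  # box center x in [0, 1]
    y: float  # box center y in [0, 1]
    w: float  # box width in [0, 1]
    h: float  # box height in [0, 1]


def scene_distance(sketch_objs, image_objs, missing_penalty=1.0):
    """Hand-crafted distance between two scenes (lower means more similar).

    Assumption: each category appears at most once per scene, so nodes
    can be matched by category name alone.
    """
    s = {o.category: o for o in sketch_objs}
    m = {o.category: o for o in image_objs}
    shared = sorted(set(s) & set(m))

    # Category term: penalize objects present in only one of the two scenes.
    dist = missing_penalty * len(set(s) ^ set(m))

    # Size term (node attribute): difference in normalized box area.
    dist += sum(abs(s[c].w * s[c].h - m[c].w * m[c].h) for c in shared)

    # Layout term (edge attribute): difference in pairwise center distance.
    for i, a in enumerate(shared):
        for b in shared[i + 1:]:
            d_sketch = math.hypot(s[a].x - s[b].x, s[a].y - s[b].y)
            d_image = math.hypot(m[a].x - m[b].x, m[a].y - m[b].y)
            dist += abs(d_sketch - d_image)
    return dist


def rank_images(sketch_objs, gallery):
    """Return gallery image names ordered from most to least similar."""
    return sorted(gallery, key=lambda name: scene_distance(sketch_objs, gallery[name]))


# Toy query sketch and gallery (made-up values, for illustration only).
sketch = [
    SceneObject("giraffe", 0.30, 0.50, 0.20, 0.60),
    SceneObject("tree", 0.80, 0.40, 0.30, 0.70),
]
gallery = {
    "no_tree": [SceneObject("giraffe", 0.30, 0.50, 0.20, 0.60)],
    "different_layout": [
        SceneObject("giraffe", 0.10, 0.50, 0.10, 0.20),
        SceneObject("tree", 0.90, 0.40, 0.30, 0.70),
    ],
    "same_layout": [
        SceneObject("giraffe", 0.32, 0.50, 0.20, 0.60),
        SceneObject("tree", 0.80, 0.42, 0.30, 0.70),
    ],
}

ranking = rank_images(sketch, gallery)
print(ranking)
```

In this toy setup the image whose objects match the sketch in both size and relative position ranks first, and the image missing an object ranks last. A learned approach would replace the fixed terms with trained node and edge embeddings and would also account for appearance.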

F. Liu and C. Zou – Equal contributions.

Acknowledgements

This work was supported by the National Key Research and Development Plan (2016YFB1001200), Natural Science Foundation of China (61872346, 61725204, 61473276), Natural Science Foundation of Beijing (L182052), and Royal Society-Newton Advanced Fellowship (NA150431).

Author information

Authors and Affiliations

State Key Laboratory of Computer Science and Beijing Key Lab of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences, Beijing, China
Fang Liu, Xiaoming Deng, Ran Zuo, Cuixia Ma & Hongan Wang
University of Chinese Academy of Sciences, Beijing, China
Fang Liu, Ran Zuo, Cuixia Ma & Hongan Wang
HMI Laboratory, Huawei Technologies, Shenzhen, China
Changqing Zou
Cardiff University, Cardiff, Wales
Yu-Kun Lai
Tsinghua University, Beijing, China
Yong-Jin Liu

Corresponding authors

Correspondence to Xiaoming Deng, Cuixia Ma or Yong-Jin Liu.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 19204 KB)

About this paper

Cite this paper

Liu, F. et al. (2020). SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12364. Springer, Cham. https://doi.org/10.1007/978-3-030-58529-7_42

DOI: https://doi.org/10.1007/978-3-030-58529-7_42
Published: 13 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58528-0
Online ISBN: 978-3-030-58529-7
eBook Packages: Computer Science, Computer Science (R0)