Sketch recognition using transfer learning

Sert, Mustafa; Boyacı, Emel

doi:10.1007/s11042-018-7067-1

Published: 03 January 2019

Volume 78, pages 17095–17112, (2019)

Multimedia Tools and Applications

Abstract

Humans have an excellent ability to recognize freehand sketch drawings despite their abstract and sparse structures. Understanding freehand sketches with automated methods is a challenging task due to the diversity and abstract structures of these sketches. In this paper, we propose an efficient freehand sketch recognition scheme, which is based on the feature-level fusion of Convolutional Neural Networks (CNNs) in the transfer learning context. Specifically, we analyse the performance of different layers of distinct ImageNet-pretrained CNNs and combine the best-performing layer features within a CNN-SVM pipeline for recognition. We also employ Principal Component Analysis (PCA) to reduce the dimensionality of the fused deep features, so that the recognition application remains efficient on limited-capacity devices. We perform evaluations on two real sketch benchmark datasets, namely Sketchy and TU-Berlin, to show the effectiveness of the proposed scheme. Our experimental results show that the feature-level fusion scheme with PCA achieves recognition accuracies of 97.91% and 72.5% on the Sketchy and TU-Berlin datasets, respectively. The TU-Berlin result is promising when compared with the human recognition accuracy of 73.1% on the same dataset. We also develop a sketch recognition application for smart devices to demonstrate the proposed scheme.
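
The fusion and dimensionality-reduction steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature matrices are random stand-ins for CNN layer activations, the feature sizes (512 and 256), the 64 retained components, and the per-network L2 normalization are all assumptions made for illustration, and the function names are hypothetical. In the scheme the abstract describes, the reduced features would then be passed to an SVM classifier, which is omitted here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (an assumed preprocessing step)."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def fuse_features(feats_a, feats_b):
    """Feature-level fusion: concatenate the per-sample feature vectors."""
    return np.hstack([l2_normalize(feats_a), l2_normalize(feats_b)])

def fit_pca(x, n_components):
    """Return the data mean and the top principal directions via SVD."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(x, mean, components):
    """Project samples onto the retained principal directions."""
    return (x - mean) @ components.T

rng = np.random.default_rng(0)
# Random stand-ins for activations taken from one layer of each of two
# pretrained CNNs; the sample count and dimensions are hypothetical.
feats_a = rng.normal(size=(200, 512))
feats_b = rng.normal(size=(200, 256))

fused = fuse_features(feats_a, feats_b)              # shape (200, 768)
mean, components = fit_pca(fused, n_components=64)   # components: (64, 768)
reduced = apply_pca(fused, mean, components)         # shape (200, 64)
print(fused.shape, reduced.shape)
```

At inference time, the same `mean` and `components` fitted on the training features would be reused to project a new sketch's fused feature vector, so only the lower-dimensional vector needs to be classified on the device.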

Acknowledgments

The authors thank Berkay Selbes for running the feature extraction time experiments.

Author information

Authors and Affiliations

Department of Computer Engineering, Başkent University, 06790, Ankara, Turkey
Mustafa Sert & Emel Boyacı

Corresponding author

Correspondence to Mustafa Sert.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sert, M., Boyacı, E. Sketch recognition using transfer learning. Multimed Tools Appl 78, 17095–17112 (2019). https://doi.org/10.1007/s11042-018-7067-1

Received: 06 April 2018
Revised: 31 October 2018
Accepted: 11 December 2018
Published: 03 January 2019
Issue Date: 30 June 2019
DOI: https://doi.org/10.1007/s11042-018-7067-1
