Abstract
Sketch segmentation and labeling face two challenges: few samples and few features. 3D data-driven methods use additional labeled 3D meshes to increase the number of samples, but they are not feasible for abstract sketches that have no corresponding 3D meshes. Methods based on handcrafted features need no 3D meshes but are sensitive to variations in strokes. To address these challenges, we explore transfer learning with a convolutional neural network (CNN), fine-tuning a pre-trained CNN to classify strokes for sketch segmentation. We propose a novel informative input for the CNN that makes the position information of strokes explicit. To improve fine-tuning during transfer learning, we propose adding grouped filter layers to the CNN so that its representational capacity grows incrementally. Compared with state-of-the-art methods, our approach achieves a 9.7% improvement on the abstract sketch dataset and a 2% improvement on the sketch dataset that has corresponding 3D meshes.
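As a rough illustration of why grouped filter layers are a cheap way to add capacity, the sketch below counts the parameters of a convolution layer with and without filter groups. It is not the authors' architecture: the helper name conv_params and the 512-channel, 3x3 layer sizes are assumptions chosen for the example, since the abstract does not give layer dimensions.

```python
def conv_params(c_in: int, c_out: int, kernel: int, groups: int = 1) -> int:
    """Count weights plus biases of a 2D convolution with `groups` filter groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # With grouping, each output filter only sees c_in // groups input channels.
    weights = c_out * (c_in // groups) * kernel * kernel
    return weights + c_out  # one bias per output channel


# Hypothetical sizes (assumption): a 3x3 layer with 512 input and 512 output channels.
dense = conv_params(512, 512, 3)
grouped = conv_params(512, 512, 3, groups=4)
print(f"dense: {dense} parameters, grouped (4 groups): {grouped} parameters")
```

With the bias terms set aside, splitting the filters into g groups divides the weight count by g, so an added grouped layer contributes far fewer new parameters to fit on a small sketch dataset than a dense layer of the same width.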
Notes
The two datasets are available on the website https://dl.acm.org/citation.cfm?id=2898351&preflayout=flat.
Acknowledgments
The work is supported by the National Key R&D Program of China (2018YFB0203904), NSFC from PRC (61872137, 61502158, 61502157, 61472131, 61772191), Hunan NSF (2017JJ3042), and Science and Technology Key Projects of Hunan Province (2015TP1004, 2015SK2087, 2015JC1001, 2016JC2012).
Cite this article
Zhu, X., Yuan, J., Xiao, Y. et al. Stroke classification for sketch segmentation by fine-tuning a developmental VGGNet16. Multimed Tools Appl 79, 33891–33906 (2020). https://doi.org/10.1007/s11042-020-08706-y
DOI: https://doi.org/10.1007/s11042-020-08706-y