
Stroke classification for sketch segmentation by fine-tuning a developmental VGGNet16

Multimedia Tools and Applications

Abstract

Sketch segmentation and labeling face two challenges: few samples and few features. 3D data-driven methods increase the number of samples with additional labeled 3D meshes, but they are not feasible for abstract sketches that have no corresponding 3D meshes. Methods based on handcrafted features need no 3D meshes, but they are sensitive to variations in strokes. To address these challenges, we explore transfer learning with a convolutional neural network (CNN), fine-tuning a pre-trained CNN to classify strokes for sketch segmentation. We propose a novel informative input for the CNN that makes the position information of strokes explicit. To improve fine-tuning during transfer learning, we propose adding grouped filter layers to the CNN so that its representational capacity grows incrementally. Compared with the state of the art, our method achieves a 9.7% improvement on the abstract sketch dataset and a 2% improvement on the sketch dataset that has corresponding 3D meshes.
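As a rough illustration of what a grouped filter layer is, the following Python sketch counts the weights of a convolutional layer with and without filter groups and applies a grouped 1x1 convolution to a single channel vector. It is not the authors' architecture: the abstract does not give the layer sizes or the number of groups, so the 512-channel 3x3 layer (chosen to resemble the deepest convolutional layers of VGGNet16), the group count of 4, and the helper names `conv_weight_count` and `grouped_pointwise` are all assumptions made for illustration.

```python
def conv_weight_count(c_in, c_out, kernel, groups=1):
    """Count the weights (biases excluded) of a 2D convolution with filter groups.

    Each group connects c_in/groups input channels to c_out/groups filters,
    so the total is (c_in / groups) * c_out * kernel * kernel.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel


def grouped_pointwise(x, weights, groups):
    """Apply a grouped 1x1 convolution to one pixel's channel vector x.

    weights[g] is the (c_out/groups) x (c_in/groups) matrix of group g;
    each group only sees its own slice of the input channels.
    """
    c_in_g = len(x) // groups
    out = []
    for g in range(groups):
        chunk = x[g * c_in_g:(g + 1) * c_in_g]
        for row in weights[g]:
            out.append(sum(w * v for w, v in zip(row, chunk)))
    return out


# Hypothetical sizes: a 512 -> 512 channel, 3x3 layer, dense versus 4 groups.
dense = conv_weight_count(512, 512, 3)
grouped = conv_weight_count(512, 512, 3, groups=4)
print(dense, grouped)  # 2359296 589824
```

Under these assumed sizes, the grouped layer has a quarter of the dense layer's weights, which suggests why such layers can add capacity to a pre-trained network at a modest parameter cost.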


Notes

  1. The two datasets are available at https://dl.acm.org/citation.cfm?id=2898351&preflayout=flat.


Acknowledgments

The work is supported by the National Key R&D Program of China (2018YFB0203904), NSFC from PRC (61872137, 61502158, 61502157, 61472131, 61772191), Hunan NSF (2017JJ3042), and Science and Technology Key Projects of Hunan Province (2015TP1004, 2015SK2087, 2015JC1001, 2016JC2012).

Author information


Corresponding author

Correspondence to Jin Yuan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhu, X., Yuan, J., Xiao, Y. et al. Stroke classification for sketch segmentation by fine-tuning a developmental VGGNet16. Multimed Tools Appl 79, 33891–33906 (2020). https://doi.org/10.1007/s11042-020-08706-y


  • DOI: https://doi.org/10.1007/s11042-020-08706-y
