Teach machine to learn: hand-drawn multi-symbol sketch recognition in one-shot

Applied Intelligence

Abstract

The ability to learn sequentially from a few examples and to reuse previous knowledge is an important milestone on the path to artificial general intelligence. In this paper, we propose Teach Machine to Learn (TML), a few-shot learning model for hand-drawn multi-symbol sketch recognition. The model decomposes a multi-symbol sketch into stroke primitives and then explains the observed sequences under a Bayesian criterion. A Bidirectional Long Short-Term Memory (BiLSTM) encoder is employed to encode the stroke primitives, and a probabilistic Hidden Markov Model (HMM) is constructed for inference and recognition of the complete sketch. The method is evaluated on the challenging task of hand-drawn multi-symbol sketch recognition using two public datasets. The comparative results indicate that the proposed method outperforms current image-based deep models in recognition accuracy. Furthermore, our method can continuously learn new concepts, even from a single example. The code is available at https://github.com/chongyupan/Teach-Machine-to-Learn.
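
To make the idea of explaining a primitive sequence with an HMM under a Bayesian criterion more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each symbol class has its own small discrete HMM over stroke-primitive labels, scores a sequence with the forward algorithm, and picks the class with the highest posterior. The class names (`symbol_A`, `symbol_B`), the primitive labels (`line`, `arc`), and all probabilities are hypothetical values chosen for illustration; the BiLSTM encoder is not modelled here.

```python
# Toy illustration only: the models, primitive labels, and probabilities below
# are invented for this sketch and are not taken from the paper.


def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | HMM) for a discrete HMM using the forward algorithm."""
    n_states = len(start)
    # alpha[s] = P(obs[0..t], state at time t = s)
    alpha = [start[s] * emit[s].get(obs[0], 0.0) for s in range(n_states)]
    for label in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states))
            * emit[s].get(label, 0.0)
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models, priors):
    """Return the class maximising P(class) * P(obs | class), plus posteriors."""
    scores = {
        name: priors[name] * forward_likelihood(obs, *hmm)
        for name, hmm in models.items()
    }
    total = sum(scores.values())
    posteriors = {
        name: (score / total if total > 0 else 0.0)
        for name, score in scores.items()
    }
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Two hypothetical left-to-right HMMs: (start probs, transitions, emissions).
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    # symbol_A: tends to start with a line, then continue with arcs
    "symbol_A": ([1.0, 0.0], LEFT_TO_RIGHT,
                 [{"line": 0.9, "arc": 0.1}, {"line": 0.1, "arc": 0.9}]),
    # symbol_B: tends to start with an arc, then continue with lines
    "symbol_B": ([1.0, 0.0], LEFT_TO_RIGHT,
                 [{"line": 0.1, "arc": 0.9}, {"line": 0.9, "arc": 0.1}]),
}
PRIORS = {"symbol_A": 0.5, "symbol_B": 0.5}

if __name__ == "__main__":
    label, posteriors = classify(["line", "arc", "arc"], MODELS, PRIORS)
    print(label, posteriors)
```

In this toy setup, adding a new symbol class only requires adding one more entry to `MODELS` and `PRIORS`, which hints at why a per-class generative model can suit learning from very few examples; how TML actually builds its models from one example is described in the full paper, not here.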

Acknowledgments

This work was supported by the equipment pre-research sharing technology project of China (No. 41412030301).

Author information

Corresponding author

Correspondence to Jian Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pan, C., Huang, J., Gong, J. et al. Teach machine to learn: hand-drawn multi-symbol sketch recognition in one-shot. Appl Intell 50, 2239–2251 (2020). https://doi.org/10.1007/s10489-019-01607-0

  • DOI: https://doi.org/10.1007/s10489-019-01607-0
