Building Digital Ink Recognizers Using Data Mining: Distinguishing between Text and Shapes in Hand Drawn Diagrams

  • Conference paper
Trends in Applied Intelligent Systems (IEA/AIE 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6096)

Abstract

The low accuracy rates of text-shape dividers for digital ink diagrams are hindering their use in real-world applications. While handwriting recognition is well advanced and many recognition approaches have been proposed for hand-drawn sketches, the division of text and drawing has received less attention. The choice of features and algorithms is critical to the success of the recognition, yet heuristics currently form the basis of selection. We propose the use of data mining techniques to automate the process of building text-shape recognizers. This systematic approach identifies the algorithms best suited to the specific problem and generates the trained recognizer. We have generated dividers using data mining and training with diagrams from three domains. An evaluation of our new recognizer against two other recognizers, on realistic diagrams from two different domains, shows it to be more successful at dividing shapes and text: 95.2% of strokes were correctly classified, compared with 86.9% and 83.3% for the two others.
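The idea of letting the data choose the algorithm, rather than a heuristic, can be illustrated with a small sketch. This is not the authors' pipeline: the stroke features (path length and curvature), the synthetic data, the three candidate classifiers, and all function names below are assumptions made purely for illustration. The sketch scores each candidate with k-fold cross-validation on per-stroke accuracy and keeps the best one.

```python
import random


def make_strokes(n, seed=0):
    """Generate hypothetical stroke data: ((path_length, curvature), label)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        if rng.random() < 0.5:
            # Assumption: text strokes tend to be short and curvy.
            data.append(((rng.gauss(20, 8), rng.gauss(3.0, 1.0)), "text"))
        else:
            # Assumption: shape strokes tend to be long and straighter.
            data.append(((rng.gauss(80, 25), rng.gauss(1.0, 0.6)), "shape"))
    return data


def train_majority(train):
    """Baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top


def train_centroid(train):
    """Nearest-centroid classifier on the raw feature vectors."""
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def train_stump(train):
    """Decision stump: best single-feature threshold on the training set."""
    best = None  # (correct_count, feature, threshold, label_below, label_above)
    for f in range(len(train[0][0])):
        for x, _ in train:
            t = x[f]
            for below, above in (("text", "shape"), ("shape", "text")):
                correct = sum(
                    (below if xi[f] <= t else above) == y for xi, y in train
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, below, above)
    _, f, t, below, above = best
    return lambda x: below if x[f] <= t else above


def cross_val_accuracy(train_fn, data, k=5):
    """Mean per-stroke accuracy over k folds."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = train_fn(train)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(scores) / len(scores)


def select_best(candidates, data, k=5):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: cross_val_accuracy(fn, data, k) for name, fn in candidates.items()}
    return max(scores, key=scores.get), scores


CANDIDATES = {
    "majority": train_majority,
    "centroid": train_centroid,
    "stump": train_stump,
}

if __name__ == "__main__":
    best, scores = select_best(CANDIDATES, make_strokes(200, seed=1))
    for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {acc:.3f}")
    print("selected:", best)
```

A real divider would use many more ink features and stronger learners; the point of the sketch is only that the selection step is driven by measured accuracy on held-out strokes.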

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Blagojevic, R., Plimmer, B., Grundy, J., Wang, Y. (2010). Building Digital Ink Recognizers Using Data Mining: Distinguishing between Text and Shapes in Hand Drawn Diagrams. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13022-9_36

  • DOI: https://doi.org/10.1007/978-3-642-13022-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13021-2

  • Online ISBN: 978-3-642-13022-9

  • eBook Packages: Computer Science, Computer Science (R0)
