Technical Section
An efficient, classification-based approach for grouping pen strokes into objects☆
Introduction
One of the most difficult challenges in sketch understanding is clustering the strokes into distinct objects. Often there are no clear spatial or temporal boundaries between the objects in a freely drawn sketch, and the primary clue that a group of strokes is supposed to be grouped together is that they form a meaningful shape. This is the inherent chicken-and-egg problem for sketch recognition — shapes cannot be recognized until their strokes have been grouped together, but the strokes cannot be grouped until the shapes have been recognized.
The clustering problem is so challenging that many existing recognition systems avoid it by placing constraints on the way users draw. For example, some systems require the user to provide explicit cues, such as button clicks or pauses, to demarcate each object (e.g., [14]); others require each symbol to be drawn with a single stroke (e.g., [26], [33], [25], [13]) or a temporally contiguous sequence of strokes (e.g., [8]). While these constraints aid recognition, they do not generally match the way people naturally draw [2].
To solve the problem of simultaneous grouping and recognition, Kara et al.'s [17] mark-group-recognize technique relies on “marker symbols”: symbols that can be accurately and inexpensively extracted from a continuous stream of pen strokes, and that tend to separate the remaining symbols. While this approach is efficient, it is limited to domains that have effective markers.
Other recent work has focused on the problem of single-stroke classification. Jain et al. [16] use stroke length and stroke curvature to distinguish text from non-text in online documents. Qi et al. [23] present a method for using conditional random fields to classify strokes in organizational chart diagrams as either connectors or boxes. Addressing a similar problem, Bishop et al. [5], Patel et al. [21], and Bhat and Hammond [4] present methods that integrate shape and temporal information for classifying individual strokes as either text or drawing strokes. Wang et al. [32] improve on Bishop et al.'s method. Indermühle et al. [15] distinguish text from non-text using a top-down approach that segments a document into regions of text and non-text. They also developed a bottom-up approach that examines the neighborhoods of individual pixels and considers connected components. More recently, Delaye and Liu [7] used conditional random fields to jointly model local, spatial, and temporal information to achieve accurate discrimination of text from non-text. Finally, Blagojevic and Plimmer [6] present an approach in which pen strokes are characterized with quantitative features and a classifier is used to distinguish text from non-text.
The goal of most previous single-stroke classification techniques is to identify the text strokes so they can be sent to a character recognizer, while the shape strokes (i.e., strokes comprising graphic objects) are left ungrouped. Our approach goes further, grouping the shape strokes as well. We aim to achieve accurate grouping in domains for which marker symbols do not exist and single-stroke classification is more involved. We present a two-stage clustering algorithm that first classifies pen strokes into different classes of objects, and then groups strokes with like classifications into clusters representing individual objects. Fig. 1 illustrates our approach. In the first step of processing (Fig. 1a), individual strokes are classified as belonging to text, gate, or wire objects. This classification spatially and temporally separates individual objects of the same class — as in Fig. 1b, which shows only strokes classified as gates — making the strokes easier to cluster. Once our technique has grouped the strokes in a sketch into distinct objects, there are many existing sketch recognizers that can be used to recognize them (e.g., [18], [20], [14], [3], [25]).
While our grouping technique is the primary contribution of the present work, we also make important contributions to stroke-level classification. First, while previous approaches to single-stroke classification were applied only to two-way classification (usually text vs. non-text), our approach is accurate on both three-way and four-way classification of text and different types of graphics. Second, for two-way classification, our approach achieves accuracy comparable to that of a state-of-the-art approach, and is significantly more efficient. Third, the separation between objects that results from our single-stroke classification technique enables our novel formulation of the grouping problem as an inexpensive classification task.
Related work
In addition to the work described above, a growing body of free-sketch recognition research involves simultaneous stroke grouping and symbol recognition. Some grouping techniques rely directly on geometric properties of the strokes. For example, Saund et al. [28] decompose a sketch into sequences of contiguous line segments corresponding to line art, and “blobs” of dense ink corresponding to text. They use Gestalt principles to group these objects into larger structures. The approach is
Single-stroke classification
Our goal at this stage is to classify strokes into general categories to facilitate stroke grouping. We use a feature-based machine-learning approach with a standard classification algorithm and a feature set that extends the set presented in Patel et al. [21]. Our classifier uses AdaBoost with decision trees and is trained using WEKA [9]. Specifically, the classifier is AdaBoostM1 using 10 iterations, a seed of 1, no resampling, and a weight threshold of 100. The base classifier is a pruned J48
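As a rough illustration of the boosting scheme named above, the following Python sketch runs AdaBoost.M1 by reweighting (rather than resampling) for 10 rounds. It is not the WEKA configuration used in this work: single-split decision stumps stand in for the pruned J48 trees, and the two features (stroke length and mean curvature) and their toy values are hypothetical.

```python
import math
from collections import defaultdict

def weighted_majority(labels, weights):
    """Return the label with the largest total weight (None if empty)."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get) if totals else None

def fit_stump(X, y, w):
    """Pick the one-feature threshold split with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            ly = weighted_majority([y[i] for i in left], [w[i] for i in left])
            ry = weighted_majority([y[i] for i in right], [w[i] for i in right])
            if ry is None:  # empty right side: reuse the left label
                ry = ly
            err = (sum(w[i] for i in left if y[i] != ly)
                   + sum(w[i] for i in right if y[i] != ry))
            if best is None or err < best[0]:
                best = (err, f, t, ly, ry)
    return best

def stump_predict(stump, row):
    _, f, t, ly, ry = stump
    return ly if row[f] <= t else ry

def adaboost_m1(X, y, iterations=10):
    """AdaBoost.M1: returns a list of (vote weight, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(iterations):
        stump = fit_stump(X, y, w)
        err = stump[0]
        if err >= 0.5:          # weak learner no better than required; stop
            break
        beta = max(err, 1e-10) / (1.0 - err)
        ensemble.append((math.log(1.0 / beta), stump))
        if err == 0.0:          # perfect stump; nothing left to reweight
            break
        # Down-weight correctly classified strokes, then renormalize.
        w = [wi * (beta if stump_predict(stump, xi) == yi else 1.0)
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps."""
    votes = defaultdict(float)
    for alpha, stump in ensemble:
        votes[stump_predict(stump, row)] += alpha
    return max(votes, key=votes.get)

# Hypothetical toy data: [stroke length, mean curvature] per stroke.
X = [[10, 0.90], [11, 0.85], [12, 0.80],
     [50, 0.50], [55, 0.45], [60, 0.55],
     [100, 0.05], [110, 0.03], [120, 0.02]]
y = ["text"] * 3 + ["gate"] * 3 + ["wire"] * 3
model = adaboost_m1(X, y, iterations=10)
print([predict(model, row) for row in X])
```

A single stump can only output two labels, so the three-way problem here is solvable only because boosting combines several stumps by weighted vote.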
Grouping
Classifying the individual pen strokes reduces the complexity of stroke grouping by decomposing the problem into smaller, easier problems, one for each class. However, even for the strokes in a single class, brute force grouping techniques, such as attempting to recognize all combinations of strokes, are still too expensive for interactive systems. Instead, we use a classifier to determine if each pair of strokes of the same class should be joined to form a cluster. If a stroke is joined with
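The pairwise formulation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as implemented: `should_join` is a hypothetical stand-in for the trained pairwise classifier (here a simple distance threshold), and merging joined pairs transitively with union-find is an assumption about how pairwise decisions are combined into clusters.

```python
import math
from collections import defaultdict
from itertools import combinations

def group_strokes(strokes, labels, should_join):
    """Cluster stroke indices; only pairs with the same class label are tested."""
    parent = list(range(len(strokes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    for members in by_class.values():
        for i, j in combinations(members, 2):
            if should_join(strokes[i], strokes[j]):
                parent[find(i)] = find(j)   # merge the two clusters

    clusters = defaultdict(list)
    for i in range(len(strokes)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())

def close_enough(a, b, threshold=0.5):
    """Hypothetical join rule: minimum point-to-point distance under a threshold."""
    return min(math.dist(p, q) for p in a for q in b) <= threshold

# Toy strokes as point lists; labels would come from single-stroke classification.
strokes = [
    [(0, 0), (1, 0)],      # gate stroke
    [(1, 0), (1, 1)],      # gate stroke touching the first
    [(10, 0), (11, 0)],    # gate stroke far away
    [(1, 0.5), (5, 0.5)],  # wire touching the first gate
]
labels = ["gate", "gate", "gate", "wire"]
groups = group_strokes(strokes, labels, close_enough)
print(groups)
```

Because the wire stroke is never compared against gate strokes, it stays in its own cluster even though it touches the first gate, which is the simplification that prior classification is meant to provide.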
Datasets
We tested both our single-stroke and clustering classifiers on freely drawn sketches in three different domains: digital circuits (Table 2), family trees (Table 3), and solutions to statics problems (Table 4, Table 5). We collected eight digital circuit sketches from each of the 24 subjects for a total of 192 sketches. Half of these sketches were copied from a picture of a circuit, while the rest were synthesized from a logical equation. Additionally, half of the sketches were drawn on a Tablet
Single-stroke classification results
To perform grouping, we classify strokes into three or more classes. However, to benchmark our classifier, here we restrict it to two classes, text vs. non-text, and compare it to three state-of-the-art methods: the entropy method from [4], Microsoft's InkAnalyzer, which is a commercial product, and Blagojevic and Plimmer's Divider [6]. For comparisons with the entropy method and InkAnalyzer, we use all four of our sketch domains including digital circuits, family trees, and statics solutions
Grouping results
We trained all of the classifiers in a user-holdout fashion in which data from one subject was selected for testing and data from the other subjects was used for training. We trained classifiers for each of the four domains separately. Results are averaged across subjects.
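The user-holdout protocol can be sketched as below. The function names, the `train_fn`/`predict_fn` interface, and the majority-class baseline are hypothetical and serve only to show how one subject is held out per fold and how per-subject results are averaged.

```python
from collections import Counter

def user_holdout_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

def user_holdout_accuracy(samples, labels, subject_ids, train_fn, predict_fn):
    """Train on all other subjects, test on the held-out one, average over subjects."""
    per_subject = []
    for _, train, test in user_holdout_splits(subject_ids):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test)
        per_subject.append(correct / len(test))
    return sum(per_subject) / len(per_subject)

# Hypothetical baseline: always predict the most common training label.
def train_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
samples = list(range(6))  # placeholder feature vectors
labels = ["a", "a", "a", "b", "a", "a"]
accuracy = user_holdout_accuracy(samples, labels, subject_ids,
                                 train_majority, predict_majority)
print(accuracy)
```

Averaging per-subject accuracies, rather than pooling all test strokes, keeps subjects who drew more strokes from dominating the reported figure.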
The Inductive Pairwise Classifiers were trained with AdaBoostM1 and J48 decision trees in WEKA, using the same parameters as used for single-stroke classification. For the Thresholded Pairwise Classifier, training consists of
Single-stroke classification
Single-stroke classification is an important part of our method. It simplifies pairwise comparisons, leading to more efficient and accurate grouping. Our classifier uses adaptive boosting (AdaBoost) with decision trees and a set of features that extends previous efforts at single-stroke classification [21]. For the task of classifying text vs. non-text, our method performed better than the entropy method described in [4] and Microsoft's InkAnalyzer.
Additionally, our approach had comparable
Conclusion
Grouping strokes in freely drawn sketches is so challenging that few recognition systems attempt it. Our work is a significant step toward solving this important problem. We have shown that separating pen strokes into different classes can make the grouping process both more efficient and more effective. We achieve the separation using an accurate multi-way single-stroke classifier.
Previous approaches to single-stroke classification were applied only to two-way classification (usually text vs.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant nos. 0729422 & 0735695.
References (34)
- et al. Using data mining for digital ink recognition: dividing text and shapes in sketched diagrams. Comput Graph (2011)
- et al. SpeedSeg: a technique for segmenting pen strokes using pen speed. Comput Graph (2011)
- et al. An image-based, trainable symbol recognizer for hand-drawn sketches. Comput Graph (2005)
- et al. An efficient graph-based recognizer for hand-drawn symbols. Comput Graph (2007)
- Alvarado Christine, Davis Randall. Dynamically constructed Bayes nets for multi-domain sketch understanding. In:...
- Alvarado Christine, Lazzareschi Michael. Properties of real-world digital logic diagrams. In: PLT 2007. First...
- Anthony Lisa, Wobbrock Jacob O. $n-protractor: a fast and accurate multistroke recognizer. In: Proceedings of graphics...
- Bhat Akshay, Hammond Tracy. Using entropy to distinguish shape versus text in hand-drawn diagrams. In: Proceedings of...
- Bishop Christopher M, Svensen Markus, Hinton Geoffrey E. Distinguishing text from graphics in on-line handwritten ink....
- et al. Text/non-text classification in online handwritten documents with conditional random fields
- Combining geometry and domain knowledge to interpret hand-drawn diagrams. Comput Graph
- The WEKA data mining software: an update. ACM SIGKDD Explorations Newsl
- Recognition and beautification of multi-stroke symbols in digital ink. Comput Graph
☆ This article was recommended for publication by Beryl Plimmer.