Retrieving geometric information from images: the case of hand-drawn diagrams

Song, Dan; Wang, Dongming; Chen, Xiaoyu

doi:10.1007/s10618-017-0494-1

Published: 04 March 2017

Volume 31, pages 934–971, (2017)

Data Mining and Knowledge Discovery

Dan Song¹, Dongming Wang¹,² & Xiaoyu Chen¹

Abstract

This paper addresses the problem of retrieving meaningful geometric information implied in image data. We outline a general algorithmic scheme to solve the problem in any geometric domain. The scheme, which depends on the domain, may lead to concrete algorithms when the domain is properly and formally specified. Taking plane Euclidean geometry \({\mathbb {E}}\) as an example of the domain, we show how to formally specify \({\mathbb {E}}\) and how to concretize the scheme to yield algorithms for the retrieval of meaningful geometric information in \({\mathbb {E}}\). For images of hand-drawn diagrams in \({\mathbb {E}}\), we present concrete algorithms to retrieve typical geometric objects and geometric relations, as well as their labels, and demonstrate the feasibility of our algorithms with experiments. An example is presented to illustrate how nontrivial geometric theorems can be generated from retrieved geometric objects and relations and thus how implied geometric knowledge may be discovered automatically from images.

Notes

Note that the OCR Engine has to be trained with a dataset of samples before it is used for recognition.
We combine circle detection and line detection in \({\texttt {recognizeCircleAndLines}}(\mathrm {I}_2)\) because our algorithm takes into account the connection between the two kinds of detection. It may be desirable to split \({\texttt {recognizeCircleAndLines}}(\mathrm {I}_2)\) into two algorithms, one for each kind of detection.
Binarization serves to separate the background and the foreground of the grayscale image.
Resizing improves the efficiency of retrieval, since photographed images are usually larger than needed.
The first two conditions indicate that A and B are both incident to the line passing through \(P_0\) and \(P_{n-1}\), which can be verified by using condition (3) presented in Sect. 3.4, and are both between \(P_0\) and \(P_{n-1}\). The quantity \(\mu \) is introduced to measure the degree of correct recognition.

Acknowledgements

The authors wish to thank the referees for their helpful comments on an early version of the paper and acknowledge the support of the research funds from the State Key Laboratory of Software Development Environment under Grant Nos. SKLSDE-2015ZX-18 and SKLSDE-2016ZX-18 and from the Central Universities under Grant No. YWF-16-SXXY-01.

Author information

Authors and Affiliations

LMIB-SKLSDE, School of Mathematics and Systems Science, Beihang University, Beijing, 100191, China
Dan Song, Dongming Wang & Xiaoyu Chen
Centre National de la Recherche Scientifique, 3 Rue Michel-Ange, 75794 Paris Cedex 16, France
Dongming Wang

Corresponding author

Correspondence to Xiaoyu Chen.

Additional information

Responsible editor: Kristian Kersting.

Appendices

Appendix 1: Algorithmic description of \({\texttt {recognizeCircleAndLines}}(\mathrm {I}_2)\)

In Algorithm 4, the following subalgorithms are called:

\(\texttt {detectKeyPoints}(\mathrm {I}_2)\) recognizes key points in \(\mathrm {I}_2\) according to Definition 2;
\(\texttt {combineKp2PntInt}(\mathrm {K}_p)\) collects points of interest from clusters of key points in \(\mathrm {K}_p\);
\(\texttt {assignPntOfInterest}(\mathrm {C}_{rv}, \mathrm {P}_I)\) assigns s to \(c.\textit{spoint}\) and e to \(c.\textit{epoint}\) for each \(c\in \mathrm {C}_{rv}\), where s is the point of interest nearest to \(c.\textit{spoint}\) and e is the point of interest nearest to \(c.\textit{epoint}\);
\(\texttt {sortCurves}(\mathrm {C}_{rv})\) sorts the elements of \(\mathrm {C}_{rv}\) in descending order of length.
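The last two subalgorithms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Curve` record, the helper names, and the use of polyline length as the sorting key are assumptions made here for illustration, since the paper does not present these steps formally.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Curve:
    """Hypothetical curve record: a traced point sequence plus its endpoints."""
    points: List[Point]
    spoint: Point  # start point
    epoint: Point  # end point


def nearest(p: Point, candidates: List[Point]) -> Point:
    """Return the candidate closest to p in Euclidean distance."""
    return min(candidates, key=lambda q: math.dist(p, q))


def assign_pnt_of_interest(curves: List[Curve], points_of_interest: List[Point]) -> None:
    """Snap both endpoints of every curve to the nearest point of interest."""
    for c in curves:
        c.spoint = nearest(c.spoint, points_of_interest)
        c.epoint = nearest(c.epoint, points_of_interest)


def curve_length(c: Curve) -> float:
    """Polyline length of the traced points (an assumed notion of 'length')."""
    return sum(math.dist(a, b) for a, b in zip(c.points, c.points[1:]))


def sort_curves(curves: List[Curve]) -> List[Curve]:
    """Return the curves in descending order of length."""
    return sorted(curves, key=curve_length, reverse=True)
```

Snapping endpoints to shared points of interest lets curves that meet at roughly the same place in a hand-drawn diagram refer to one common point.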

The above subalgorithms are rather simple and we do not present them formally. The other subalgorithms \(\texttt {detectCurves}(\mathrm {I}_2,\mathrm {K}_p)\), \(\texttt {computeCrvAttributes}(\mathrm {C}_{rv})\), and \(\texttt {determineEntityType}(\mathrm {C}_{rv})\) of Algorithm 4 are formulated as Algorithms 5, 6, and 7 respectively.

In Algorithm 5, \(\texttt {getNextPoint}(S)\) is called to find the next point in the 8-neighbours of the last point of S which does not occur in S, which can be added to S, and which will lead to a key point.
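A simplified Python sketch of this neighbour search is given below. It assumes the foreground of \(\mathrm {I}_2\) is available as a set of (row, column) pixels, and it checks only the first two conditions (an 8-neighbour that is foreground and not yet in S). The third condition, that the point will lead to a key point, is not specified in this appendix and is omitted. The names `get_next_point` and `trace_curve` are illustrative.

```python
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)

# Offsets of the 8 neighbours of a pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                     (0, -1),           (0, 1),
                     (1, -1),  (1, 0),  (1, 1)]


def get_next_point(S: List[Pixel], foreground: Set[Pixel]) -> Optional[Pixel]:
    """Return an 8-neighbour of the last point of S that is a foreground
    pixel and does not already occur in S, or None if there is none."""
    last_r, last_c = S[-1]
    visited = set(S)
    for dr, dc in NEIGHBOUR_OFFSETS:
        cand = (last_r + dr, last_c + dc)
        if cand in foreground and cand not in visited:
            return cand
    return None


def trace_curve(start: Pixel, foreground: Set[Pixel], key_points: Set[Pixel]) -> List[Pixel]:
    """Extend a point sequence from start until a key point is reached
    or no admissible neighbour remains."""
    S = [start]
    while True:
        nxt = get_next_point(S, foreground)
        if nxt is None:
            return S
        S.append(nxt)
        if nxt in key_points:
            return S
```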

In Algorithm 6,

\(\texttt {random}(n)\) generates a randomized integer in [1, n];
\(\texttt {getRadi}(T_c[1], T_c[p_{idx}], T_c[n])\) computes the radius of the circle c passing through three points \(T_c[1], T_c[p_{idx}], T_c[n]\);
\(\texttt {distance}(T_c[j], \texttt {line}(T_c[1], T_c[n]))\) computes the distance from the point \(T_c[j]\) to the line passing through \(T_c[1]\) and \(T_c[n]\);
\(\texttt {polyfit}(T_c[1,\ldots ,N_2],t)\) computes the coefficients of a polynomial of degree t which fits the sequence \(T_c[1,\ldots ,N_2]\) of points;
\(\texttt {polySlope}(coef_s,P)\) computes the slope of a line tangent to the curve defined by the polynomial with coefficients \(coef_s\) at the point P;
\(\texttt {tangentVecAngle}(slope,T_c[1,\ldots ,N_2])\) computes the directed angle from the vector (1, 0) to \((v_x,slope)\), where \(v_x=1\) if the X-coordinate of \(T_c[1]\) is bigger than that of \(T_c[N_2]\), and \(v_x=-1\) otherwise.
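The geometric helpers listed above rest on standard formulas, which the following Python sketch spells out. It is one possible reading rather than the authors' code. The function names are illustrative, polynomial coefficients are assumed to be ordered from highest degree to lowest, and the directed angle is returned in \((-\pi , \pi ]\), which may differ from the range used in the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def get_radi(p1: Point, p2: Point, p3: Point) -> float:
    """Radius of the circle through three points: R = abc / (4 * area).
    Collinear points give an infinite radius."""
    twice_area = abs(cross(p1, p2, p3))
    if twice_area == 0.0:
        return math.inf
    a, b, c = math.dist(p1, p2), math.dist(p2, p3), math.dist(p3, p1)
    return a * b * c / (2.0 * twice_area)


def distance_to_line(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (a != b)."""
    return abs(cross(a, b, p)) / math.dist(a, b)


def poly_slope(coefs: Sequence[float], p: Point) -> float:
    """Slope of the tangent to y = poly(x) at the X-coordinate of p,
    with coefs ordered from highest degree to lowest (an assumption)."""
    t = len(coefs) - 1
    x = p[0]
    return sum((t - i) * a * x ** (t - i - 1) for i, a in enumerate(coefs[:-1]))


def tangent_vec_angle(slope: float, pts: Sequence[Point]) -> float:
    """Directed angle from (1, 0) to (v_x, slope), following the text:
    v_x = 1 if the first point has the larger X-coordinate, else -1."""
    v_x = 1.0 if pts[0][0] > pts[-1][0] else -1.0
    return math.atan2(slope, v_x)
```

A large radius from `get_radi` together with small values of `distance_to_line` along the curve suggests a straight segment, while a moderate radius suggests an arc.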

In Algorithm 7, \(\texttt {combine2curves}(\mathrm {C}_{rv}[i],\mathrm {C}_{rv}[m_i])\) combines the two curves \(\mathrm {C}_{rv}[i]\) and \(\mathrm {C}_{rv}[m_i]\) of points into one curve and then updates the attributes of the new curve.

Appendix 2: Selected experimental results

In Table 5, basic geometric entities and basic geometric relations, listed in the subcolumns under “Basic geometric information”, are represented in the form of T: n, where T is the type of the geometric entity or relation and n is the number of instances of geometric entities or relations of type T retrieved from the image shown in the column “Image of diagram”; “Size” denotes the image size in pixels; “Time” denotes the running time in seconds for retrieving the geometric information from the image.

Table 5 Test results

About this article

Cite this article

Song, D., Wang, D. & Chen, X. Retrieving geometric information from images: the case of hand-drawn diagrams. Data Min Knowl Disc 31, 934–971 (2017). https://doi.org/10.1007/s10618-017-0494-1

Received: 31 January 2016
Accepted: 07 February 2017
Published: 04 March 2017
Issue Date: July 2017
DOI: https://doi.org/10.1007/s10618-017-0494-1
