A fast technique for comparing graph representations with applications to performance evaluation

Document Analysis and Recognition

Abstract.

Finding efficient, effective ways to compare graphs arising from recognition processes with their corresponding ground-truth graphs is an important step toward more rigorous performance evaluation.

In this paper, we examine in detail the graph probing paradigm we first put forth in the context of our work on table understanding and later extended to HTML-coded Web pages. We present a formalism showing that graph probing provides a lower bound on the true edit distance between two graphs. From an empirical standpoint, the results of two simulation studies and an experiment using scanned pages show that graph probing correlates well with true edit distance. Moreover, our technique is very fast; graphs with tens or hundreds of thousands of vertices can be compared in mere seconds. Ease of implementation, scalability, and speed of execution make graph probing an attractive alternative for graph comparison.
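The abstract does not spell out what a probe is, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes one simple kind of probe: asking each graph how many of its vertices have a given in-degree and out-degree, then summing the absolute differences between the two graphs' answers. The function names and the (vertices, edges) graph encoding are hypothetical choices made for this sketch.

```python
from collections import Counter


def degree_probe_histogram(vertices, edges):
    """Count vertices by (in-degree, out-degree) pair.

    Assumed encoding (not from the paper): `vertices` is an iterable of
    hashable ids and `edges` is an iterable of directed (src, dst) pairs.
    """
    indeg = Counter()
    outdeg = Counter()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    # A Counter returns 0 for missing keys, so isolated vertices map to (0, 0).
    return Counter((indeg[v], outdeg[v]) for v in vertices)


def probe_distance(g1, g2):
    """Sum of absolute differences between the two graphs' probe answers."""
    h1 = degree_probe_histogram(*g1)
    h2 = degree_probe_histogram(*g2)
    return sum(abs(h1[k] - h2[k]) for k in set(h1) | set(h2))


# Hypothetical ground-truth graph and a recognition result missing one edge.
truth = (["a", "b", "c"], [("a", "b"), ("a", "c")])
recognized = (["a", "b", "c"], [("a", "b")])

print(probe_distance(truth, truth))
print(probe_distance(truth, recognized))
```

Each graph is summarized in a single pass over its edges and vertices, and the comparison touches only the histograms, which is in keeping with the speed the abstract reports. Because no vertex correspondence is computed, two different graphs can produce identical histograms; a measure of this kind can therefore understate the difference between graphs.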

Author information

Correspondence to D. Lopresti.

Additional information

Received: 1 October 2002, Accepted: 15 January 2003, Published online: 6 February 2004

About this article

Cite this article

Lopresti, D., Wilfong, G. A fast technique for comparing graph representations with applications to performance evaluation. IJDAR 6, 219–229 (2003). https://doi.org/10.1007/s10032-003-0106-z

  • DOI: https://doi.org/10.1007/s10032-003-0106-z
