Semantics Driven Table Understanding in Born-Digital Documents

  • Conference paper
Image Processing and Communications Challenges 5

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 233)

Summary

This paper presents a new approach to table understanding suited to born-digital PDF documents. The proposed reverse MVC method advances the state of the art in table understanding by exploiting the fact that born-digital PDF documents lose their logical structure only partially (degradation), whereas scan-based PDF documents lose it irrecoverably (deterioration).

Author information

Correspondence to Jacek Siciarek.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Siciarek, J. (2014). Semantics Driven Table Understanding in Born-Digital Documents. In: Choras, R.S. (eds) Image Processing and Communications Challenges 5. Advances in Intelligent Systems and Computing, vol 233. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-01622-1_18

  • DOI: https://doi.org/10.1007/978-3-319-01622-1_18

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-01621-4

  • Online ISBN: 978-3-319-01622-1

  • eBook Packages: Engineering, Engineering (R0)
