Automatic Paragraph Detection for Accessible PDF Documents

  • Conference paper
Computers Helping People with Special Needs (ICCHP 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9758)

Abstract

This paper describes a new algorithm for the automatic detection and tagging of paragraphs in PDF documents. Paragraph detection is an important feature of the PDF Accessibility Validation Engine (PAVE) [1], an open-source web application for the analysis and semi-automatic correction of accessibility issues in PDF documents. The tool is currently used by a large number of users, and their feedback is collected and evaluated. The evaluation so far has revealed some major usability issues, mainly due to the missing paragraph detection functionality. After an introduction to PDF accessibility, this paper discusses the current usability issues with PAVE and describes the newly proposed algorithm to alleviate them. A first evaluation and conclusion of the results will be provided in the final paper.

References

  1. Darvishy, A., Hutter, H.-P., Mannhart, O.: Web application for analysis, manipulation and generation of accessible PDF documents. In: Stephanidis, C. (ed.) UAHCI 2011. LNCS, vol. 6768, pp. 121–128. Springer, Heidelberg (2011)

  2. Screen readers. http://www.freedomscientific.com/Products/Blindness/JAWS

  3. Darvishy, A., Hutter, H.-P.: Comparison of the effectiveness of different accessibility plugins based on important accessibility criteria. In: Stephanidis, C., Antona, M. (eds.) UAHCI 2013, Part III. LNCS, vol. 8011, pp. 305–310. Springer, Heidelberg (2013)

  4. SS12 EU 2014 Finals and Winners. http://ss12.info/Europe/

  5. Déjean, H., Meunier, J.-L.: A system for converting PDF documents into structured XML format. In: Bunke, H., Spitz, A. (eds.) DAS 2006. LNCS, vol. 3872, pp. 129–140. Springer, Heidelberg (2006)

  6. Meunier, J.-L.: Optimized XY-cut for determining a page reading order. In: Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR 2005), pp. 347–351. IEEE Computer Society, Washington, DC, USA (2005)

  7. Chu, Y., Adachi, J., Takasu, A.: Detection of paragraph boundaries in complex page layouts for electronic documents. In: Information Processing Society of Japan (IPSJ) (2012)

Author information

Corresponding author

Correspondence to Alireza Darvishy.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Darvishy, A., Nevill, M., Hutter, H.-P. (2016). Automatic Paragraph Detection for Accessible PDF Documents. In: Miesenberger, K., Bühler, C., Penaz, P. (eds) Computers Helping People with Special Needs. ICCHP 2016. Lecture Notes in Computer Science, vol 9758. Springer, Cham. https://doi.org/10.1007/978-3-319-41264-1_50

  • DOI: https://doi.org/10.1007/978-3-319-41264-1_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41263-4

  • Online ISBN: 978-3-319-41264-1

  • eBook Packages: Computer Science, Computer Science (R0)
