Document analysis applied to fragments: feature set for the reconstruction of torn documents

Document analysis is typically used to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout and structure of a document. In this paper, document analysis is applied to snippets of torn documents to calculate features that can be used for reconstruction. The main aim is to handle snippets of varying size and content (e.g. handwritten or printed text). Documents may be destroyed deliberately to make the printed content unavailable (e.g. in business crime) or may degenerate over time, as with ancient documents kept under bad storage conditions. Current reconstruction methods for manually torn documents rely on the shape of the fragments or on techniques such as inpainting and texture synthesis. This paper shows the potential of applying document analysis techniques to snippets in order to support a reconstruction algorithm with additional features. These comprise a rotational analysis, a color analysis, a line detection, a paper type analysis (checked, lined, blank) and a classification of the text (printed or handwritten). Preliminary results show that these features can be determined reliably on a real dataset consisting of 690 snippets.
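To make the rotational analysis concrete, the following is a minimal sketch of one common way to estimate the skew of a snippet, a projection-profile search. The abstract does not say which method the paper uses, so the technique, the function name `estimate_skew`, the angle grid and the assumption of an already binarized snippet (foreground pixels non-zero) are all illustrative choices, not the authors' implementation.

```python
import numpy as np

def estimate_skew(binary, angles=np.arange(-45.0, 45.5, 0.5)):
    """Return the rotation in degrees that best aligns text or ruling
    lines with the horizontal axis (hypothetical helper, not from the paper).

    binary: 2-D array, non-zero where the snippet has ink or ruling.
    """
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        raise ValueError("snippet contains no foreground pixels")
    h, w = binary.shape
    # Any rotated coordinate stays within the image diagonal.
    diag = int(np.ceil(np.hypot(h, w)))
    best_angle, best_score = 0.0, -1.0
    for a in angles:
        t = np.deg2rad(a)
        # Vertical coordinate of every foreground pixel after rotating by a.
        proj = xs * np.sin(t) + ys * np.cos(t)
        # Horizontal projection profile with one-pixel bins.
        hist, _ = np.histogram(proj, bins=2 * diag, range=(-diag, diag))
        # Aligned lines give a sharply peaked profile, i.e. high variance.
        score = float(np.var(hist))
        if score > best_score:
            best_angle, best_score = float(a), score
    return best_angle
```

The returned angle is the rotation to apply to the snippet, so lines that descend to the right in image coordinates (row index growing downward) yield a negative value. Exhaustive search over a fixed grid keeps the sketch short; a coarse-to-fine search would be a natural refinement for larger snippets.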





  1. document reconstruction
  2. layout analysis
  3. skew


