Datasets and Annotations for Document Analysis and Recognition

Valveny, Ernest

doi:10.1007/978-0-85729-859-1_32

Abstract

Defining standard frameworks for performance evaluation is key to advancing the state of the art in any area of document analysis, since it permits a fair and objective comparison of different proposed methods under a common scenario. For that reason, a large number of public datasets have emerged in recent years. However, several challenges must be addressed when creating such datasets in order to obtain a sufficiently large collection of representative data that researchers can easily exploit. In this chapter we review the approaches followed by the document analysis community to address some of these challenges, such as the collection of representative data, its annotation with ground-truth information, and its representation in common, accepted formats. We also provide a comprehensive list of existing public datasets for each of the different areas of document analysis.

Author information

Authors and Affiliations

Departament de Ciències de la Computació, Computer Vision Center, Universitat Autònoma de Barcelona, Campus UAB – Edifici O, 08193, Bellaterra, Spain
Ernest Valveny

Corresponding author

Correspondence to Ernest Valveny.

Editor information

Editors and Affiliations

University of Maryland, College Park, MD, USA
David Doermann
Université de Lorraine, Nancy, France
Karl Tombre

Cite this entry

Valveny, E. (2014). Datasets and Annotations for Document Analysis and Recognition. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_32

DOI: https://doi.org/10.1007/978-0-85729-859-1_32
Published: 24 July 2019
Publisher Name: Springer, London
Print ISBN: 978-0-85729-858-4
Online ISBN: 978-0-85729-859-1
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
