Recognizing text in raster maps

Chiang, Yao-Yi; Knoblock, Craig A.

doi:10.1007/s10707-014-0203-9

Published: 21 February 2014

Volume 19, pages 1–27 (2015)

GeoInformatica

Yao-Yi Chiang¹ &
Craig A. Knoblock²

Abstract

Text labels in maps provide valuable geographic information by associating place names with locations. This information is especially important in historical maps, which are very often the only source of past information about the earth. Recognizing the text labels is challenging because heterogeneous raster maps have varying image quality and complex map contents. In addition, the labels within a map do not follow a fixed orientation and can have various font types and sizes. Previous approaches typically handle a specific type of map or require intensive manual work. This paper presents a general approach that requires a small amount of user effort to semi-automatically recognize text labels in heterogeneous raster maps. Our approach exploits a few examples of text areas to extract text pixels and employs cartographic labeling principles to locate individual text labels. Each text label is then rotated automatically to horizontal and processed by conventional OCR software for character recognition. We compared our approach to a state-of-the-art commercial OCR product using 15 raster maps from 10 sources. Our evaluation shows that our approach enabled the commercial OCR product to handle raster maps, and that together they produced significantly higher text recognition accuracy than the commercial OCR product alone.
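
The step of rotating each text label to horizontal can be illustrated with a small sketch. The abstract does not say how the orientation of a label is detected, so the code below swaps in a simple assumption: the label's foreground pixels are treated as a point set, and the orientation is taken to be the principal axis of that set. The function names estimate_orientation and rotate_to_horizontal are hypothetical and are not taken from the paper.

```python
import math


def estimate_orientation(pixels):
    """Angle (radians) of the principal axis of a set of (x, y) pixel coordinates.

    Assumption for illustration only: the text baseline is taken to follow the
    direction of greatest spread of the label's foreground pixels.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)
    syy = sum((y - mean_y) ** 2 for _, y in pixels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    # Direction of the dominant eigenvector of the 2x2 scatter matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def rotate_to_horizontal(pixels):
    """Rotate the pixel set about its centroid so the principal axis is horizontal.

    Returns the rotated coordinates and the estimated angle in radians.
    """
    theta = estimate_orientation(pixels)
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    rotated = [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in pixels
    ]
    return rotated, theta


if __name__ == "__main__":
    # Synthetic "label": 20 points along a line tilted by 30 degrees.
    tilt = math.radians(30)
    label = [(t * math.cos(tilt), t * math.sin(tilt)) for t in range(20)]
    flat, angle = rotate_to_horizontal(label)
    print(round(math.degrees(angle), 3))
```

A principal-axis estimate only recovers the orientation up to 180 degrees, so it cannot tell upright text from upside-down text. A practical pipeline would need an additional check, for example running OCR on both candidates and keeping the more plausible result; that check is likewise an assumption here and not a description of the authors' method.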

Notes

The information for obtaining the test maps can be found on: http://www.isi.edu/integration/data/maps/prj_map_extract_data.html

Acknowledgment

This research is based upon work supported in part by the University of Southern California under the Viterbi School of Engineering Doctoral Fellowship.

Author information

Authors and Affiliations

Spatial Sciences Institute, University of Southern California, 3616 Trousdale Parkway, AHF B55, Los Angeles, CA, 90089, USA
Yao-Yi Chiang
Department of Computer Science, Information Sciences Institute, and Spatial Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA
Craig A. Knoblock

Corresponding author

Correspondence to Yao-Yi Chiang.

About this article

Cite this article

Chiang, Y.-Y., Knoblock, C.A. Recognizing text in raster maps. GeoInformatica 19, 1–27 (2015). https://doi.org/10.1007/s10707-014-0203-9

Received: 15 December 2011
Revised: 18 December 2013
Accepted: 08 January 2014
Published: 21 February 2014
Issue Date: January 2015
DOI: https://doi.org/10.1007/s10707-014-0203-9
