Abstract:
Scene text detection is challenging because the input may vary in orientation, size, font style, lighting conditions, perspective distortion and language. This paper addresses the problem by designing a Rotational Region CNN (R2CNN). R2CNN includes a Text Region Proposal Network (Text-RPN) to estimate approximate text regions and a multi-task refinement network to obtain the precise inclined box. Our work has the following features. First, we use a novel multi-task regression method to support arbitrarily-oriented scene text detection. Second, we introduce multiple ROIPoolings to address the scene text detection problem for the first time. Third, we use an inclined Non-Maximum Suppression (NMS) to post-process the detection candidates. Experiments show that our method outperforms the state of the art on standard benchmarks: ICDAR 2013, ICDAR 2015, COCO-Text and MSRA-TD500.
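The abstract names an inclined NMS step but does not spell it out. The following is a minimal sketch of one way such a step could work, not the authors' implementation. It assumes each candidate is a rotated rectangle given as (cx, cy, w, h, angle_deg) and suppresses greedily by polygon IoU. The box parameterization, the function names and the 0.5 threshold are all illustrative assumptions.

```python
import math

def rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated rectangle in counter-clockwise order (assumed box format)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(pts):
    """Polygon area via the shoelace formula."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping of a convex polygon by a convex CCW polygon."""
    output = subject
    for i in range(len(clip)):
        if not output:
            break
        ax, ay = clip[i]
        bx, by = clip[(i + 1) % len(clip)]

        def side(p):
            # > 0 when p lies to the left of edge a->b, i.e. inside a CCW polygon
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # where edge p->q crosses the clip line
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def inclined_iou(r1, r2):
    """IoU of two rotated rectangles computed from their polygon intersection."""
    p1, p2 = rect_corners(*r1), rect_corners(*r2)
    inter_poly = clip_convex(p1, p2)
    inter = polygon_area(inter_poly) if len(inter_poly) >= 3 else 0.0
    union = polygon_area(p1) + polygon_area(p2) - inter
    return inter / union if union > 0 else 0.0

def inclined_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring boxes, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(inclined_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical candidates: two near-duplicates, one perpendicular box, one far away.
boxes = [(0, 0, 4, 2, 0), (0.2, 0, 4, 2, 0), (0, 0, 4, 2, 90), (10, 10, 4, 2, 30)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = inclined_nms(boxes, scores, iou_threshold=0.5)
```

In this toy input the near-duplicate of the top-scoring box is suppressed, while the perpendicular box at the same center survives because its rotated IoU with the first box is only about one third. An axis-aligned NMS on the enclosing boxes would judge that overlap differently, which is the motivation for measuring overlap on the inclined boxes themselves.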
Date of Conference: 20-24 August 2018
Date Added to IEEE Xplore: 29 November 2018
ISBN Information:
Print on Demand (PoD) ISSN: 1051-4651