Character-Level Street View Text Spotting Based on Deep Multisegmentation Network for Smarter Autonomous Driving


Impact Statement:
Street view text spotting techniques will be increasingly important in intelligent transportation systems (ITS) because they help a vehicle understand its surrounding scene during driving. However, the performance of existing scene text spotting algorithms still leaves considerable room for improvement. To advance this research, we propose MSTD, a neural network structure for character-level scene text localization and recognition. With a recognition performance increase of more than 25% on Chinese scene text datasets, the MSTD provides a better solution for street view scene text spotting. The proposed technique can be used by the digital mapping industry to automatically collect points of interest along streets; it can also be used in ITS to understand surrounding scenes during driving.

Abstract:

Urban scenes are full of street entities with signboards. In autonomous driving, street view text spotting will therefore play a significant role in precisely understanding the surrounding scene, because the text contained in an image usually provides important clues for accurate image understanding, whereas scene images are often ambiguous to existing computer vision algorithms without the text. In this work, we propose a Multi-Segmentation network for character-level scene Text Detection (MSTD). The MSTD introduces a densely connected atrous spatial pyramid pooling module to enlarge the receptive field of the feature extraction layer, so as to localize long as well as large text instances. Moreover, it uses a double segmentation subnetwork in which two independent but inherently complementary losses co-optimize the network and increase the reliability of the confidence scores when predicting text/nontext areas. With the characte...
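To illustrate why dense connections between atrous (dilated) convolutions enlarge the receptive field, the following minimal sketch compares parallel branches with a densely connected chain. It is not the MSTD implementation: the 3x3 kernel size, the dilation rates, and the helper name `stacked_receptive_field` are assumptions made for illustration and are not stated in the abstract.

```python
from typing import Sequence


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field (in pixels, along one axis) of stride-1 atrous
    convolutions applied one after another."""
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 input pixels,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical dilation rates, chosen for illustration only.
rates = (3, 6, 12, 18, 24)

# Parallel branches: the widest view is that of the largest single branch.
parallel_rf = max(stacked_receptive_field(3, [d]) for d in rates)

# Dense connections: each layer also consumes the outputs of earlier layers,
# so the longest path chains all of the dilations together.
dense_rf = stacked_receptive_field(3, rates)

print(parallel_rf, dense_rf)
```

Under these assumed rates the parallel arrangement covers 49 pixels along one axis, while the densely connected chain covers 127, which is the kind of gain that helps with long text lines and large characters.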
Published in: IEEE Transactions on Artificial Intelligence (Volume: 3, Issue: 2, April 2022)
Page(s): 297 - 308
Date of Publication: 28 September 2021
Electronic ISSN: 2691-4581
