Article

Incremental detection of text on road signs from video with application to a driving assistant system

Authors:

Jie Yang

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia

Pages 852 - 859

https://doi.org/10.1145/1027527.1027724

Published: 10 October 2004

Abstract

This paper proposes a fast and robust framework for incrementally detecting text on road signs from natural scene video. The new framework makes two main contributions. First, it applies a divide-and-conquer strategy to decompose the original task into two sub-tasks: localization of road signs and detection of text. The algorithms for the two sub-tasks are integrated into a unified framework through a real-time tracking algorithm. Second, the framework provides a novel approach to text detection from video by integrating 2D features in each video frame (e.g., color, edges, texture) with 3D information available in a video sequence (e.g., object structure). The feasibility of the proposed framework has been evaluated on video sequences captured from a moving vehicle. The new framework can be applied to a driving assistant system and to other text-detection tasks in video.
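
The control flow described in the abstract (localize a sign, track it across frames, and detect text only inside the tracked region) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SignTrack`, `process_video`, `localize_sign`, `track_sign`, and `detect_text` are hypothetical, the three callables are placeholders for whatever algorithms are actually used, and the use of 3D structure information is omitted because the abstract gives no detail on it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), a hypothetical convention


@dataclass
class SignTrack:
    """State kept for one tracked road sign (hypothetical structure)."""
    box: Box                                             # sign location in the current frame
    text_boxes: List[Box] = field(default_factory=list)  # sign-relative text regions found so far


def process_video(
    frames: Iterable[Any],
    localize_sign: Callable[[Any], Optional[Box]],
    track_sign: Callable[[Any, Box], Optional[Box]],
    detect_text: Callable[[Any, Box], List[Box]],
) -> List[Tuple[Optional[Box], List[Box]]]:
    """Run the two sub-tasks frame by frame, linked by tracking.

    All three callables are assumptions supplied by the caller; they stand
    in for the sign localizer, the tracker, and the text detector.
    """
    track: Optional[SignTrack] = None
    results: List[Tuple[Optional[Box], List[Box]]] = []
    for frame in frames:
        if track is None:
            # Sub-task 1: look for a road sign in the whole frame.
            box = localize_sign(frame)
            if box is not None:
                track = SignTrack(box)
        else:
            # Tracking carries the sign region from the previous frame.
            box = track_sign(frame, track.box)
            if box is None:
                track = None  # sign lost; fall back to localization next frame
            else:
                track.box = box
        if track is None:
            results.append((None, []))
            continue
        # Sub-task 2: detect text only inside the sign region and
        # accumulate new detections incrementally across frames.
        for text_box in detect_text(frame, track.box):
            if text_box not in track.text_boxes:
                track.text_boxes.append(text_box)
        results.append((track.box, list(track.text_boxes)))
    return results
```

Restricting text detection to the tracked sign region is what makes the decomposition pay off in this sketch: the detector never scans the full frame, and detections accumulate as the sign persists across frames.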

Index Terms

Incremental detection of text on road signs from video with application to a driving assistant system
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Computer graphics
    1. Animation
      1. Motion capture
      2. Motion processing

Information

Published In

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia

October 2004

1028 pages

ISBN:1581138938

DOI:10.1145/1027527

General Chairs: Henning Schulzrinne (Columbia University), Nevenka Dimitrova (Philips Research)

Program Chairs: Angela Sasse (UCL), Sue Moon (KAIST), Rainer Lienhart (U Augsburg)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

MM04: 2004 12th Annual ACM International Conference on Multimedia

October 10 - 16, 2004

New York, NY, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 22
Total Downloads: 721
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 28 Feb 2025

Cited By

Li M, Fu B, Chen H, He J, Qiao Y (2023). Dual Relation Network for Scene Text Recognition. IEEE Transactions on Multimedia, 25, 4094-4107. Online publication date: 2023
https://doi.org/10.1109/TMM.2022.3171108
Sanyal B, Biswal P, Mohapatra R, Dash R, Agarwalla A (2022). Automated TSR Using DNN Approach for Intelligent Vehicles. The New Advanced Society, 67-90. Online publication date: 18-Mar-2022
https://doi.org/10.1002/9781119884392.ch4
Cai Y, Wang W (2020). Robustly detect different types of text in videos. Neural Computing and Applications. Online publication date: 27-Jan-2020
https://doi.org/10.1007/s00521-020-04729-6
Wang X, Xia Z, Peng J, Feng X (2018). Multiorientation scene text detection via coarse-to-fine supervision-based convolutional networks. Journal of Electronic Imaging, 27:03, (1). Online publication date: 7-Jun-2018
https://doi.org/10.1117/1.JEI.27.3.033032
Yang X, Yin F, Liu C (2018). Online Video Text Detection with Markov Decision Process. 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), 103-108. Online publication date: Apr-2018
https://doi.org/10.1109/DAS.2018.20
杨斐 (2017). Real-Time Traffic Sign Detection Based on Multi-Frame Video Images. Computer Science and Application, 07:05, 463-472. Online publication date: 2017
https://doi.org/10.12677/CSA.2017.75057
Borra S, Dey N, Ashour A, Karaa W, Dey N (2017). Video Text Extraction and Mining. Mining Multimedia Documents, 173-191. Online publication date: 2-May-2017
https://doi.org/10.1201/9781315399744-14
Yang C, Yin X, Pei W, Tian S, Zuo Z, Zhu C, Yan J (2017). Tracking Based Multi-Orientation Scene Text Detection. IEEE Transactions on Image Processing, 26:7, 3235-3248. Online publication date: 1-Jul-2017
https://dl.acm.org/doi/10.1109/TIP.2017.2695104
Yang X, He W, Yin F, Liu C (2017). A Unified Video Text Detection Method with Network Flow. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 331-336. Online publication date: Nov-2017
https://doi.org/10.1109/ICDAR.2017.62
Tian S, Pei W, Zuo Z, Yin X (2016). Scene text detection in video by learning locally and globally. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2647-2652. Online publication date: 9-Jul-2016
https://dl.acm.org/doi/10.5555/3060832.3060991
