Extraction of special effects caption text events from digital video

Crandall, David; Antani, Sameer; Kasturi, Rangachar

doi:10.1007/s10032-002-0091-7

Original Research Paper
Published: April 2003

Volume 5, pages 138–157 (2003)

International Journal on Document Analysis and Recognition

Abstract. The popularity of digital video is increasing rapidly. To help users navigate libraries of video, algorithms that automatically index video based on content are needed. One approach is to extract text appearing in video, which often reflects a scene's semantic content. This is a difficult problem due to the unconstrained nature of general-purpose video. Text can have arbitrary color, size, and orientation. Backgrounds may be complex and changing. Most work so far has made restrictive assumptions about the nature of text occurring in video. Such work is therefore not directly applicable to unconstrained, general-purpose video. In addition, most work so far has focused only on detecting the spatial extent of text in individual video frames. However, text occurring in video usually persists for several seconds. This constitutes a text event that should be entered only once in the video index. Therefore it is also necessary to determine the temporal extent of text events. This is a non-trivial problem because text may move, rotate, grow, shrink, or otherwise change over time. Such text effects are common in television programs and commercials but so far have received little attention in the literature. This paper discusses detecting, binarizing, and tracking caption text in general-purpose MPEG-1 video. Solutions are proposed for each of these problems and compared with existing work found in the literature.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Pennsylvania State University, 202 Pond Laboratory, University Park, PA 16802, USA
David Crandall, Sameer Antani & Rangachar Kasturi

Additional information

Received: January 29, 2002 / Accepted: September 13, 2002

D. Crandall is now with Eastman Kodak Company, 1700 Dewey Avenue, Rochester, NY 14650-1816, USA; e-mail: david.crandall@kodak.com

S. Antani is now with the National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA; e-mail: antani@nlm.nih.gov

Correspondence to: David Crandall

About this article

Cite this article

Crandall, D., Antani, S. & Kasturi, R. Extraction of special effects caption text events from digital video. IJDAR 5, 138–157 (2003). https://doi.org/10.1007/s10032-002-0091-7

Issue Date: April 2003
DOI: https://doi.org/10.1007/s10032-002-0091-7

Keywords: Text detection – Text tracking – Text binarization – Video indexing
