
Caption Detection, Localization and Type Recognition in Arabic News Video

Authors:

Ibrahim A. Zedan,

Khaled M. Elsayed,

Eid Emary

INFOS '16: Proceedings of the 10th International Conference on Informatics and Systems

Pages 114 - 120

https://doi.org/10.1145/2908446.2908472

Published: 09 May 2016


Abstract

In this paper, we propose a method to detect and localize all caption types in Arabic news videos. The method handles static, horizontally scrolling, and vertically scrolling captions. It copes with different patterns of caption appearance and disappearance and with news videos that contain multiple captions. The method is based on edge features and multi-frame integration: a Canny edge map is computed for each frame, horizontal line detection is applied, and frames are grouped into clusters. Finally, the caption type in each cluster is recognized by observing the normalized inter-frame edge map difference. Experimental results show that the proposed method is effective at detecting and locating all caption types, recognizing the caption type, and identifying the appearance/disappearance intervals of captions. The experiments were conducted on real news videos recorded from different TV channels.
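The idea of recognizing a caption type from the normalized inter-frame edge map difference can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the difference is normalized or how the decision is made. The sketch assumes the difference is the fraction of pixels whose edge label changes between two frames, that a static caption yields a difference near zero, and that a scrolling caption is explained best by a small horizontal or vertical translation of the previous edge map. The names `normalized_edge_difference`, `shift`, and `guess_caption_type`, and the parameters `static_threshold` and `max_shift`, are hypothetical. Edge maps are plain binary lists here; in practice they would come from a Canny edge detector applied to the caption region.

```python
from typing import List

EdgeMap = List[List[int]]  # binary edge map: 1 = edge pixel, 0 = background


def normalized_edge_difference(a: EdgeMap, b: EdgeMap) -> float:
    """Fraction of pixels whose edge label differs between two maps.

    Assumption: normalization by the total pixel count; the abstract does
    not specify the normalization the authors use.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("edge maps must have the same shape")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def shift(edge_map: EdgeMap, dx: int, dy: int) -> EdgeMap:
    """Translate an edge map by (dx, dy) pixels, filling exposed pixels with 0."""
    height = len(edge_map)
    width = len(edge_map[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = edge_map[y][x]
    return out


def guess_caption_type(prev: EdgeMap, curr: EdgeMap,
                       static_threshold: float = 0.05,
                       max_shift: int = 3) -> str:
    """Hypothetical decision rule; thresholds are illustrative, not from the paper."""
    # A caption that does not move leaves the edge map almost unchanged.
    if normalized_edge_difference(prev, curr) <= static_threshold:
        return "static"
    # Otherwise, test which direction of translation explains the change best.
    offsets = [d for d in range(-max_shift, max_shift + 1) if d != 0]
    best_h = min(normalized_edge_difference(shift(prev, dx, 0), curr) for dx in offsets)
    best_v = min(normalized_edge_difference(shift(prev, 0, dy), curr) for dy in offsets)
    return "horizontal scrolling" if best_h <= best_v else "vertical scrolling"


if __name__ == "__main__":
    region = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
    ]
    print(guess_caption_type(region, region))
    print(guess_caption_type(region, shift(region, 2, 0)))
    print(guess_caption_type(region, shift(region, 0, 1)))
```

A rule like this operates on one pair of frames at a time; averaging the difference over all consecutive frame pairs in a cluster would be a natural way to make it less sensitive to noise, though whether the paper does so is not stated in the abstract.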




Published In

INFOS '16: Proceedings of the 10th International Conference on Informatics and Systems

May 2016

347 pages

ISBN:9781450340625

DOI:10.1145/2908446

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

INFOS '16

INFOS '16: The 10th International Conference on Informatics and Systems

May 9 - 11, 2016

Giza, Egypt


Article Metrics

Total Citations: 7
Total Downloads: 56
Downloads (last 12 months): 1
Downloads (last 6 weeks): 0

Reflects downloads up to 20 Jan 2025


Cited By

Haraguchi D, Sakaguchi S, Kato J, Goto M, Uchida S (2022). Fonts That Fit the Music: A Multimodal Design Trend Analysis of Lyric Videos. IEEE Access 10, 65414-65425. Online publication date: 2022.
https://doi.org/10.1109/ACCESS.2022.3184028

Souza M, Maia H, Santos A, Bernardes Vieira M, Pedrini H (2022). Multi-Script Video Caption Localization Based on Visual Rhythms. Applied Artificial Intelligence 36:1. Online publication date: 4-Feb-2022.
https://doi.org/10.1080/08839514.2022.2032926

Amin A, Hassan S, Huenerfauth M (2021). Effect of Occlusion on Deaf and Hard of Hearing Users’ Perception of Captioned Video Quality. Universal Access in Human-Computer Interaction. Access to Media, Learning and Assistive Environments, 202-220. Online publication date: 3-Jul-2021.
https://doi.org/10.1007/978-3-030-78095-1_16

Sakaguchi S, Kato J, Goto M, Uchida S (2020). Lyric Video Analysis Using Text Detection and Tracking. Document Analysis Systems, 426-440. Online publication date: 14-Aug-2020.
https://doi.org/10.1007/978-3-030-57058-3_30

Zedan I, Elsayed K, Emary E (2017). News Videos Segmentation Using Dominant Colors Representation. Advances in Soft Computing and Machine Learning in Image Processing, 89-109. Online publication date: 15-Oct-2017.
https://doi.org/10.1007/978-3-319-63754-9_5

Zedan I, Elsayed K, Emary E (2016). An Innovative Method for Key Frames Extraction in News Videos. Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016, 383-394. Online publication date: 18-Oct-2016.
https://doi.org/10.1007/978-3-319-48308-5_37

Zedan I, Elsayed K, Emary E (2016). Abrupt Cut Detection in News Videos Using Dominant Colors Representation. Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016, 320-331. Online publication date: 18-Oct-2016.
https://doi.org/10.1007/978-3-319-48308-5_31
