DOI: 10.1145/3177148.3180098

Urdu Caption Text Detection using Textural Features

Published: 27 March 2018


The amount of multimedia data has increased manifold in recent years, which calls for the development of efficient retrieval techniques. Among the various aspects of content-based retrieval, textual content appearing in videos and images serves as a powerful semantic index. Developing such a retrieval system requires detecting text regions, recognizing the detected text, and generating indices on keywords. The present study focuses on the first of these: detection of textual content in video frames. More specifically, we target Urdu caption text appearing on news and entertainment channels. A series of image analysis operations is first carried out to identify candidate text blocks in the image. Features extracted from text and non-text regions using Gabor filters and the Curvelet transform are then fed to two classifiers, namely an artificial neural network and a support vector machine. Evaluations on a database of 1000 video frames yielded promising precision and recall.
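As a rough illustration of the Gabor-filter part of the feature extraction described above, the following sketch computes simple texture statistics for an image patch. It is not the authors' implementation: the function names, the kernel size, wavelength, sigma, number of orientations, and the choice of mean absolute response and standard deviation as features are all assumptions made for the example. The Curvelet features and the classifiers are omitted.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a Gabor kernel as a size x size list of lists (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr)
                                / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_valid(image, kernel):
    """Correlate image with kernel at every position where the kernel fits."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    responses = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    acc += image[i + u][j + v] * kernel[u][v]
            responses.append(acc)
    return responses

def gabor_features(image, orientations=4, wavelength=4.0, sigma=2.0, size=7):
    """Return [mean |response|, std of response] for each orientation."""
    feats = []
    for n in range(orientations):
        theta = n * math.pi / orientations
        resp = filter_valid(image, gabor_kernel(size, wavelength, theta, sigma))
        mean_abs = sum(abs(r) for r in resp) / len(resp)
        mean = sum(resp) / len(resp)
        var = sum((r - mean) ** 2 for r in resp) / len(resp)
        feats.extend([mean_abs, math.sqrt(var)])
    return feats

# Synthetic 16x16 patches (hypothetical stand-ins for candidate blocks):
# vertical stripes mimic stroke-like texture, the flat patch mimics background.
stripes = [[255 if (x // 2) % 2 == 0 else 0 for x in range(16)]
           for _ in range(16)]
flat = [[128] * 16 for _ in range(16)]

stripe_feats = gabor_features(stripes)
flat_feats = gabor_features(flat)
```

In a full pipeline, vectors like `stripe_feats` would be computed for each candidate block and passed, together with labels, to a classifier. The striped patch produces a much larger response variation at the matching orientation than the flat patch, which is the kind of contrast a text/non-text classifier can exploit.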








Published In

MedPRAI '18: Proceedings of the 2nd Mediterranean Conference on Pattern Recognition and Artificial Intelligence
March 2018
135 pages


  • IAPR: International Association for Pattern Recognition


Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Artificial Urdu Text
  2. Curvelet Transform
  3. Gabor Filters
  4. Text Detection
  5. Textural Features


  • Research-article
  • Research
  • Refereed limited




Article Metrics

  • Downloads (last 12 months): 27
  • Downloads (last 6 weeks): 3

Reflects downloads up to 12 Feb 2025.


Cited By

  • (2022) Text Recognition Using Image Processing Technology for Visiting Card. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pp. 488-492. DOI: 10.32628/CSEIT228652. Online publication date: 5-Dec-2022
  • (2021) Automated text detection from big data scene videos in higher education: a practical approach for MOOCs case study. Journal of Computing in Higher Education. DOI: 10.1007/s12528-021-09294-y. Online publication date: 3-Sep-2021
  • (2021) A system for detection of moving caption text in videos: a news use case. Multimedia Tools and Applications. DOI: 10.1007/s11042-021-10856-6. Online publication date: 22-Apr-2021
  • (2021) Urdu signboard detection and recognition using deep learning. Multimedia Tools and Applications 81:9, pp. 11965-11987. DOI: 10.1007/s11042-020-10175-2. Online publication date: 6-Jan-2021
  • (2020) Detection and recognition of cursive text from video frames. EURASIP Journal on Image and Video Processing 2020:1. DOI: 10.1186/s13640-020-00523-5. Online publication date: 28-Aug-2020
  • (2020) Urdu-Text Detection and Recognition in Natural Scene Images Using Deep Learning. IEEE Access 8, pp. 96787-96803. DOI: 10.1109/ACCESS.2020.2994214. Online publication date: 2020
  • (2020) Recognition of Cursive Caption Text Using Deep Learning - A Comparative Study on Recognition Units. Pattern Recognition and Artificial Intelligence, pp. 156-167. DOI: 10.1007/978-3-030-59830-3_14. Online publication date: 9-Oct-2020
  • (2019) Impact of Pre-Processing on Recognition of Cursive Video Text. Pattern Recognition and Image Analysis, pp. 565-576. DOI: 10.1007/978-3-030-31332-6_49. Online publication date: 22-Sep-2019
