Texture feature-based text region segmentation in social multimedia data

Multimedia Tools and Applications

Abstract

This paper proposes a method for segmenting text areas in images obtained from social multimedia networks, using the texture features of the input images together with an artificial neural network. The proposed method consists of four main steps: candidate text area extraction, text area localization, text and background separation, and candidate text area verification. In the extraction step, candidate blocks that may contain text are segmented from the input image on the basis of their texture features. In the localization step, only the strings are extracted from the candidate text blocks. In the separation step, the text is separated from the background within the localized text blocks. In the verification step, an artificial neural network checks whether each extracted text block contains actual text, and non-text blocks are excluded. In experiments on various types of news and non-news images, the proposed method extracted text regions more accurately than existing methods.
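The four steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the texture features, the thresholding rule, or the network, so everything here is an assumption made for illustration. The texture feature is taken to be the mean absolute horizontal gradient per block, the text/background split uses Otsu's threshold with bright text assumed, and the neural-network verifier is replaced by a simple foreground-ratio rule. All function names, block sizes, and cutoffs are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]          # grayscale rows, values 0..255
Box = Tuple[int, int, int, int]  # top, left, bottom, right (exclusive)


def block_energy(img: Image, r0: int, c0: int, size: int) -> float:
    """Mean absolute horizontal gradient in one block (assumed texture feature)."""
    total, count = 0, 0
    for r in range(r0, min(r0 + size, len(img))):
        row = img[r]
        for c in range(c0, min(c0 + size, len(row)) - 1):
            total += abs(row[c + 1] - row[c])
            count += 1
    return total / count if count else 0.0


def candidate_blocks(img: Image, size: int = 4, min_energy: float = 40.0):
    """Step 1: keep blocks whose texture energy reaches a hypothetical cutoff."""
    return [(r, c)
            for r in range(0, len(img), size)
            for c in range(0, len(img[0]), size)
            if block_energy(img, r, c, size) >= min_energy]


def localize(blocks, size: int, height: int, width: int) -> Box:
    """Step 2: bounding box around all candidate blocks, clamped to the image."""
    rows = [r for r, _ in blocks]
    cols = [c for _, c in blocks]
    return (min(rows), min(cols),
            min(max(rows) + size, height), min(max(cols) + size, width))


def otsu_threshold(pixels: List[int]) -> int:
    """Threshold t maximizing between-class variance (classes: <= t and > t)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def separate(img: Image, box: Box, threshold: int) -> List[List[bool]]:
    """Step 3: True marks text pixels, assuming bright text on a dark background."""
    top, left, bottom, right = box
    return [[p > threshold for p in row[left:right]] for row in img[top:bottom]]


def verify(mask: List[List[bool]], low: float = 0.05, high: float = 0.95) -> bool:
    """Step 4 stand-in: a foreground-ratio rule, NOT the paper's neural network."""
    flat = [v for row in mask for v in row]
    return low <= sum(flat) / len(flat) <= high


def segment_text(img: Image, size: int = 4) -> Optional[Tuple[Box, List[List[bool]]]]:
    """Run the four steps; return (box, text mask) or None if no text is accepted."""
    blocks = candidate_blocks(img, size)
    if not blocks:
        return None
    box = localize(blocks, size, len(img), len(img[0]))
    top, left, bottom, right = box
    pixels = [p for row in img[top:bottom] for p in row[left:right]]
    mask = separate(img, box, otsu_threshold(pixels))
    return (box, mask) if verify(mask) else None


def demo_image() -> Image:
    """8x16 synthetic image: flat background, striped patch at rows 0-3, cols 8-15."""
    return [[230 if r < 4 and c >= 8 and c % 2 == 0 else 20 for c in range(16)]
            for r in range(8)]
```

Calling `segment_text(demo_image())` returns the bounding box of the striped patch together with a boolean text mask, while a uniformly flat image yields `None`. In a fuller version, the `verify` rule would be replaced by a trained classifier operating on features of each localized block.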

Acknowledgments

This work was supported by the ICT R&D program of MSIP/IITP [2014 (R0112-14-1014), The Development of Open Platform for Service of Convergence Contents].

Author information

Corresponding author

Correspondence to Gye-Young Kim.

About this article

Cite this article

Kim, SH., An, KJ., Jang, SW. et al. Texture feature-based text region segmentation in social multimedia data. Multimed Tools Appl 75, 12815–12829 (2016). https://doi.org/10.1007/s11042-015-3237-6

  • DOI: https://doi.org/10.1007/s11042-015-3237-6
