DOI: 10.1145/3318299.3318327
research-article

MHDT: A Deep-Learning-Based Text Detection Algorithm for Unstructured Data in Banking

Published: 22 February 2019

ABSTRACT

Text detection in natural scene images is in high demand for processing unstructured data in banking. In this paper, we propose a new deep learning algorithm called MSER, Hu-moment and Deep learning for Text detection (MHDT), based on Maximally Stable Extremal Regions (MSER) and Hu-moment features. First, we extract MSERs as candidate characters. Second, a character classifier using Hu-moment features filters the candidates to reduce the number of inputs to clustering. After single-linkage clustering, a text classifier trained with a Deep Belief Network removes non-text regions. The proposed algorithm is evaluated on the ICDAR database, and the experimental results show that it achieves high precision and recall.
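Two of the steps named in the abstract, Hu-moment feature extraction and single-linkage clustering, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hu_moments` and `single_linkage`, the use of only the first two Hu invariants, the centroid-distance criterion, and the `max_dist` threshold are all assumptions made for illustration. MSER extraction and the Deep Belief Network classifier are omitted.

```python
from math import hypot


def hu_moments(pixels):
    """Return the first two Hu invariants of a binary region.

    `pixels` is a list of (x, y) coordinates belonging to the region
    (for example, one candidate character). Hypothetical helper.
    """
    n = len(pixels)  # zeroth moment m00 of a binary region
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n

    def eta(p, q):
        # Normalized central moment: mu_pq / m00^(1 + (p + q) / 2)
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / n ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    h1 = e20 + e02
    h2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    return h1, h2


def single_linkage(points, max_dist):
    """Group point indices by single linkage with a distance cutoff.

    Two points end up in the same cluster if they are connected by a
    chain of points whose consecutive distances are all <= max_dist.
    `max_dist` is an assumed threshold, not a value from the paper.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            if hypot(x1 - x2, y1 - y2) <= max_dist:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # A 6x2 horizontal bar and the same bar rotated by 90 degrees
    bar = [(x, y) for x in range(6) for y in range(2)]
    rotated = [(-y, x) for x, y in bar]
    print(hu_moments(bar))
    print(hu_moments(rotated))

    # Centroids of four candidate characters; the last one is far away
    centroids = [(0, 0), (10, 0), (20, 0), (100, 0)]
    print(single_linkage(centroids, max_dist=12))  # [[0, 1, 2], [3]]
```

The rotated bar yields the same invariants as the original up to floating-point error, which is the property that makes Hu moments usable as shape features for characters in varying orientations. In the clustering demo, candidates 0 and 2 are 20 units apart yet share a cluster because candidate 1 links them, which is the chaining behaviour characteristic of single linkage.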

Published in

ICMLC '19: Proceedings of the 2019 11th International Conference on Machine Learning and Computing
February 2019
563 pages
ISBN: 9781450366007
DOI: 10.1145/3318299

Copyright © 2019 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
