Abstract
In this paper, the discriminative training criterion of maximum-minimum similarity (MMS) is used to improve the performance of text extraction based on Gaussian mixture modeling of neighbor characters. MMS training optimizes a recognizer by maximizing the similarities between observations and models of the same class while minimizing the similarities between observations and models of different classes. Based on this idea, we define a corresponding objective function for text extraction. Minimizing this objective function with the gradient descent method yields the optimum parameters of our text extraction method. Compared with maximum likelihood estimation (MLE) of the parameters, MMS training greatly improves the overall performance of text extraction: the precision rate decreases slightly, from 94.59% to 93.56%, while the recall rate increases substantially, from 80.39% to 98.55%.
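To make the precision/recall trade-off concrete, the short sketch below combines the reported rates into the standard F1 score (the harmonic mean of precision and recall). The F1 values are derived here from the figures quoted in the abstract and are not reported in the abstract itself; the function name `f1_score` and the dictionary layout are illustrative choices.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as quoted in the abstract.
reported = {
    "MLE": (94.59, 80.39),
    "MMS": (93.56, 98.55),
}

for method, (precision, recall) in reported.items():
    # F1 is a derived summary, not a figure taken from the paper.
    print(f"{method}: F1 = {f1_score(precision, recall):.2f}%")
```

Under this summary measure, the roughly one-point loss in precision is outweighed by the gain of about eighteen points in recall: F1 rises from about 86.91% to about 95.99%.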
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, H., Liu, X., Jia, Y. (2006). Maximum-Minimum Similarity Training for Text Extraction. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893295_31
DOI: https://doi.org/10.1007/11893295_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46484-6
Online ISBN: 978-3-540-46485-3
eBook Packages: Computer Science, Computer Science (R0)