Abstract
The original Bag-of-Visual-Words (BoVW) approach discards the spatial relations among visual words. In this paper, an LDA-based topic model is adopted to capture the semantic relations among the visual words of each word image. Because an LDA-based topic model usually hurts retrieval performance when used on its own, it is linearly combined with a visual language model for each word image. The basic query likelihood model is then used to perform retrieval. Experimental results on our dataset show that the proposed LDA-based representation achieves keyword spotting on a collection of historical Mongolian documents efficiently and accurately, and that it significantly outperforms the original BoVW approach.
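The scoring idea summarized above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the visual language model is simplified to an additively smoothed unigram model, the interpolation weight `lam` and all toy numbers are made up, and the function names are hypothetical. The LDA term follows the usual form P_lda(w|D) = Σ_z P(w|z) P(z|D).

```python
import math

def lda_word_prob(word, doc_topics, topic_words):
    """P_lda(w|D) = sum over topics z of P(w|z) * P(z|D)."""
    return sum(p_z * topic_words[z][word] for z, p_z in enumerate(doc_topics))

def lm_word_prob(word, doc_counts, vocab_size, eps=1.0):
    """Unigram estimate with additive smoothing (the smoothing choice is an assumption)."""
    total = sum(doc_counts.values())
    return (doc_counts.get(word, 0) + eps) / (total + eps * vocab_size)

def query_log_likelihood(query, doc_counts, doc_topics, topic_words, lam=0.7):
    """Log query likelihood under a linear mix of the two word models."""
    vocab_size = len(topic_words[0])
    score = 0.0
    for w in query:
        p_lm = lm_word_prob(w, doc_counts, vocab_size)
        p_lda = lda_word_prob(w, doc_topics, topic_words)
        score += math.log(lam * p_lm + (1.0 - lam) * p_lda)
    return score

# Toy setup: 4 visual words (ids 0..3), 2 topics, 2 word images (all values hypothetical).
topic_words = [
    [0.5, 0.3, 0.1, 0.1],  # P(w | topic 0)
    [0.1, 0.1, 0.4, 0.4],  # P(w | topic 1)
]
word_images = {
    "A": {"counts": {0: 3, 1: 2}, "topics": [0.9, 0.1]},
    "B": {"counts": {2: 3, 3: 2}, "topics": [0.1, 0.9]},
}
query = [0, 1, 0]  # visual words extracted from the query keyword image

scores = {
    name: query_log_likelihood(query, d["counts"], d["topics"], topic_words)
    for name, d in word_images.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup, word image A shares both visual words and the dominant topic with the query, so it ranks first. Setting `lam=1.0` falls back to the smoothed unigram model alone, and `lam=0.0` scores with the LDA term alone.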
Acknowledgements
The paper is supported by the National Natural Science Foundation of China under Grant 61463038 and the Research Project of Higher Education School of Inner Mongolia Autonomous Region under Grant NJZY14007.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wei, H., Gao, G., Su, X. (2016). LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_52
DOI: https://doi.org/10.1007/978-3-319-46681-1_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer Science, Computer Science (R0)