Abstract
To spot keywords in handwritten documents, we present a hybrid keyword spotting system that combines features extracted with Convolutional Deep Belief Networks and Dynamic Time Warping for word scoring. Features are learned from word images in an unsupervised manner, using a sliding window to extract horizontal patches. On two single-writer historical data sets, the proposed learned feature extractor outperforms two standard sets of features.
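The two generic building blocks named in the abstract, sliding-window patch extraction and Dynamic Time Warping scoring, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window width and stride parameters, and the Euclidean local distance are assumptions made for illustration, and `patch_features` is a hypothetical stand-in (flattened pixels) for the learned Convolutional Deep Belief Network features, which are not reproduced here.

```python
import math
from typing import List, Sequence

Image = List[List[float]]  # word image as a list of pixel rows


def sliding_windows(image: Image, width: int, stride: int) -> List[Image]:
    """Cut a word image into patches of `width` columns, moving left to right."""
    n_cols = len(image[0])
    return [
        [row[x:x + width] for row in image]
        for x in range(0, n_cols - width + 1, stride)
    ]


def patch_features(patch: Image) -> List[float]:
    """Hypothetical placeholder: flattened pixels instead of learned CDBN features."""
    return [pixel for row in patch for pixel in row]


def dtw_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Classic DTW cost between two feature sequences (Euclidean local distance)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def word_score(query: Image, candidate: Image, width: int = 3, stride: int = 1) -> float:
    """Score a candidate word image against a query image; lower means more similar."""
    q = [patch_features(p) for p in sliding_windows(query, width, stride)]
    c = [patch_features(p) for p in sliding_windows(candidate, width, stride)]
    return dtw_distance(q, c)
```

In this sketch a keyword would be spotted by ranking candidate word images by `word_score` against a query image. Both images need the same height so that patch feature vectors have equal length, and any length normalization of the DTW cost is left out.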
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wicht, B., Fischer, A., Hennebert, J. (2016). Keyword Spotting with Convolutional Deep Belief Networks and Dynamic Time Warping. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. ICANN 2016. Lecture Notes in Computer Science, vol 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_14
DOI: https://doi.org/10.1007/978-3-319-44781-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44780-3
Online ISBN: 978-3-319-44781-0
eBook Packages: Computer Science, Computer Science (R0)