Keyword Spotting with Convolutional Deep Belief Networks and Dynamic Time Warping

Wicht, Baptiste; Fischer, Andreas; Hennebert, Jean

doi:10.1007/978-3-319-44781-0_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9887)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

To spot keywords in handwritten documents, we present a hybrid keyword spotting system that extracts features with Convolutional Deep Belief Networks and scores words with Dynamic Time Warping. Features are learned from word images in an unsupervised manner, using a sliding window to extract horizontal patches. On two single-writer historical data sets, the proposed learned feature extractor outperforms two standard sets of features.
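
The two mechanical steps named in the abstract, sliding-window patch extraction and Dynamic Time Warping scoring, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean local distance, the normalization by the sum of the sequence lengths, and the `flatten` stand-in for the learned feature extractor are all assumptions made for illustration. In the paper, each patch would instead be passed through the trained Convolutional Deep Belief Network.

```python
import math
from typing import List, Sequence

Image = List[List[float]]   # rows x columns of pixel intensities
Vector = Sequence[float]


def sliding_patches(image: Image, width: int, step: int = 1) -> List[Image]:
    """Slide a window left to right over a word image and return the patches."""
    n_cols = len(image[0])
    return [[row[x:x + width] for row in image]
            for x in range(0, n_cols - width + 1, step)]


def flatten(patch: Image) -> List[float]:
    """Hypothetical stand-in for a learned feature extractor: raw pixel values."""
    return [value for row in patch for value in row]


def euclidean(a: Vector, b: Vector) -> float:
    """Local distance between two feature vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query: Sequence[Vector], candidate: Sequence[Vector]) -> float:
    """Classic DTW cost between two feature sequences; lower means more similar.

    The division by len(query) + len(candidate) is an assumed normalization
    so that long and short words remain comparable.
    """
    n, m = len(query), len(candidate)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(query[i - 1], candidate[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def word_score(query_image: Image, candidate_image: Image, width: int) -> float:
    """Score a candidate word image against a query image of the keyword."""
    query = [flatten(p) for p in sliding_patches(query_image, width)]
    candidate = [flatten(p) for p in sliding_patches(candidate_image, width)]
    return dtw_distance(query, candidate)
```

To spot a keyword with this sketch, one would compute `word_score` between a template image of the keyword and every segmented word image, then rank the candidates by ascending score.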

Author information

Authors and Affiliations

University of Applied Science of Western Switzerland, Delémont, Switzerland
Baptiste Wicht, Andreas Fischer & Jean Hennebert
University of Fribourg, Fribourg, Switzerland
Baptiste Wicht, Andreas Fischer & Jean Hennebert

Corresponding author

Correspondence to Baptiste Wicht.

Editor information

Editors and Affiliations

University of Lausanne, Lausanne, Switzerland
Alessandro E.P. Villa
University of Lausanne, Lausanne, Switzerland
Paolo Masulli
Universitat Politècnica de Catalunya, Terrassa, Spain
Antonio Javier Pons Rivero

Cite this paper

Wicht, B., Fischer, A., Hennebert, J. (2016). Keyword Spotting with Convolutional Deep Belief Networks and Dynamic Time Warping. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. Lecture Notes in Computer Science, vol 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_14

DOI: https://doi.org/10.1007/978-3-319-44781-0_14
Published: 13 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44780-3
Online ISBN: 978-3-319-44781-0
eBook Packages: Computer Science, Computer Science (R0)
