Abstract
In this article, we propose a new approach to segmentation-free word spotting that combines three contributions. Firstly, inspired by the success of bounding box proposal algorithms in object recognition, we propose a scheme to generate a set of word-independent text box proposals. To do so, we extract atomic bounding boxes through simple connected component analysis and combine them under a set of spatial constraints to obtain the final set of text box proposals. Secondly, an attribute representation based on the Pyramidal Histogram of Characters (PHOC) is encoded in an integral image and used to efficiently evaluate text box proposals for retrieval. Thirdly, we propose an indexing scheme for fast retrieval based on character n-grams. The index is built on a similar attribute space based on a Pyramidal Histogram of Character N-grams (PHON). All attribute models are learned using linear SVMs over the Fisher Vector representation of the word images, with the PHOC or PHON labels of the corresponding words as targets. We evaluate the proposed approach on both the query-by-string and query-by-example tasks on standard single- and multi-writer data sets, reporting state-of-the-art results.
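The integral image idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that attribute scores are available as one vector per cell of a regular grid over the page. That grid layout, the names `build_integral` and `box_sum`, and the use of a plain sum as the box descriptor are illustrative assumptions, not the encoding used in the article. The point is only that, once the cumulative table is built, the aggregated attribute vector of any candidate box costs four lookups per dimension, regardless of box size.

```python
from typing import List, Sequence

Vector = List[float]


def build_integral(grid: Sequence[Sequence[Sequence[float]]]) -> List[List[Vector]]:
    """Build a cumulative table of size (H+1) x (W+1).

    table[y][x] holds the element-wise sum of all vectors in rows 0..y-1
    and columns 0..x-1 of `grid` (hypothetical per-cell attribute scores).
    """
    height, width = len(grid), len(grid[0])
    dim = len(grid[0][0])
    table = [[[0.0] * dim for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            for d in range(dim):
                # Standard integral image recurrence, applied per dimension.
                table[y + 1][x + 1][d] = (
                    grid[y][x][d]
                    + table[y][x + 1][d]
                    + table[y + 1][x][d]
                    - table[y][x][d]
                )
    return table


def box_sum(table: List[List[Vector]], top: int, left: int,
            bottom: int, right: int) -> Vector:
    """Sum of the vectors in rows top..bottom-1 and columns left..right-1."""
    dim = len(table[0][0])
    return [
        table[bottom][right][d]
        - table[top][right][d]
        - table[bottom][left][d]
        + table[top][left][d]
        for d in range(dim)
    ]


# Toy 2 x 3 grid with 2-dimensional attribute vectors per cell.
grid = [
    [[1, 0], [2, 1], [3, 0]],
    [[4, 1], [5, 0], [6, 1]],
]
table = build_integral(grid)
whole_page = box_sum(table, 0, 0, 2, 3)
right_two_columns = box_sum(table, 0, 1, 2, 3)
```

In a retrieval setting, the vector returned by `box_sum` for each text box proposal would then be normalized and compared with the query representation. That scoring step is left out here because the abstract does not specify it.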











Notes
By segmentation-free we mean that the method can search for a word in a whole, non-segmented page, as opposed to a segmentation-based scenario, where retrieval is performed on pre-segmented word images.
Cite this article
Ghosh, S., Valveny, E. Text box proposals for handwritten word spotting from documents. IJDAR 21, 91–108 (2018). https://doi.org/10.1007/s10032-018-0300-7
DOI: https://doi.org/10.1007/s10032-018-0300-7