Abstract
Searching handwritten documents remains relatively unexplored in any language. Traditional approaches use either image-based or text-based techniques. This paper describes a framework for versatile search in which the query can be either text or an image, and retrieval fuses text-based and image-based methods. A Unicode query and an image query are both maintained throughout the search, and their results are combined by a neural network. Preliminary results are positive and can be further improved by refining the framework's components (text transcription and image search).
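To make the fusion step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each retrieval method returns a per-document score in [0, 1] and stands in for the combining neural network with a single logistic unit. The function names, the weights, the bias, and the document scores are all hypothetical, chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(text_score: float, image_score: float,
         w_text: float = 2.0, w_image: float = 2.0,
         bias: float = -2.0) -> float:
    """Combine one text score and one image score into a fused score.

    A single logistic unit is used as a stand-in for the combining
    network; the weights and bias are hypothetical, not trained values.
    """
    return sigmoid(w_text * text_score + w_image * image_score + bias)


def rank(text_scores: Dict[str, float],
         image_scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank every document seen by either method by fused score.

    A document missing from one method is given a score of 0.0 there
    (an assumption made for this sketch).
    """
    docs = set(text_scores) | set(image_scores)
    fused = {
        d: fuse(text_scores.get(d, 0.0), image_scores.get(d, 0.0))
        for d in docs
    }
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from the text (Unicode) and image retrieval paths.
text_scores = {"doc_a": 0.9, "doc_b": 0.2, "doc_c": 0.5}
image_scores = {"doc_a": 0.4, "doc_b": 0.8, "doc_d": 0.7}

ranking = rank(text_scores, image_scores)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this sketch a document that scores moderately on both paths can outrank one that scores well on only one path, which is the practical motivation for keeping both queries alive throughout the search.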
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Srihari, S.N., Ball, G.R., Srinivasan, H. (2008). Versatile Search of Scanned Arabic Handwriting. In: Doermann, D., Jaeger, S. (eds) Arabic and Chinese Handwriting Recognition. SACH 2006. Lecture Notes in Computer Science, vol 4768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78199-8_4
DOI: https://doi.org/10.1007/978-3-540-78199-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78198-1
Online ISBN: 978-3-540-78199-8
eBook Packages: Computer Science (R0)