Abstract
Arabic handwritten text recognition has received less attention than recognition of Latin-script languages. In this paper, we present our efforts to develop a comprehensive Arabic Handwritten Text Database (AHTD). At this stage, the database is planned to contain text written by 1000 writers from different countries; it currently holds data from over 300 writers. It is composed of an image database, which contains images of the written text at various resolutions, and a ground truth database, which contains metadata describing the written text at the page, paragraph, and line levels. Tools to extract paragraphs from pages and to segment paragraphs into lines have also been developed; segmentation of lines into words will follow. The database will be made freely available to researchers worldwide. It is hoped that the AHTD will stimulate research on various handwriting-related problems, such as text recognition and writer identification and verification.
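The abstract does not give the ground truth schema, so the following is only a minimal sketch of how metadata at the page, paragraph, and line levels might be nested. Every class name, field name, and value here (`PageRecord`, `writer_id`, `resolutions_dpi`, the placeholder transcriptions, the DPI figures) is a hypothetical choice made for illustration and is not taken from the AHTD itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineRecord:
    """Ground truth for one text line (hypothetical fields)."""
    line_id: int
    text: str  # transcription of the handwritten line


@dataclass
class ParagraphRecord:
    """Ground truth for one paragraph; lines are kept in reading order."""
    paragraph_id: int
    lines: List[LineRecord] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Paragraph transcription is the line transcriptions joined by newlines.
        return "\n".join(line.text for line in self.lines)


@dataclass
class PageRecord:
    """Ground truth for one scanned page (hypothetical fields)."""
    page_id: str
    writer_id: int
    resolutions_dpi: List[int]  # resolutions at which the page image is stored
    paragraphs: List[ParagraphRecord] = field(default_factory=list)

    def line_count(self) -> int:
        # Total number of lines across all paragraphs on the page.
        return sum(len(p.lines) for p in self.paragraphs)


def build_example_page() -> PageRecord:
    """Build a small page with made-up placeholder content."""
    first = ParagraphRecord(
        paragraph_id=1,
        lines=[
            LineRecord(1, "placeholder line 1"),
            LineRecord(2, "placeholder line 2"),
        ],
    )
    second = ParagraphRecord(
        paragraph_id=2,
        lines=[LineRecord(1, "placeholder line 3")],
    )
    return PageRecord(
        page_id="page-0001",           # illustrative identifier
        writer_id=1,                   # illustrative writer number
        resolutions_dpi=[200, 300, 600],  # illustrative values only
        paragraphs=[first, second],
    )


if __name__ == "__main__":
    page = build_example_page()
    print(page.page_id, "lines:", page.line_count())
    for paragraph in page.paragraphs:
        print(paragraph.paragraph_id, repr(paragraph.text))
```

Nesting lines inside paragraphs inside pages mirrors the three levels named in the abstract and lets a line-level tool and a page-level tool read the same record; a flat table keyed by page, paragraph, and line identifiers would be an equally plausible layout.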
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mahmoud, S.A., Ahmad, I., Alshayeb, M., Al-Khatib, W.G. (2011). A Database for Offline Arabic Handwritten Text Recognition. In: Kamel, M., Campilho, A. (eds) Image Analysis and Recognition. ICIAR 2011. Lecture Notes in Computer Science, vol 6754. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21596-4_40
DOI: https://doi.org/10.1007/978-3-642-21596-4_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21595-7
Online ISBN: 978-3-642-21596-4
eBook Packages: Computer Science, Computer Science (R0)