research-article

Morphar+: an Arabic morphosyntactic analyzer

Authors:
Zouhir Zemirli

Al-Imam Muhammad Ibn Saud Islamic University, Riyadh, Riyadh

Yahya O. Mohamed Elhadj

Al-Imam Muhammad Ibn Saud Islamic University, Riyadh, Riyadh

ICACCI '12: Proceedings of the International Conference on Advances in Computing, Communications and Informatics, August 2012, Pages 816–823. https://doi.org/10.1145/2345396.2345529

Published: 3 August 2012

ABSTRACT

We present Morphar+, an Arabic morpho-syntactic analyzer built on top of the free Arabic morphological analyzer AraMorph. AraMorph is known to produce a large number of morphological solutions, but it provides little information for selecting the appropriate solution for a word in context. To address this, we first characterized all particles of the Arabic language, the broken noun patterns, and most nominal and verbal sentence structures. We then formulated dozens of rules associated with these descriptions and implemented them in a simple and efficient manner, so that they help deduce not only the appropriate solution but also both the case and the case-ending marks. We divided the Arabic particles into groups according to their grammatical functions in order to extract the exact and final morphological function of words. Applying these contextual rules to the output produced by AraMorph yielded an improvement of about 6% in the number of correct words with an accurate morphological function. Our goal is to reduce the error rate to less than 5% so that this fast and accurate morpho-syntactic analyzer can be integrated into a system that translates written Arabic text into Arabic sign language; we believe it will provide enough information to achieve a quick and effective translation.
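To illustrate the general idea of a contextual rule that filters morphological candidates, here is a minimal Python sketch. It is not the authors' rule set and does not reproduce AraMorph's actual output format: the `Analysis` record, the `PREPOSITIONS` group, and the `disambiguate` function are all hypothetical names introduced for this example. The sketch encodes one well-known rule of Arabic grammar (a noun governed by a preposition takes the genitive case) and, as a simplification, identifies prepositions by their surface form in Buckwalter transliteration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Analysis:
    """One candidate morphological solution for a word (hypothetical format)."""
    lemma: str
    pos: str                     # e.g. "NOUN", "VERB", "PREP"
    case: Optional[str] = None   # filled in by a contextual rule


# Hypothetical particle group: a few prepositions in Buckwalter
# transliteration. Prepositions govern the genitive case.
PREPOSITIONS = {"fy", "mn", "ElY", "En"}


def disambiguate(words: List[str],
                 candidates: List[List[Analysis]]) -> List[List[Analysis]]:
    """Apply one contextual rule: after a preposition, keep only noun
    readings and mark them genitive. Other words are left unchanged."""
    resolved = []
    for i, options in enumerate(candidates):
        chosen = options
        if i > 0 and words[i - 1] in PREPOSITIONS:
            nouns = [a for a in options if a.pos == "NOUN"]
            if nouns:  # fall back to all candidates if no noun reading exists
                chosen = [Analysis(a.lemma, a.pos, "GEN") for a in nouns]
        resolved.append(chosen)
    return resolved


# Illustrative input: "ktb" is ambiguous between a verb and a noun reading.
words = ["fy", "ktb"]
candidates = [
    [Analysis("fiy", "PREP")],
    [Analysis("kataba", "VERB"), Analysis("kitAb", "NOUN")],
]
result = disambiguate(words, candidates)
```

In this toy input the verb reading of the second word is discarded because the preceding word belongs to the preposition group, and the remaining noun reading is tagged as genitive. A real system would need many such rules, one set per particle group, and would key them on the resolved analysis of the neighbouring word rather than on its surface string.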

Index Terms

Morphar+: an Arabic morphosyntactic analyzer
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published in
ICACCI '12: Proceedings of the International Conference on Advances in Computing, Communications and Informatics
August 2012
1307 pages
ISBN:9781450311960
DOI:10.1145/2345396
Editors:
Sabu M. Thampi
El-Sayed El-Afry
Javier Aguiar
General Chairs:
K. Gopalan, Purdue University Calumet
Sabu M. Thampi, Indian Institute of Information Technology and Management - Kerala, India
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Arabic morphology and syntax
automatic translation
broken plurals
contextual rules
particles
syntactic rules