Lahajat: A rule-based converter of standard Arabic lexical databases into spoken Arabic forms


Abstract:

Lexical resources for Arabic tend to focus on the standard version of the language (Modern Standard Arabic, MSA), which is mostly used in written and formal sources. However, the spread of informal genres has made it increasingly necessary to produce broader resources that encompass the features of spoken varieties commonly found in written texts. The Lahajat project addresses this need by providing a series of rule-based transformations that extend existing MSA lexical resources to cover typical morphonological features of spoken varieties. In particular, two case studies are presented, applied to two widely diverging varieties: Egyptian Arabic and Tunisian Arabish.
Date of Conference: 24-26 October 2016
Date Added to IEEE Xplore: 05 January 2017
Electronic ISSN: 2327-1884
Conference Location: Tangier, Morocco