A Framework for Language Resource Construction and Syntactic Analysis: Case of Arabic

Computational Linguistics and Intelligent Text Processing (CICLing 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9623)

Abstract

Language resources such as grammars and dictionaries are essential to any natural language processing application. Unfortunately, building these resources by hand is laborious and time-consuming. Using an annotated corpus as a knowledge source can speed up the construction of a grammar for a given language. In this paper, we present a framework that automatically induces a syntactic grammar, in our case a probabilistic context-free grammar (PCFG), from an annotated Arabic corpus, the Penn Arabic Treebank. The system lets the user build the PCFG from the syntactic trees of the annotated corpus. It also offers the possibility of parsing Arabic sentences with the generated resource. Finally, we present evaluation results.
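To make the idea of inducing a probabilistic context-free grammar from treebank trees concrete, here is a minimal sketch in Python. It is not the authors' system: it assumes the standard relative-frequency estimate, P(A -> b) = count(A -> b) / count(A), and it runs on two made-up bracketed trees with a simplified, hypothetical tag set and transliterated placeholder words rather than on real Penn Arabic Treebank data. The function names (parse_tree, productions, induce_pcfg) are illustrative only.

```python
from collections import Counter
from fractions import Fraction


def parse_tree(text):
    """Parse one bracketed tree such as "(S (VP (VERB x)))" into nested
    tuples (label, child, ...); leaf words are kept as plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, *children)

    return read()


def productions(tree):
    """Yield one (lhs, rhs) pair per internal node; terminals are quoted."""
    label, *children = tree
    rhs = tuple(f"'{c}'" if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


def induce_pcfg(trees):
    """Relative-frequency estimate of rule probabilities (assumed here)."""
    rule_counts = Counter(p for t in trees for p in productions(t))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: Fraction(n, lhs_counts[rule[0]])
            for rule, n in rule_counts.items()}


# Hypothetical toy treebank: labels and words are placeholders only.
TOY_TREES = [
    "(S (VP (VERB kataba) (NP (NOUN Alwaladu))))",
    "(S (VP (VERB nAma) (NP (NOUN Alwaladu) (ADJ AlSagiyr))))",
]

grammar = induce_pcfg([parse_tree(t) for t in TOY_TREES])
for (lhs, rhs), prob in sorted(grammar.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{prob}]")
```

In this toy treebank NP is expanded once as NOUN and once as NOUN ADJ, so each of those rules receives probability 1/2, while rules whose left-hand side has a single observed expansion receive probability 1. Parsing new sentences with the induced grammar (for example with a probabilistic chart parser) is a separate step that this sketch does not cover.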

Notes

  1. http://nlp.stanford.edu/software/parser-arabic-data-splits.shtml

Author information

Corresponding author

Correspondence to Nabil Khoufi.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Khoufi, N., Aloulou, C., Hadrich Belguith, L. (2018). A Framework for Language Resource Construction and Syntactic Analysis: Case of Arabic. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_25

  • DOI: https://doi.org/10.1007/978-3-319-75477-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75476-5

  • Online ISBN: 978-3-319-75477-2

  • eBook Packages: Computer Science, Computer Science (R0)
