A Rule-Based Morphosemantic Analyzer for French for a Fine-Grained Semantic Annotation of Texts

  • Conference paper
Systems and Frameworks for Computational Morphology (SFCM 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 380)

Abstract

We describe DériF, a rule-based morphosemantic analyzer developed for French. Unlike existing word segmentation tools, DériF provides derived and compound words with various sorts of semantic information: (1) a definition, computed from both the base meaning and the specificities of the morphological rule; (2) lexical-semantic features, inferred from general linguistic properties of derivation rules; (3) lexical relations (synonymy, (co-)hyponymy) with other, morphologically unrelated, words belonging to the same analyzed corpus.
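As a rough illustration of points (1) and (2), the following minimal sketch shows how a rule-based analyzer could derive a definition and a feature set from a base word and a derivation rule. It is not DériF's implementation: the `Rule` and `Analysis` classes, the `analyze` function, the two suffix rules, the English gloss templates, and the tiny lexicon are all hypothetical and chosen only to make the idea concrete. Point (3), lexical relations across a corpus, is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    suffix: str        # surface suffix stripped from the derived word
    base_ending: str   # ending restored to recover the base lemma
    gloss: str         # definition template; {base} receives the base lemma
    features: tuple    # features shared by every word this rule builds


# Toy rules invented for illustration; they are NOT DériF's actual rules.
RULES = (
    Rule("able", "er", "that can undergo the action '{base}'", ("adjective", "potential")),
    Rule("eur", "er", "agent who performs the action '{base}'", ("noun", "agent")),
)


@dataclass(frozen=True)
class Analysis:
    word: str
    base: str
    definition: str
    features: tuple


def analyze(word: str, lexicon: set) -> Optional[Analysis]:
    """Return an analysis if some rule maps `word` to a base found in `lexicon`."""
    for rule in RULES:
        if word.endswith(rule.suffix):
            base = word[: -len(rule.suffix)] + rule.base_ending
            # Accept the analysis only when the candidate base is a known lemma,
            # so that an underived word such as "table" is left unanalyzed.
            if base in lexicon:
                return Analysis(word, base, rule.gloss.format(base=base), rule.features)
    return None


# Hypothetical mini-lexicon of base verbs.
LEXICON = {"laver", "danser"}

if __name__ == "__main__":
    for w in ("lavable", "danseur", "table"):
        print(w, "->", analyze(w, LEXICON))
```

Under these assumptions, the definition comes from combining the base lemma with the rule's gloss template, while the features come from the rule alone, which mirrors the division of labor the abstract describes.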

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Namer, F. (2013). A Rule-Based Morphosemantic Analyzer for French for a Fine-Grained Semantic Annotation of Texts. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40486-3_6

  • DOI: https://doi.org/10.1007/978-3-642-40486-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40485-6

  • Online ISBN: 978-3-642-40486-3

  • eBook Packages: Computer Science (R0)
