Abstract
In email and many other domains, a growing number of French texts are wrongly accented or entirely unaccented. Yet in French, accents carry linguistic value and function: they express the language's subtleties and, above all, help avoid ambiguity and misinterpretation. In most cases the information lost when accents are missing is not a major problem for human readers, but it is highly problematic for automatic text processing and increases the ambiguity that Natural Language Processing must handle. Restoring accents manually is tedious, hence the importance of automatic accent restoration systems. This paper presents a novel system for the automatic restoration of accents in French texts. Unlike some existing approaches that use statistical methods, our approach is essentially based on linguistic rules, which are more reliable.
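To make the ambiguity concrete, the following minimal Python sketch shows how several distinct French words collapse to the same string once accents are removed. It is an illustration only and not the system described in this paper: the toy lexicon and the helper names `strip_accents`, `build_index`, and `candidates` are assumptions introduced here.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word: str) -> str:
    """Remove combining diacritics, e.g. 'côté' becomes 'cote'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical toy lexicon; a real system would rely on a full dictionary.
LEXICON = ["cote", "côte", "côté", "coté", "mais", "maïs", "sur", "sûr", "texte"]


def build_index(lexicon):
    """Map each unaccented form to every accented word that produces it."""
    index = defaultdict(list)
    for word in lexicon:
        index[strip_accents(word)].append(word)
    return index


def candidates(unaccented: str, index) -> list:
    """Return the possible accented forms; unknown words are returned as-is."""
    return index.get(unaccented, [unaccented])


INDEX = build_index(LEXICON)

for form in ["cote", "mais", "sur", "texte"]:
    print(form, "->", candidates(form, INDEX))
```

A form such as "cote" has four candidates in this lexicon, so a lookup alone cannot decide which one was intended. Choosing among them requires context, which is the step that rule-based or statistical restoration methods address.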
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feuto Njonko, P.B., Cardey-Greenfield, S., Greenfield, P. (2012). Linguistic Rules Based Approach for Automatic Restoration of Accents on French Texts. In: Isahara, H., Kanzaki, K. (eds) Advances in Natural Language Processing. JapTAL 2012. Lecture Notes in Computer Science, vol 7614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33983-7_12
DOI: https://doi.org/10.1007/978-3-642-33983-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33982-0
Online ISBN: 978-3-642-33983-7
eBook Packages: Computer Science, Computer Science (R0)