Abstract
The aim of the French MEDIA project was to define a protocol for the evaluation of speech understanding modules for dialog systems. Accordingly, a corpus of 1,257 real spoken dialogs related to hotel reservation and tourist information was recorded, transcribed and semantically annotated, and a semantic attribute-value representation was defined in which each conceptual relationship is represented by the names of the attributes. Two semantic annotation levels are distinguished in this approach. At the first level, each utterance is considered separately and the annotation represents the meaning of the statement without taking the dialog context into account. The second level of annotation corresponds to the interpretation of the statement in the light of the dialog context; in this way a semantic representation of the dialog context is defined. This paper discusses the data collection, the detailed definition of both annotation levels, and the annotation scheme. It then comments on the two evaluation campaigns carried out during the project and discusses some results.
Notes
Exception: indefinite alterity expressions (e.g. another N) are annotated. In this case, the excluded entity has been annotated instead of the actual referent, which is undetermined. This is observed in turn C 16 of the dialog given in the Appendix.
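As a minimal sketch of how this exception could be read when interpreting the annotation, the Python fragment below treats the reference carried by a refLink-coDom-exclusion segment as the entity to rule out and returns the remaining candidates. The segment numbers are those of turns W 9 and C 16 in the Appendix; the function name, the dictionary layout and the subset test are illustrative assumptions, not part of the MEDIA annotation tools.

```python
# Illustrative only: the helper name and data layout are assumptions.

def candidates_after_exclusion(entities, excluded_segments):
    """Return the names of the entities that are NOT the excluded one.

    entities: maps an entity name to the segment numbers that describe it.
    excluded_segments: segment numbers listed in the `reference` of a
        refLink-coDom-exclusion annotation (the entity to rule out).
    An entity is considered excluded when it contains every listed segment.
    """
    excluded = set(excluded_segments)
    return [name for name, segments in entities.items()
            if not excluded <= set(segments)]


# The three hotels proposed in turn W 9, with their segment numbers.
hotels = {
    "Méridien Bastille": [13, 14, 15, 16, 17],
    "Athanor": [13, 18, 19, 20],
    "Richard Lenoir": [13, 21, 22, 23],
}

# Turn C 16, segment 57 ("autre"): reference="13,21" points at the
# excluded hotel (Richard Lenoir), not at the hotel the client wants.
remaining = candidates_after_exclusion(hotels, [13, 21])
print(remaining)
```

Under these assumptions the actual referent stays undetermined: the annotation only narrows the candidates down to the hotels that were not excluded.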
Acknowledgments
Thanks to Christelle Ayache, Frédéric Béchet, Laurence Devillers, Anne Kuhn, Fabrice Lefèvre, Djamel Mostefa, Sophie Rosset and Jeanne Villaneau for their participation in the project.
Appendix
We give a fully annotated dialog (#1037) from the MEDIA corpus, where W is the wizard and C the client. Below each utterance, the sequence of segments is given with the corresponding contextual annotation. The segment numbers (1–85) are used as targets in the annotation of referring expressions.
W 1 | “...quelle information désirez-vous” | ||
“...which information would you like” | |||
C 2 | “je voudrais faire une réservation pour le trente et un mai deux jours deux nuits à Paris mais dans un hôtel qui se trouverait près de la place de la Bastille s’ il vous plaît pour six chambres individuelles” | ||
“I’d like to book for May the 31st 2 days 2 nights in Paris but in a hotel which is near the place de la Bastille please for six single rooms” | |||
1 | je voudrais faire une réservation | +/command-task: reservation | |
2 | pour le trente et un mai | +/time-date: 05/31 | |
3 | deux jours deux nuits | +/stay-nbNight-reservation: 2 | |
4 | à Paris | +/location-city: paris | |
5 | mais | +/connectProp: addition | |
6 | dans un hôtel | +/DBObject: hotel | |
7 | près de | +/location-relativeDistance-hotel: near | |
8 | la place de la Bastille | +/location-street-hotel: bastille | |
9 | pour six | +/number-room-reservation: 6 | |
10 | chambres individuelles | +/room-type: single | |
W 3 | “vous souhaitez faire une réservation à Paris” | ||
“You’d like to book in Paris” | |||
C 4 | “oui {yes}” | ||
11 | oui | +/response: yes | |
W 5 | “près de la place de la Bastille” | ||
“near the place de la Bastille” | |||
C 6 | “oui madame {yes}” | ||
12 | oui | +/response: yes | |
W 7 | “veuillez patienter je recherche vos informations” | ||
“please wait I’m looking for your information” | |||
C 8 | “merci bien {thanks}” | ||
W 9 | “ à Paris je vous propose trois hôtels le Méridien Bastille la chambre est à soixante euros l’ athanor hôtel la chambre est à quatre-vingt-cinq euros l’ hôtel Richard Lenoir la chambre est à cinquante-cinq euros voulez-vous réserver dans l’ un de ces hôtels ou obtenir plus d’ informations” | ||
“in Paris I propose you 3 hotels the Bastille Méridien the room is 60 euros the Athanor hotel the room is 85 euros the Richard Lenoir hotel the room is 55 euros do you want to book in one of those hotels or ask for more information” | |||
13 | à Paris | +/location-city-hotel: paris | |
14 | le Méridien | +/hotel-trademark: Méridien | |
15 | Bastille | +/name-hotel: bastille | |
16 | soixante | +/payment-amount-integer-room: 60 | |
17 | euros | +/payment-unit: euro | |
18 | l’ athanor hôtel | +/name-hotel: athanor | |
19 | quatre-vingt-cinq | +/payment-amount-integer-room: 85 | |
20 | euros | +/payment-unit: euro | |
21 | l’ hôtel Richard Lenoir | +/name-hotel: richard lenoir | |
22 | cinquante-cinq | +/payment-amount-integer-room: 55 | |
23 | euros | +/payment-unit: euro | |
C 10 | “je veux dire je voudrais savoir si les chambres que je vais réserver les chambres six chambres individuelles donnent sur une cour et est-ce qu’ il y a un parking privé” | ||
“I mean I’d like to know if the rooms I’m going to book the rooms six single rooms overlook a courtyard and if there is a private parking” | |||
24 | les | +/refLink-coRef: plural | |
reference="13,14,15,16,17; 13,18,19,20; 13,21,22,23" | |||
25 | chambres | +/object: room | |
26 | que je vais réserver | +/command-task: reservation | |
27 | les | +/refLink-coRef: plural | |
reference="13,14,15,16,17; 13,18,19,20; 13,21,22,23" | |||
28 | chambres | +/object: room | |
29 | six | +/number-room-reservation: 6 | |
30 | chambres individuelles | +/room-type: single | |
31 | donnent sur | ?/location-relativeDistance-hotel: near | |
32 | une cour | ?/location-relativePlace-general-hotel: unknown | |
33 | et | +/connectProp: addition | |
34 | un parking privé | ?/hotel-parking: private | |
W 11 | “veuillez patienter je recherche cette information je vous propose l’ hôtel Richard Lenoir cet hôtel se situe dans un endroit calme près de la place de la Bastille l’ hôtel est équipé d’ un parking privé surveillé souhaitez-vous faire une réservation dans cet hôtel” | ||
“please wait I’m looking for your information I propose the Richard Lenoir hotel this hotel is located in a quiet place near the place de la Bastille and has got a private parking do you want to book in this hotel” | |||
C 12 | “euh j(e) il y a le parking privé mais c’est un hôtel vous me dites qui est très calme donc il ne donne pas sur une cour il donne sur un boulevard ou pouvez-vous me le situer s’ il vous plaît” | ||
“euh I there is a private parking but you tell me it is a very quiet hotel so it does not overlook a courtyard it overlooks a boulevard or can you locate it for me please” | |||
35 | le parking privé | +/hotel-parking: private | |
36 | mais | +/connectProp: opposition | |
37 | c’est | +/refLink-coRef: singular | |
reference="13,21" | |||
38 | un hôtel | +/DBObject: hotel | |
39 | très calme | -/location-relativePlace-general-hotel: livelyDistrict | |
40 | donc | +/connectProp: implies | |
41 | il | +/refLink-coRef: singular | |
reference="13,21" | |||
42 | donne pas sur | -/location-relativeDistance-hotel: near | |
43 | une cour | -/location-relativePlace-general-hotel: unknown | |
44 | il | +/refLink-coRef: singular | |
reference="13,21" | |||
45 | donne sur | ?/location-relativeDistance-hotel: near | |
46 | un boulevard | ?/location-relativePlace-general-hotel: unknown | |
47 | le | +/refLink-coRef: singular | |
reference="13,21" | |||
48 | situer | ?/object: location-hotel | |
W 13 | “je suis désolée je n’ ai pas ce type d’ informations” | ||
“Sorry I don’t have that kind of information” | |||
C 14 | “bon ben écoutez je vais réserver dans cet hôtel hôtel Richard Lenoir donc six chambres individuelles pour le trente et un mai deux jours et deux nuits hein” | ||
“well listen I’ll book in this hotel hotel Richard Lenoir so 6 single rooms on the 31st of may 2 days and 2 nights OK” | |||
49 | je vais réserver | +/command-task: reservation | |
50 | dans cet hôtel hôtel Richard Lenoir | +/name-hotel: richard lenoir | |
51 | six | +/number-room-reservation: 6 | |
52 | chambres individuelles | +/room-type: single | |
53 | pour le trente et un mai | +/time-date-reservation: 05/31 | |
54 | deux jours et deux nuits | +/stay-nbNight-reservation: 2 | |
W 15 | “merci de patienter je vérifie les disponibilités cet hôtel est complet il n’ y a plus de chambres libres correspondant à vos critères souhaitez-vous changer de dates ou réserver dans un autre hôtel” | ||
“please wait I’m checking for the availability this hotel is full there is no more free room corresponding to your choices do you wish to change the date or book in another hotel” | |||
C 16 | “alors je réserve dans un autre hôtel qui a les mêmes critères hein” | ||
“so I book in another hotel with the same conditions OK” | |||
55 | je réserve | +/command-task: reservation | |
56 | un | +/number-hotel: 1 | |
57 | autre | +/refLink-coDom-exclusion: singular | |
reference="13,21" | |||
58 | hôtel | +/DBObject: hotel | |
59 | les mêmes critères | +/object: undetermined | |
W 17 | “merci de patienter je vous propose le Méridien Bastille la chambre est à soixante euros souhaitez-vous faire une réservation dans cet hôtel” | ||
“please wait I propose the Méridien Bastille the room is 60 euros do you wish to book in this hotel” | |||
C 18 | “mais écoutez je vais faire la réservation dans cet hôtel il y a bien un parking privé et ça donne s() est-ce que ça donne sur une cour ou sur une rue tranquille” | ||
“but listen I will book in this hotel there is indeed a private parking and it overlooks does it overlook a courtyard or a quiet road” | |||
60 | je vais faire la réservation | +/command-task: reservation | |
61 | cet | +/refLink-coRef: singular | |
reference="13,14,15" | |||
62 | hôtel | +/DBObject: hotel | |
63 | il y a bien | +/command-dial: confirmation-request | |
64 | un parking privé | ?/hotel-parking: private | |
65 | et | +/connectProp: addition | |
66 | donne sur | ?/location-relativeDistance-hotel: near | |
67 | une cour | ?/location-relativePlace-general-hotel: unknown | |
68 | ou | +/connectProp: alternative | |
69 | sur | ?/location-relativeDistance-hotel: near | |
70 | une rue tranquille | ?/location-relativePlace-general-hotel: livelyDistrict | |
W 19 | “cet hôtel se situe dans un endroit calme près de la place de la Bastille l’ hôtel est équipé d’ un parking privé souhaitez-vous faire une réservation dans cet hôtel” | ||
“this hotel is located in a quiet place near the place de la Bastille the hotel has got a private parking do you want to book in this hotel” | |||
C 20 | “ben écoutez je vais faire une réservation dans cet hôtel pour six chambres individuelles hein” | ||
“well listen I’m going to book in this hotel for 6 single rooms OK” | |||
71 | je vais faire une réservation | +/command-task: reservation | |
72 | cet | +/refLink-coRef: singular | |
reference="13,14,15" | |||
73 | hôtel | +/DBObject: hotel | |
74 | pour six | +/number-room-reservation: 6 | |
75 | chambres individuelles | +/room-type: single | |
W 21 | “j’ effectue votre réservation le montant de votre séjour s’ élève à sept cent vingt euros le numéro de dossier correspondant est le zéro soixante-neuf cent quatre-vingts désirez-vous une autre information” | ||
“I’m making your reservation the amount of your stay will be 720 euros the file number is 069180 would you like any other information” | |||
C 22 | “oui euh j’ aimerais savoir est-ce que le petit déjeuner est compris dans la réservation enfin de la réservation dans le prix de la chambre” | ||
“euh yes I’d like to know if breakfast is included in the reservation well the reservation the price for the room” | |||
76 | oui | +/response: yes | |
77 | le petit déjeuner est compris | ?/hotel-services: breakfastInclude | |
78 | dans la réservation | +/command-task: reservation | |
79 | le | +/refLink-coRef: singular | |
reference="16,17" | |||
80 | prix | +/object: payment-amount-reservation-room | |
81 | la | +/refLink-coRef: singular | |
reference="13,14,15,16,17,10" | |||
82 | chambre | ?/object: room | |
W 23 | “il vous sera demandé cinq euros supplémentaires pour une formule petit déjeuner” | ||
“breakfast is 5 euros more” | |||
C 24 | “bon ben écoutez je vous remercie de tous ces renseignements donc je confirme et je réserve” | ||
“well listen I thank you for this information so I confirm and I book” | |||
83 | je confirme | +/command-dial: confirmation-notice | |
84 | et | +/connectProp: addition | |
85 | je réserve | +/command-task: reservation | |
W 25 | “merci d’ avoir utilisé le serveur vocal MEDIA au revoir” | ||
“thank you for using the MEDIA vocal server goodbye” | |||
C 26 | “au revoir madame et à bientôt au revoir” | ||
“goodbye madam and see you soon goodbye” |
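The segment rows above follow a regular pattern: a mode sign, a slash, an attribute name, a colon and a value, optionally followed by a reference row listing segment numbers. The following Python sketch shows one way such rows could be read into structured records. It is an illustration under stated assumptions, not the format definition used by the project: the names `Annotation`, `parse_annotation` and `parse_reference` are hypothetical, and the reading of the three signs seen in this dialog ("+", "-", "?") as affirmative, negative and interrogative is inferred from the examples.

```python
import re
from typing import List, NamedTuple


class Annotation(NamedTuple):
    mode: str       # "+", "-" or "?" as they appear in the dialog above
    attribute: str  # e.g. "location-city" or "time-date-reservation"
    value: str      # e.g. "paris" or "05/31"


# mode sign, "/", attribute (no colon inside), ":", value
_ANNOTATION = re.compile(r"^\s*([+\-?])/([^:]+?)\s*:\s*(.+?)\s*$")


def parse_annotation(text: str) -> Annotation:
    """Parse one annotation cell such as '+/location-city: paris'."""
    match = _ANNOTATION.match(text)
    if match is None:
        raise ValueError(f"not a mode/attribute: value annotation: {text!r}")
    return Annotation(*match.groups())


def parse_reference(text: str) -> List[List[int]]:
    """Parse a reference row into groups of segment numbers.

    Groups are separated by ';' and numbers by ','. For example,
    'reference="13,21"' gives [[13, 21]].
    """
    inner = text.split('"')[1]  # keep only what is between the quotes
    return [[int(number) for number in group.split(",")]
            for group in inner.split(";")]


# Segment 4 of turn C 2 and the reference row of segment 24 (turn C 10).
city = parse_annotation("+/location-city: paris")
groups = parse_reference('reference="13,14,15,16,17; 13,18,19,20; 13,21,22,23"')
print(city)
print(groups)
```

Keeping each `;`-separated group as its own list preserves the distinction between a plural reference to several entities (segment 24) and a singular reference to one entity described by several segments (segment 37).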
Cite this article
Bonneau-Maynard, H., Quignard, M. & Denis, A. MEDIA: a semantically annotated corpus of task oriented dialogs in French. Lang Resources & Evaluation 43, 329–354 (2009). https://doi.org/10.1007/s10579-009-9103-2
DOI: https://doi.org/10.1007/s10579-009-9103-2