DOI: 10.1145/1452392.1452413
poster

Knowledge and data flow architecture for reference processing in multimodal dialog systems

Published: 20 October 2008

Abstract

This paper is concerned with the part of our system dedicated to processing the user's designation activities during multimodal information search. We highlight the need for specific knowledge in multimodal input processing, and we propose and describe a knowledge model together with the associated processing architecture. The knowledge model covers the natural language and the visual context; it is adapted to the kind of application and allows several types of input filtering. Part of this knowledge is updated dynamically to take the interaction history into account. In the proposed architecture, each input modality is first processed using the modeled knowledge, producing intermediate structures. These structures are then fused, using dynamic knowledge, to determine the referent the user intends. The fusion steps take into account the possible combinations of modalities as well as the clues carried by each modality (linguistic clues, gesture type). The development of this part of our system is largely complete and has been tested.
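
The two-stage data flow described in the abstract (per-modality processing into intermediate structures, then fusion against dynamic knowledge) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: every class, function, and field name here is hypothetical, the category matching and distance-based gesture filtering are assumed stand-ins for the paper's knowledge-based filtering, and gesture type and finer linguistic clues are left out for brevity.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class VisualObject:            # hypothetical entry in the visual context
    ident: str
    category: str
    x: float
    y: float


@dataclass
class LinguisticStructure:     # hypothetical intermediate structure (speech)
    category: Optional[str]    # object category named in the utterance, if any


@dataclass
class GestureStructure:        # hypothetical intermediate structure (gesture)
    candidates: List[str]      # ids of objects near the gesture, nearest first


@dataclass
class InteractionHistory:      # dynamic knowledge: past referents, oldest first
    referents: List[str] = field(default_factory=list)


def process_language(utterance: str, categories: Set[str]) -> LinguisticStructure:
    """Filter the utterance with static linguistic knowledge (known categories)."""
    words = utterance.lower().split()
    return LinguisticStructure(next((w for w in words if w in categories), None))


def process_gesture(x: float, y: float, scene: List[VisualObject],
                    radius: float) -> GestureStructure:
    """Filter the visual context: keep objects within `radius` of the gesture."""
    def dist(obj: VisualObject) -> float:
        return math.hypot(obj.x - x, obj.y - y)
    near = sorted((o for o in scene if dist(o) <= radius), key=dist)
    return GestureStructure([o.ident for o in near])


def fuse(ling: LinguisticStructure, gesture: Optional[GestureStructure],
         scene: List[VisualObject], history: InteractionHistory) -> Optional[str]:
    """Combine the intermediate structures and return the referent id, if any."""
    category_of = {o.ident: o.category for o in scene}
    if gesture is not None:
        pool = gesture.candidates                  # speech + gesture
    else:
        pool = list(reversed(history.referents))   # speech only: most recent first
    matching = [i for i in pool
                if ling.category is None or category_of.get(i) == ling.category]
    referent = matching[0] if matching else None
    if referent is not None:
        history.referents.append(referent)         # update dynamic knowledge
    return referent


categories = {"castle", "beach"}
scene = [VisualObject("c1", "castle", 10, 10),
         VisualObject("b1", "beach", 12, 11),
         VisualObject("c2", "castle", 80, 80)]
history = InteractionHistory()

# Spoken designation with an imprecise gesture that lands on the beach.
first = fuse(process_language("show me this castle", categories),
             process_gesture(12, 11, scene, radius=5), scene, history)
# Follow-up without a gesture, resolved from the interaction history.
second = fuse(process_language("zoom on the castle", categories),
              None, scene, history)
```

In this sketch the gesture is nearest to the beach, but the linguistic category filters the candidates down to the neighbouring castle; the follow-up utterance, which has no gesture, is resolved against the history instead.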

Cited By

  • (2009) Salience in the generation of multimodal referring acts. Proceedings of the 2009 international conference on Multimodal interfaces, pages 207-210. DOI: 10.1145/1647314.1647351. Online publication date: 2-Nov-2009.
  • (2009) Modeling and Using Salience in Multimodal Interaction Systems. Proceedings of the 13th International Conference on Human-Computer Interaction. Part II: Novel Interaction Methods and Techniques, pages 12-18. DOI: 10.1007/978-3-642-02577-8_2. Online publication date: 14-Jul-2009.

      Published In

      ICMI '08: Proceedings of the 10th international conference on Multimodal interfaces
      October 2008
      322 pages
      ISBN:9781605581989
      DOI:10.1145/1452392

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. gesture
      2. multimodal fusion
      3. multimodal human-computer communication
      4. natural language
      5. reference

      Qualifiers

      • Poster

      Conference

      ICMI '08
      ICMI '08: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES
      October 20 - 22, 2008
      Crete, Chania, Greece

      Acceptance Rates

      Overall Acceptance Rate 453 of 1,080 submissions, 42%
