Research article
DOI: 10.1145/1891903.1891945

Location grounding in multimodal local search

Published: 08 November 2010

Abstract

Computational models of dialog context have often focused on unimodal spoken dialog or text, using the language itself as the primary locus of contextual information. But as we move from spoken interaction to situated multimodal interaction on mobile platforms supporting a combination of spoken dialog with graphical interaction, touch-screen input, geolocation, and other non-linguistic contextual factors, we will need more sophisticated models of context that capture the influence of these factors on semantic interpretation and dialog flow. Here we focus on how users establish the location they deem salient from the multimodal context by grounding it through interactions with a map-based query system. While many existing systems rely on geolocation to establish the location context of a query, we hypothesize that this approach often ignores the grounding actions users make, and provide an analysis of log data from one such system that reveals errors that arise from that faulty treatment of grounding. We then explore and evaluate, using live field data from a deployed multimodal search system, several different context classification techniques that attempt to learn the location contexts users make salient by grounding them through their multimodal actions.
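The classification task the abstract describes — deciding which location a user has grounded as salient (the device's geolocation, the map region they have navigated to, or a place named in the query itself) — can be sketched as follows. This is an illustrative stand-in, not the paper's actual features, thresholds, or classifiers: the feature names, the 30-second recency window, and the decision rules are all invented for exposition.

```python
from dataclasses import dataclass

@dataclass
class QueryContext:
    # Hypothetical features one might extract from multimodal interaction logs
    secs_since_map_gesture: float   # time since the user last panned/zoomed/touched the map
    viewport_matches_gps: bool      # does the visible map region contain the device location?
    has_explicit_location: bool     # does the spoken query name a place ("pizza in Boston")?

def salient_location_context(ctx: QueryContext) -> str:
    """Illustrative stand-in for a learned context classifier: predict which
    location context the user's grounding actions have made salient."""
    if ctx.has_explicit_location:
        return "stated-location"    # the query itself carries the location
    if ctx.secs_since_map_gesture < 30.0:
        return "map-viewport"       # a recent map gesture grounds the visible region
    if ctx.viewport_matches_gps:
        return "geolocation"        # map still shows where the device is; GPS is a safe default
    return "map-viewport"           # user navigated away earlier; the viewport stays salient

# Example: the user just panned the map to another city, then asked "coffee shops"
print(salient_location_context(QueryContext(5.0, False, False)))   # → map-viewport
```

In a deployed system these hand-written rules would be replaced by a classifier trained on logged interactions, but the feature space — gesture recency, viewport/GPS agreement, explicit location mentions — illustrates the kind of multimodal evidence such a model would consume.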




Published In

ICMI-MLMI '10: International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
November 2010
311 pages
ISBN:9781450304146
DOI:10.1145/1891903

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. dialog
  2. gesture
  3. location-based
  4. multimodal
  5. search
  6. speech

Qualifiers

  • Research-article

Conference

ICMI-MLMI '10

Acceptance Rates

ICMI-MLMI '10 paper acceptance rate: 41 of 100 submissions (41%).
Overall acceptance rate: 453 of 1,080 submissions (42%).


Cited By

  • (2019) Multimodal integration for interactive conversational systems. The Handbook of Multimodal-Multisensor Interfaces, 21-76. DOI: 10.1145/3233795.3233798
  • (2013) A multimodal dialogue interface for mobile local search. Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces, 63-64. DOI: 10.1145/2451176.2451200
  • (2013) A unified framework for multimodal retrieval. Pattern Recognition 46:12, 3358-3370. DOI: 10.1016/j.patcog.2013.05.023
  • (2012) Multimodal dialogue in mobile local search. Proceedings of the 14th ACM International Conference on Multimodal Interaction, 303-304. DOI: 10.1145/2388676.2388741
  • (2012) Collecting multimodal data in the wild. Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, 339-340. DOI: 10.1145/2166966.2167042
  • (2012) Multimodal interaction patterns in mobile local search. Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, 21-24. DOI: 10.1145/2166966.2166970
  • (2011) Multimodal local search in Speak4it. Proceedings of the 16th International Conference on Intelligent User Interfaces, 435-436. DOI: 10.1145/1943403.1943486
  • (2011) Speech and Multimodal Interaction in Mobile Search. IEEE Signal Processing Magazine 28:4, 40-49. DOI: 10.1109/MSP.2011.941073
  • (2010) Speak4It: Multimodal interaction in the wild. 2010 IEEE Spoken Language Technology Workshop, 159-160. DOI: 10.1109/SLT.2010.5700840
