short-paper

Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding

Authors:

Yun-Nung Chen,

Ming Sun,

Alexander I. Rudnicky,

Anatole Gershman

ICMI '15: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction

Pages 83 - 86

https://doi.org/10.1145/2818346.2820781

Published: 09 November 2015


Abstract

Spoken language interfaces are appearing in various smart devices (e.g., smartphones, smart TVs, in-car navigation systems), where they serve as intelligent assistants (IAs). However, most of them do not consider individual users' behavioral profiles and contexts when modeling user intents. Such behavioral patterns are user-specific and provide useful cues for improving spoken language understanding (SLU). This paper focuses on leveraging app behavior history to improve the performance of spoken dialog systems. We developed a matrix factorization approach that models speech and app usage patterns to predict user intents (e.g., launching a specific app). We collected multi-turn interactions in a Wizard-of-Oz (WoZ) scenario in which users were asked to reproduce multi-app tasks that they had performed earlier on their smartphones. By modeling the latent semantics behind lexical and behavioral patterns, the proposed multi-model system achieves about 52% turn accuracy for intent prediction on ASR transcripts.
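To make the idea of factorizing lexical and behavioral patterns together more concrete, the following is a minimal sketch and not the authors' model. It assumes a toy matrix whose rows are user turns and whose columns are word features, a previous-app feature, and the intended app; it swaps in a plain truncated SVD with a least-squares fold-in for whatever factorization the paper actually uses. The vocabulary, app names, training turns, and the helpers `encode`, `fit_column_factors`, and `predict_intent` are all hypothetical.

```python
import numpy as np

# Hypothetical toy vocabulary, app-history values, and target apps.
# None of these names come from the paper.
WORDS = ["send", "photo", "email", "navigate", "home", "message"]
PREV_APPS = ["camera", "maps", "none"]
INTENTS = ["gmail", "maps", "messenger"]

# Column layout: lexical features, then behavioral features, then intents.
COLUMNS = (WORDS
           + ["prev:" + a for a in PREV_APPS]
           + ["app:" + a for a in INTENTS])
N_FEATURES = len(WORDS) + len(PREV_APPS)


def encode(words, prev_app, intent=None):
    """Binary row: words spoken, previously used app, optional intended app."""
    row = np.zeros(len(COLUMNS))
    for name in list(words) + ["prev:" + prev_app]:
        row[COLUMNS.index(name)] = 1.0
    if intent is not None:
        row[COLUMNS.index("app:" + intent)] = 1.0
    return row


# Made-up training turns (assumption: one intended app per turn).
TRAIN = np.array([
    encode(["send", "photo", "email"], "camera", "gmail"),
    encode(["send", "email"], "none", "gmail"),
    encode(["navigate", "home"], "none", "maps"),
    encode(["navigate"], "maps", "maps"),
    encode(["send", "message"], "none", "messenger"),
    encode(["send", "photo", "message"], "camera", "messenger"),
])


def fit_column_factors(matrix, rank):
    """Truncated SVD; returns one latent vector per column (columns x rank)."""
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)
    return vt[:rank].T


def predict_intent(col_factors, words, prev_app):
    """Fold a new turn into the latent space and score the intent columns."""
    x = encode(words, prev_app)
    # Estimate the turn's latent vector from the observed feature columns only.
    u, *_ = np.linalg.lstsq(col_factors[:N_FEATURES], x[:N_FEATURES], rcond=None)
    # Reconstruct the hidden intent columns from that latent vector.
    scores = col_factors[N_FEATURES:] @ u
    return INTENTS[int(np.argmax(scores))], scores


# Low-rank model: words and app history share one latent space.
V = fit_column_factors(TRAIN, rank=3)
print(predict_intent(V, ["send", "photo"], "camera"))
```

The same words can lead to different scores depending on the `prev_app` feature, which is the kind of user-specific behavioral cue the abstract describes. A ranking-based objective over implicit feedback would be a natural replacement for the SVD step in a more realistic setup.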

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

ICMI '15: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction

November 2015

678 pages

ISBN:9781450339124

DOI:10.1145/2818346

General Chairs: Zhengyou Zhang (Microsoft Research, USA), Phil Cohen (VoiceBox Technologies, USA)

Program Chairs: Dan Bohus (Microsoft Research, USA), Radu Horaud (INRIA Grenoble Rhone-Alpes, France), Helen Meng (Chinese University of Hong Kong, China)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMI '15

Sponsor:

SIGCHI

ICMI '15: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION

November 9 - 13, 2015

Seattle, Washington, USA

Acceptance Rates

ICMI '15 Paper Acceptance Rate 52 of 127 submissions, 41%;

Overall Acceptance Rate 453 of 1,080 submissions, 42%
