DOI: 10.1145/2856767.2856818
short-paper

An Intelligent Assistant for High-Level Task Understanding

Published: 07 March 2016

Abstract

People can interact with domain-specific intelligent assistants (IAs) to get help with tasks. Sometimes, however, user goals are complex and require interactions with multiple applications. Current IAs are limited to specific applications, so users must directly manage execution across multiple applications in order to engage in more complex activities. An ideal personal agent would be able to learn, over time, about tasks that span different resources. This paper addresses the problem of cross-domain task assistance in the context of spoken dialogue systems. We propose approaches to discover users' high-level intentions and to use this information to assist users with their tasks. We collected real-life smartphone usage data from 14 participants and investigated how to extract high-level intents from users' descriptions of their activities. Our experiments show that understanding high-level tasks allows the agent to actively suggest apps relevant to particular user goals and to reduce the cost of users' self-management.

Published In

IUI '16: Proceedings of the 21st International Conference on Intelligent User Interfaces
March 2016
446 pages
ISBN:9781450341370
DOI:10.1145/2856767
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. language understanding
  2. multi-domain
  3. spoken dialog system (SDS)
  4. user intention

Qualifiers

  • Short-paper

Conference

IUI '16

Acceptance Rates

IUI '16 Paper Acceptance Rate 49 of 194 submissions, 25%;
Overall Acceptance Rate 746 of 2,811 submissions, 27%
