short-paper

An Intelligent Assistant for High-Level Task Understanding

Authors:

Alexander I. Rudnicky

IUI '16: Proceedings of the 21st International Conference on Intelligent User Interfaces

Pages 169 - 174

https://doi.org/10.1145/2856767.2856818

Published: 07 March 2016

Abstract

Domain-specific intelligent assistants (IAs) help people carry out tasks, but user goals are sometimes complex and require interaction with multiple applications. Current IAs are limited to specific applications, so users must manage execution across applications themselves in order to engage in more complex activities. An ideal personal agent would learn, over time, about tasks that span different resources. This paper addresses the problem of cross-domain task assistance in the context of spoken dialogue systems. We propose approaches to discover users' high-level intentions and to use this information to assist users in their tasks. We collected real-life smartphone usage data from 14 participants and investigated how to extract high-level intents from users' descriptions of their activities. Our experiments show that understanding high-level tasks allows the agent to actively suggest apps relevant to particular user goals and to reduce the cost of users' self-management.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Intelligent agents
2. Information systems
  1. Information systems applications
    1. Decision support systems
      1. Expert systems

Published In

IUI '16: Proceedings of the 21st International Conference on Intelligent User Interfaces

March 2016

446 pages

ISBN: 9781450341370

DOI: 10.1145/2856767

General Chairs: Jeffrey Nichols (Google Inc, USA), Jalal Mahmud (IBM Research, USA), John O'Donovan (UC Santa Barbara, USA)

Program Chairs: Cristina Conati (University of British Columbia, Canada), Massimo Zancanaro (Bruno Kessler Foundation, Italy)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

IUI'16

IUI'16: 21st International Conference on Intelligent User Interfaces

March 7 - 10, 2016

Sonoma, California, USA

Acceptance Rates

IUI '16 Paper Acceptance Rate 49 of 194 submissions, 25%;

Overall Acceptance Rate 746 of 2,811 submissions, 27%
