research-article

SUGILITE: Creating Multimodal Smartphone Automation by Demonstration

Authors:

Toby Jia-Jun Li,

Brad A. Myers

CHI '17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems

Pages 6038 - 6049

https://doi.org/10.1145/3025453.3025483

Published: 02 May 2017

Abstract

SUGILITE is a new programming-by-demonstration (PBD) system that enables users to create automation on smartphones. SUGILITE uses Android's accessibility API to support automating arbitrary tasks in any Android app (or even across multiple apps). When the user gives verbal commands that SUGILITE does not know how to execute, the user can demonstrate by directly manipulating the regular apps' user interface. By leveraging the verbal instructions, the demonstrated procedures, and the apps' UI hierarchy structures, SUGILITE can automatically generalize the script from the recorded actions, so SUGILITE learns how to perform tasks with different variations and parameters from a single demonstration. Extensive error handling and context checking support forking the script when new situations are encountered, and provide robustness if the apps change their user interface. Our lab study suggests that users with little or no programming knowledge can successfully automate smartphone tasks using SUGILITE.
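The generalization step described in the abstract can be illustrated with a minimal sketch. It is not SUGILITE's implementation: the `Action` record, the `generalize` and `instantiate` helpers, the word-matching heuristic, and the coffee-ordering demonstration are all hypothetical, chosen only to show how a label that appears in both the verbal command and a recorded action might become a parameter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """One recorded UI action (hypothetical format, not SUGILITE's)."""
    kind: str                    # e.g. "click"
    target_text: str             # text label of the UI element acted on
    param: Optional[str] = None  # slot name when the label is a parameter


def generalize(command: str, demo: list[Action]) -> list[Action]:
    """Mark actions whose target label also appears as a word in the command."""
    words = command.lower().split()
    script = []
    for i, action in enumerate(demo):
        if action.target_text.lower() in words:
            # Assumed heuristic: a label spoken in the command is a parameter.
            script.append(Action(action.kind, action.target_text, param=f"p{i}"))
        else:
            script.append(action)
    return script


def instantiate(script: list[Action], values: dict[str, str]) -> list[Action]:
    """Fill parameter slots, falling back to the demonstrated label."""
    return [
        Action(a.kind, values.get(a.param, a.target_text)) if a.param else a
        for a in script
    ]


# A single hypothetical demonstration for the command "order a cappuccino".
demo = [
    Action("click", "Coffee Shop"),
    Action("click", "Menu"),
    Action("click", "Cappuccino"),
    Action("click", "Checkout"),
]
script = generalize("order a cappuccino", demo)
replay = instantiate(script, {"p2": "Latte"})
print([a.target_text for a in replay])
```

In this sketch only the third action is parameterized, so replaying the script with a different value reuses every other demonstrated step unchanged. A real system would also need the apps' UI hierarchy to locate the alternative item on screen, which this example leaves out.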

Supplementary Material

suppl.mov (pn1153.mp4): Supplemental video, 82.75 MB

suppl.mov (pn1153p.mp4): Supplemental video, 4.77 MB

Index Terms

SUGILITE: Creating Multimodal Smartphone Automation by Demonstration
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction paradigms

Information & Contributors

Information

Published In

CHI '17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems

May 2017

7138 pages

ISBN:9781450346559

DOI:10.1145/3025453

General Chairs: Gloria Mark (University of California Irvine), Susan Fussell (Cornell University)

Program Chairs: Cliff Lampe (University of Michigan), m.c. schraefel (University of Southampton), Juan Pablo Hourcade (University of Iowa), Caroline Appert (Université Paris-Sud), Daniel Wigdor (University of Toronto)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Honorable Mention

Author Tags

Qualifiers

Research-article

Funding Sources

Samsung
Yahoo!

Conference

CHI '17

Sponsor:

SIGCHI

CHI '17: CHI Conference on Human Factors in Computing Systems

May 6 - 11, 2017

Denver, Colorado, USA

Acceptance Rates

CHI '17 Paper Acceptance Rate 600 of 2,400 submissions, 25%;

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics

Total Citations: 90

Total Downloads: 1,095

Downloads (Last 12 months): 177

Downloads (Last 6 weeks): 33

Reflects downloads up to 13 Feb 2025

Cited By

Tian Y, Kummerfeld J, Li T, Zhang T (2024). SQLucid: Grounding Natural Language Database Queries with Interactive Explanations. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-20. Online publication date: 13-Oct-2024.
https://dl.acm.org/doi/10.1145/3654777.3676368

Ning Z, Tian Y, Zhang Z, Zhang T, Li T (2024). Insights into Natural Language Database Query Errors: from Attention Misalignment to User Handling Strategies. ACM Transactions on Interactive Intelligent Systems 14:4, 1-32. Online publication date: 2-Mar-2024.
https://dl.acm.org/doi/10.1145/3650114

Lu Y, Zhang C, Yang Y, Yao Y, Li T (2024). From Awareness to Action: Exploring End-User Empowerment Interventions for Dark Patterns in UX. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1, 1-41. Online publication date: 26-Apr-2024.
https://dl.acm.org/doi/10.1145/3637336

Lee S, Choi J, Lee J, Wasi M, Choi H, Ko S, Oh S, Shin I, Ganesan D, Lane N, Shi W (2024). MobileGPT: Augmenting LLM with Human-like App Memory for Mobile Task Automation. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 1119-1133. Online publication date: 4-Dec-2024.
https://dl.acm.org/doi/10.1145/3636534.3690682

Hur J, Cunningham K (2024). Profiling Conversational Programmers at University: Insights into their Motivations and Goals from a Broad Sample of Non-Majors. Proceedings of the 2024 ACM Conference on International Computing Education Research - Volume 1, 293-311. Online publication date: 12-Aug-2024.
https://dl.acm.org/doi/10.1145/3632620.3671123

Jiang Y (2024). Computational Representations for Graphical User Interfaces. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-6. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613905.3638191

Jiang Y, Lu Y, Knearem T, Kliman-Silver C, Lutteroth C, Li T, Nichols J, Stuerzlinger W (2024). Computational Methodologies for Understanding, Automating, and Evaluating User Interfaces. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-7. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613905.3636316

Jiang Y, Zhou C, Garg V, Oulasvirta A (2024). Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642822

Ning Z, Wimer B, Jiang K, Chen K, Ban J, Tian Y, Zhao Y, Li T (2024). SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642632

Cho H, Yan Y, Todi K, Parent M, Smith M, Jonker T, Benko H, Lindlbauer D (2024). MineXR: Mining Personalized Extended Reality Interfaces. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-17. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642394
