DOI: 10.1145/3025453.3025483

SUGILITE: Creating Multimodal Smartphone Automation by Demonstration

Published: 02 May 2017

Abstract

SUGILITE is a new programming-by-demonstration (PBD) system that enables users to create automation on smartphones. SUGILITE uses Android's accessibility API to support automating arbitrary tasks in any Android app (or even across multiple apps). When the user gives verbal commands that SUGILITE does not know how to execute, the user can demonstrate by directly manipulating the regular apps' user interface. By leveraging the verbal instructions, the demonstrated procedures, and the apps' UI hierarchy structures, SUGILITE can automatically generalize the script from the recorded actions, so SUGILITE learns how to perform tasks with different variations and parameters from a single demonstration. Extensive error handling and context checking support forking the script when new situations are encountered, and provide robustness if the apps change their user interface. Our lab study suggests that users with little or no programming knowledge can successfully automate smartphone tasks using SUGILITE.
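
The generalization idea described in the abstract can be illustrated with a small sketch. This is not SUGILITE's implementation: the RecordedAction type, the generalize and instantiate functions, and the rule "a UI label that also appears in the verbal command becomes a parameter" are all assumptions made for illustration, and the real system works on Android UI hierarchies rather than plain strings.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RecordedAction:
    """One demonstrated step (hypothetical representation)."""
    operation: str                    # e.g. "click"
    target_text: str                  # text label of the UI element acted on
    parameter: Optional[str] = None   # set when the label has been generalized


def generalize(command: str, actions: list[RecordedAction]) -> tuple[str, list[RecordedAction]]:
    """Replace labels that also occur in the verbal command with named parameters."""
    template = command
    generalized = []
    next_id = 1
    for action in actions:
        label = action.target_text
        # Case-insensitive search for the UI label inside the command text.
        pos = template.lower().find(label.lower()) if label else -1
        if pos != -1:
            name = f"param{next_id}"
            next_id += 1
            template = template[:pos] + "{" + name + "}" + template[pos + len(label):]
            action = replace(action, parameter=name)
        generalized.append(action)
    return template, generalized


def instantiate(actions: list[RecordedAction], bindings: dict[str, str]) -> list[RecordedAction]:
    """Produce a concrete script by filling parameters with new values."""
    concrete = []
    for action in actions:
        if action.parameter is not None:
            action = replace(action, target_text=bindings[action.parameter], parameter=None)
        concrete.append(action)
    return concrete


demo = [
    RecordedAction("click", "Menu"),
    RecordedAction("click", "Cappuccino"),
    RecordedAction("click", "Add to order"),
]
template, script = generalize("Order a Cappuccino", demo)
replayed = instantiate(script, {"param1": "Iced Latte"})
```

In this sketch a single demonstration of "Order a Cappuccino" yields a reusable template in which only the step that clicked the matching label is parameterized; the other steps are replayed unchanged.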

Supplementary Material

suppl.mov (pn1153.mp4)
Supplemental video
suppl.mov (pn1153p.mp4)
Supplemental video

    Published In

    CHI '17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems
    May 2017
    7138 pages
    ISBN:9781450346559
    DOI:10.1145/3025453
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Honorable Mention

    Author Tags

    1. end-user development
    2. programming by demonstration
    3. smartphone automation

    Qualifiers

    • Research-article

    Funding Sources

    • Samsung
    • Yahoo!

    Conference

    CHI '17

    Acceptance Rates

    CHI '17 Paper Acceptance Rate 600 of 2,400 submissions, 25%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Cited By

    • (2024) SQLucid: Grounding Natural Language Database Queries with Interactive Explanations. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-20. DOI: 10.1145/3654777.3676368. Online publication date: 13-Oct-2024
    • (2024) Insights into Natural Language Database Query Errors: from Attention Misalignment to User Handling Strategies. ACM Transactions on Interactive Intelligent Systems 14:4, 1-32. DOI: 10.1145/3650114. Online publication date: 2-Mar-2024
    • (2024) From Awareness to Action: Exploring End-User Empowerment Interventions for Dark Patterns in UX. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1, 1-41. DOI: 10.1145/3637336. Online publication date: 26-Apr-2024
    • (2024) MobileGPT: Augmenting LLM with Human-like App Memory for Mobile Task Automation. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 1119-1133. DOI: 10.1145/3636534.3690682. Online publication date: 4-Dec-2024
    • (2024) Profiling Conversational Programmers at University: Insights into their Motivations and Goals from a Broad Sample of Non-Majors. Proceedings of the 2024 ACM Conference on International Computing Education Research - Volume 1, 293-311. DOI: 10.1145/3632620.3671123. Online publication date: 12-Aug-2024
    • (2024) Computational Representations for Graphical User Interfaces. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-6. DOI: 10.1145/3613905.3638191. Online publication date: 11-May-2024
    • (2024) Computational Methodologies for Understanding, Automating, and Evaluating User Interfaces. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-7. DOI: 10.1145/3613905.3636316. Online publication date: 11-May-2024
    • (2024) Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642822. Online publication date: 11-May-2024
    • (2024) SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642632. Online publication date: 11-May-2024
    • (2024) MineXR: Mining Personalized Extended Reality Interfaces. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3613904.3642394. Online publication date: 11-May-2024
