DOI: 10.1145/3173574.3173859

Leveraging Community-Generated Videos and Command Logs to Classify and Recommend Software Workflows

Published: 21 April 2018


Users of complex software applications often rely on inefficient or suboptimal workflows because they are not aware that better methods exist. In this paper, we develop and validate a hierarchical approach combining topic modeling and frequent pattern mining to classify the workflows offered by an application, based on a corpus of community-generated videos and command logs. We then propose and evaluate a design space of four different workflow recommender algorithms, which can be used to recommend new workflows and their associated videos to software users. An expert validation of the task classification approach found that 82% of the time, experts agreed with the classifications. We also evaluate our workflow recommender algorithms, demonstrating their potential and suggesting avenues for future work.
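
As a rough illustration of the two-level idea described in the abstract, the following Python sketch first groups command logs by a coarse task label and then mines frequent contiguous command sequences within each group. It is not the paper's implementation: a toy keyword lookup stands in for topic modeling, simple n-gram counting stands in for frequent pattern mining, and the command names, the `TOPIC_KEYWORDS` table, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical command logs: one list of command names per video or session.
LOGS = [
    ["sketch", "line", "extrude", "fillet"],
    ["sketch", "circle", "extrude", "fillet"],
    ["sketch", "line", "extrude", "chamfer"],
    ["render", "material", "render_settings"],
    ["material", "render", "render_settings"],
]

# Toy stand-in for a topic model: assumed keyword sets per coarse task.
TOPIC_KEYWORDS = {
    "modeling": {"sketch", "line", "circle", "extrude", "fillet", "chamfer"},
    "rendering": {"render", "material", "render_settings"},
}


def assign_topic(log):
    """Label a log with the topic whose keyword set covers most of its commands."""
    return max(TOPIC_KEYWORDS, key=lambda t: sum(c in TOPIC_KEYWORDS[t] for c in log))


def frequent_ngrams(logs, n, min_support):
    """Return contiguous command n-grams that occur in at least min_support logs."""
    support = Counter()
    for log in logs:
        # Use a set so each n-gram is counted at most once per log.
        grams = {tuple(log[i:i + n]) for i in range(len(log) - n + 1)}
        support.update(grams)
    return {g: s for g, s in support.items() if s >= min_support}


def classify_workflows(logs, n=2, min_support=2):
    """Level 1: group logs by topic. Level 2: mine frequent patterns per group."""
    by_topic = defaultdict(list)
    for log in logs:
        by_topic[assign_topic(log)].append(log)
    return {t: frequent_ngrams(group, n, min_support) for t, group in by_topic.items()}


workflows = classify_workflows(LOGS)
```

With this toy data, the "modeling" group yields three command pairs that each appear in two logs, while the "rendering" group yields none because its two logs order their commands differently. A real system would replace both stand-ins with a trained topic model and a dedicated pattern-mining library.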









    Published In

    CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems
    April 2018
    8489 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



    Association for Computing Machinery

    New York, NY, United States




    Author Tags

    1. application logs
    2. community-generated videos
    3. software learning
    4. topic modeling
    5. workflow recommendation


    • Research-article


    CHI '18

    Acceptance Rates

    CHI '18 Paper Acceptance Rate 666 of 2,590 submissions, 26%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%





    Article Metrics

    • Downloads (last 12 months): 22
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 03 Mar 2025



    Cited By

    • (2024) Tutorial mismatches: investigating the frictions due to interface differences when following software video tutorials. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1942-1955. DOI: 10.1145/3643834.3661511. Online publication date: 1-Jul-2024.
    • (2024) The Impact of Sketch-guided vs. Prompt-guided 3D Generative AIs on the Design Exploration Process. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642218. Online publication date: 11-May-2024.
    • (2023) Beyond Instructions: A Taxonomy of Information Types in How-to Videos. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-21. DOI: 10.1145/3544548.3581126. Online publication date: 19-Apr-2023.
    • (2023) Identifying Multimodal Context Awareness Requirements for Supporting User Interaction with Procedural Videos. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3544548.3581006. Online publication date: 19-Apr-2023.
    • (2022) PONI: A Personalized Onboarding Interface for Getting Inspiration and Learning About AR/VR Creation. Nordic Human-Computer Interaction Conference, 1-14. DOI: 10.1145/3546155.3546642. Online publication date: 8-Oct-2022.
    • (2022) SoftVideo: Improving the Learning Experience of Software Tutorial Videos with Collective Interaction Data. Proceedings of the 27th International Conference on Intelligent User Interfaces, 646-660. DOI: 10.1145/3490099.3511106. Online publication date: 22-Mar-2022.
    • (2022) SimCURL: Simple Contrastive User Representation Learning from Command Sequences. 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 1143-1150. DOI: 10.1109/ICMLA55696.2022.00186. Online publication date: Dec-2022.
    • (2021) HelpViz: Automatic Generation of Contextual Visual Mobile Tutorials from Text-Based Instructions. The 34th Annual ACM Symposium on User Interface Software and Technology, 1144-1153. DOI: 10.1145/3472749.3474812. Online publication date: 10-Oct-2021.
    • (2021) RubySlippers: Supporting Content-based Voice Navigation for How-to Videos. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-14. DOI: 10.1145/3411764.3445131. Online publication date: 6-May-2021.
    • (2020) Goal-driven Command Recommendations for Analysts. Proceedings of the 14th ACM Conference on Recommender Systems, 160-169. DOI: 10.1145/3383313.3412255. Online publication date: 22-Sep-2020.
