DOI: 10.1145/2047196.2047213

Pause-and-play: automatically linking screencast video tutorials with applications

Published: 16 October 2011

Abstract

Video tutorials provide a convenient means for novices to learn new software applications. Unfortunately, staying in sync with a video while trying to use the target application at the same time requires users to repeatedly switch from the application to the video to pause or scrub backwards to replay missed steps. We present Pause-and-Play, a system that helps users work along with existing video tutorials. Pause-and-Play detects important events in the video and links them with corresponding events in the target application as the user tries to replicate the depicted procedure. This linking allows our system to automatically pause and play the video to stay in sync with the user. Pause-and-Play also supports convenient video navigation controls that are accessible from within the target application and allow the user to easily replay portions of the video without switching focus out of the application. Finally, since our system uses computer vision to detect events in existing videos and leverages application scripting APIs to obtain real time usage traces, our approach is largely independent of the specific target application and does not require access or modifications to application source code. We have implemented Pause-and-Play for two target applications, Google SketchUp and Adobe Photoshop, and we report on a user study that shows our system improves the user experience of working with video tutorials.
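
The linking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that the events detected in the video are available as timestamped labels and that the application's usage trace reports events with the same labels. The names `VideoEvent`, `SyncController`, `on_video_tick`, and `on_app_event`, as well as the event labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoEvent:
    time: float  # seconds into the video where the event was detected
    name: str    # hypothetical event label, e.g. a tool name


class SyncController:
    """Pauses the video at the first event the user has not yet reproduced."""

    def __init__(self, video_events):
        # Keep events in playback order regardless of input order.
        self.events = sorted(video_events, key=lambda e: e.time)
        self.matched = 0     # number of video events the user has reproduced
        self.playing = True

    def on_video_tick(self, t):
        """Called as playback advances; pause if the video gets ahead of the user."""
        if self.matched < len(self.events) and t >= self.events[self.matched].time:
            self.playing = False
        return self.playing

    def on_app_event(self, name):
        """Called for each event in the application's usage trace."""
        if self.matched < len(self.events) and name == self.events[self.matched].name:
            self.matched += 1
            self.playing = True  # the user caught up, so resume playback
        return self.playing


# Usage with made-up event labels and timestamps.
ctrl = SyncController([VideoEvent(12.0, "rectangle"), VideoEvent(30.5, "push_pull")])
ctrl.on_video_tick(12.0)        # video reaches an unreproduced event: pauses
ctrl.on_app_event("rectangle")  # user performs the matching step: resumes
```

The sketch matches events strictly in order and ignores unrelated application events; a real system would also have to handle skipped steps, repeated steps, and noisy detections.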

Supplementary Material

JPG File (fp170.jpg)
MP4 File (fp170.mp4)

    Published In

    UIST '11: Proceedings of the 24th annual ACM symposium on User interface software and technology
    October 2011
    654 pages
    ISBN:9781450307161
    DOI:10.1145/2047196
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. instructions
    2. screencast
    3. video tutorial

    Qualifiers

    • Research-article

    Conference

    UIST '11

    Acceptance Rates

    UIST '11 Paper Acceptance Rate 67 of 262 submissions, 26%;
    Overall Acceptance Rate 561 of 2,567 submissions, 22%

    Article Metrics

    • Downloads (Last 12 months): 51
    • Downloads (Last 6 weeks): 5
    Reflects downloads up to 17 Feb 2025

    Cited By

    • (2024) Optimizing OCR Performance for Programming Videos: The Role of Image Super-Resolution and Large Language Models. Mathematics 12:7 (1036). DOI: 10.3390/math12071036. Online publication date: 30-Mar-2024
    • (2024) EasyAsk: An In-App Contextual Tutorial Search Assistant for Older Adults with Voice and Touch Inputs. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:3 (1-27). DOI: 10.1145/3678516. Online publication date: 9-Sep-2024
    • (2024) Improving Video Navigation for Spatial Task Tutorials by Spatially Segmenting and Situating How-To Videos. Proceedings of the 2024 ACM Symposium on Spatial User Interaction (1-13). DOI: 10.1145/3677386.3682103. Online publication date: 7-Oct-2024
    • (2024) Tutorial mismatches: investigating the frictions due to interface differences when following software video tutorials. Proceedings of the 2024 ACM Designing Interactive Systems Conference (1942-1955). DOI: 10.1145/3643834.3661511. Online publication date: 1-Jul-2024
    • (2024) Temaneki: Map-Based Collaboration Tool for Consensus-Building in Student-Run Festival Management Teams. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (1-8). DOI: 10.1145/3613905.3651013. Online publication date: 11-May-2024
    • (2024) AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-19). DOI: 10.1145/3613904.3642752. Online publication date: 11-May-2024
    • (2024) SwapVid: Integrating Video Viewing and Document Exploration with Direct Manipulation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-13). DOI: 10.1145/3613904.3642515. Online publication date: 11-May-2024
    • (2024) Progress Observation in Augmented Reality Assembly Tutorials Using Dynamic Hand Gesture Recognition. 2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (957-958). DOI: 10.1109/VRW62533.2024.00275. Online publication date: 16-Mar-2024
    • (2024) Investigating Developers' Preferences for Learning and Issue Resolution Resources in the ChatGPT Era. 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (413-425). DOI: 10.1109/ICSME58944.2024.00045. Online publication date: 6-Oct-2024
    • (2023) The state of art and review on video streaming. Journal of High Speed Networks 29:3 (211-236). DOI: 10.3233/JHS-222087. Online publication date: 1-Jan-2023
