demonstration

Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai

Authors: Na Li, Ping Yu, Alex Olwal

CHI EA '23: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

Article No.: 448, Pages 1 - 4

https://doi.org/10.1145/3544549.3583925

Published: 19 April 2023

Abstract

We demonstrate Rapsai, a visual programming platform that aims to streamline the rapid and iterative development of end-to-end machine learning (ML)-based multimedia applications. Rapsai features a node-graph editor that enables interactive characterization and visualization of ML model performance, which facilitates the understanding of how the model behaves in different scenarios. Moreover, the platform streamlines end-to-end prototyping by providing interactive data augmentation and model comparison capabilities within a no-coding environment. Our demonstration showcases the versatility of Rapsai through several use cases, including virtual background, visual effects with depth estimation, and audio denoising. The implementation of Rapsai is intended to support ML practitioners in streamlining their workflow, making data-driven decisions, and comprehensively evaluating model behavior with real-world input.

Supplementary Material

VTT File (3544549.3583925-preview.vtt), 0.67 KB

VTT File (3544549.3583925-walkthrough.vtt), 3.98 KB

MP4 File (3544549.3583925-walkthrough.mp4), Walkthrough Video, 145.16 MB

MP4 File (3544549.3583925-preview.mp4), Video Preview, 17.60 MB

Index Terms

Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai
1. Computing methodologies
  1. Machine learning
  2. Modeling and simulation
    1. Simulation types and techniques
      1. Visual analytics
2. Software and its engineering
  1. Software notations and tools
    1. Context specific languages
      1. Visual languages

Information

Published In

CHI EA '23: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

April 2023

3914 pages

ISBN: 9781450394222

DOI: 10.1145/3544549

Editors: Albrecht Schmidt (LMU Munich, Germany), Kaisa Väänänen (Tampere University, Finland), Tesh Goyal (Google Research, USA), Per Ola Kristensson (University of Cambridge, UK), Anicia Peters (University of Namibia, Namibia)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Demonstration
Research
Refereed limited

Conference

CHI '23

Sponsor:

SIGCHI

CHI '23: CHI Conference on Human Factors in Computing Systems

April 23 - 28, 2023

Hamburg, Germany

Acceptance Rates

Overall Acceptance Rate 6,164 of 23,696 submissions, 26%

Bibliometrics & Citations

Total Citations: 0

Total Downloads: 154

Downloads (Last 12 months): 92

Downloads (Last 6 weeks): 38

Reflects downloads up to 07 Mar 2025
