DOI: 10.1145/3543873.3587330
demonstration

Weedle: Composable Dashboard for Data-Centric NLP in Computational Notebooks

Published: 30 April 2023

ABSTRACT

Data-centric NLP is a highly iterative process that requires careful exploration of text data throughout the entire model development lifecycle. Unfortunately, existing data exploration tools are poorly suited to data-centric NLP because of workflow discontinuity and a lack of support for unstructured text. In response, we propose Weedle, a seamless and customizable exploratory text analysis system for data-centric NLP. Weedle is equipped with built-in text transformation operations and a suite of visual analysis features. With its widget, users can compose customizable dashboards interactively and programmatically in computational notebooks.


Supplemental Material

Weedle Demo Video.mp4 (MP4, 50.3 MB)

Published in

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
April 2023
1567 pages
ISBN: 9781450394192
DOI: 10.1145/3543873

Copyright © 2023 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• demonstration
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
