DOI: 10.1145/3613905.3650921
Work in Progress

Chart What I Say: Exploring Cross-Modality Prompt Alignment in AI-Assisted Chart Authoring

Published: 11 May 2024

Abstract

Recent chart-authoring systems, such as Amazon Q in QuickSight and Copilot for Power BI, demonstrate an emergent focus on supporting natural language input for creating charts that share meaningful insights from data. Currently, chart-authoring systems tend to integrate voice input by relying on speech-to-text transcription, processing spoken input the same way as typed input. However, cross-modality input comparisons in other interaction domains suggest that the structure of spoken and typed interactions can differ notably, reflecting variations in user expectations shaped by interface affordances. In this work, we therefore compare spoken and typed instructions for chart creation. Our findings suggest that while both text and voice instructions cover chart elements and element organization, voice descriptions exhibit a greater variety of command formats, element characteristics, and complex linguistic features. Based on these findings, we developed guidelines for designing voice-based authoring systems, along with additional features that can be incorporated into existing text-based systems to support the speech modality.
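
The transcription-only pipeline the abstract critiques is easy to sketch. Below is a minimal, hypothetical Python illustration (not the authors' system) of the alternative the findings point toward: a spoken prompt is lightly normalized, stripping disfluencies and conversational command framing, before it reaches whatever downstream chart-spec generator a system uses. All patterns, rules, and function names here are invented for illustration.

    import re

    # Illustrative, naive disfluency patterns. The paper's empirically derived
    # categories (command formats, element characteristics, complex linguistic
    # features) would motivate a richer, data-driven rule set.
    FILLERS = re.compile(r"\b(?:um+|uh+|you know)\b,?\s*|\blike,\s*", re.IGNORECASE)

    # Map a few conversational openers common in speech onto the terser
    # imperative style typical of typed prompts (hypothetical rules).
    SPOKEN_OPENERS = [
        (re.compile(r"^(?:could|can|would) you (?:please,?\s*)?", re.IGNORECASE), ""),
        (re.compile(r"^i(?:'d| would) like (?:you )?to see\s+", re.IGNORECASE), "show "),
    ]

    def normalize_spoken_prompt(transcript: str) -> str:
        """Rewrite a speech-to-text transcript toward typed-prompt style,
        rather than passing it downstream as if it had been typed."""
        text = FILLERS.sub("", transcript).strip()
        for pattern, replacement in SPOKEN_OPENERS:
            text = pattern.sub(replacement, text)
        return text.strip().rstrip(".?!") or transcript

    def build_chart_request(prompt: str, modality: str) -> str:
        # Route by input modality instead of collapsing both into one path.
        return normalize_spoken_prompt(prompt) if modality == "voice" else prompt

    if __name__ == "__main__":
        spoken = "Um, could you please, like, show me sales by region as a bar chart?"
        print(build_chart_request(spoken, modality="voice"))
        # -> show me sales by region as a bar chart

In a deployed system, such hand-written rules would presumably give way to the guidelines the paper derives, or be folded into the prompt of an LLM-based generator such as those the abstract names.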

Supplemental Material

MP4 File: Talk video (with transcript)



    Published In

    CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems
    May 2024, 4761 pages
    ISBN: 9798400703317
    DOI: 10.1145/3613905

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. data visualization
    2. natural language corpus
    3. natural language interface
    4. visualization authoring
    5. visualization specification
    6. voice interface

    Qualifiers

    • Work in progress
    • Research
    • Refereed limited

    Conference

    CHI '24

    Acceptance Rates

    Overall acceptance rate: 6,164 of 23,696 submissions (26%)


