ABSTRACT
While natural language interfaces (NLIs) are increasingly used to simplify interaction with data visualization tools, improving and adapting an NLI to the individual needs of users still requires developer support. ONYX introduces an approach based on interactive task learning (ITL) that enables NLIs to learn from users through natural interactions. Users can personalize the NLI with new commands using direct manipulation, already known commands, or a combination of both. To further support users during the teaching process, we derived two design goals for the user interface, namely providing suggestions based on sub-parts of a command and resolving ambiguities through follow-up questions, and instantiated both in ONYX. To trigger reflection and gather feedback on possible design trade-offs of ONYX and the instantiated design goals, we conducted a formative user study to understand how to successfully integrate the suggestions and follow-up questions into the interaction.
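To make the described teaching flow concrete, below is a minimal, hypothetical Python sketch of the two instantiated design goals: matching a new utterance against sub-parts of known commands to generate suggestions, and asking a follow-up question when the match is ambiguous. This is not ONYX's implementation; all names (`CommandTemplate`, `suggest`, `teach`) and the keyword-overlap heuristic are illustrative assumptions.

```python
# Hypothetical sketch of an ITL-style teaching loop for an NLI.
# The NLI suggests known commands whose sub-parts overlap a new utterance,
# and resolves ambiguity with a follow-up question rather than guessing.

from dataclasses import dataclass


@dataclass
class CommandTemplate:
    name: str           # e.g. "filter"
    keywords: set[str]  # sub-parts of the command the NLI already knows


KNOWN_COMMANDS = [
    CommandTemplate("filter", {"filter", "only", "show"}),
    CommandTemplate("highlight", {"highlight", "mark", "show"}),
]


def suggest(utterance: str) -> list[str]:
    """Suggest known commands whose sub-parts appear in the new utterance."""
    tokens = set(utterance.lower().split())
    return [c.name for c in KNOWN_COMMANDS if c.keywords & tokens]


def teach(utterance: str) -> str:
    matches = suggest(utterance)
    if len(matches) == 1:
        return f"Mapped to known command: {matches[0]}"
    if len(matches) > 1:
        # Ambiguity: ask a follow-up question instead of picking silently.
        return f"Did you mean one of {matches}?"
    # No known sub-part matched: fall back to direct manipulation.
    return "Unknown command - please demonstrate it by direct manipulation."


print(teach("filter to sales in Europe"))   # one match -> "filter"
print(teach("show the outliers"))           # two matches -> follow-up question
```

A real system would of course match over parsed command structure rather than bag-of-words keywords, but the sketch captures the interaction pattern: suggest from partial matches, ask when ambiguous, and demonstrate when unknown.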