DOI: 10.1145/3629606.3629672
research-article

From Lab to Virtual: Comparing Real and AI-Generated User Interviews in Home Appliance Evaluation.

Published: 27 February 2024

ABSTRACT

This study examines the use of conversational AI, specifically ChatGPT, in household appliance evaluation interviews and how its responses differ from real user behaviour. Three experimental settings are compared (real researcher with real user, real researcher with simulated user, and simulated researcher with simulated user), revealing how ChatGPT-simulated users and real users differ in specific evaluation scenarios, especially in the evaluation of product appearance, GUI, and PUI. Simulated users agreed with real users when evaluating the core features of smart appliances, but they showed limitations in aspects that depend on practical experience, and System Usability Scale (SUS), learnability, and usability scores differed significantly across the experimental settings. The study also explores the advantages and disadvantages of incorporating simulated users into the product evaluation process and concludes that the approach, although challenging, is innovative and shows considerable potential for future product evaluation.

    • Published in

      CHCHI '23: Proceedings of the Eleventh International Symposium of Chinese CHI
      November 2023
      634 pages
      ISBN: 9798400716454
      DOI: 10.1145/3629606

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 17 of 40 submissions, 43%