DOI: 10.1145/3626772.3661362
Short paper

Reflections on the Coding Ability of LLMs for Analyzing Market Research Surveys

Published: 11 July 2024

Abstract

The remarkable success of large language models (LLMs) has drawn great interest in their deployment in specific domains and downstream applications. In this paper, we present the first systematic study of applying large language models (in our case, GPT-3.5 and GPT-4) to the automatic coding (multi-class classification) problem in market research. Our experimental results show that large language models achieve a macro F1 score of over 0.5 on all of our collected real-world market research datasets in a zero-shot setting. We also provide in-depth analyses of the errors made by the large language models. We hope this study sheds light on the lessons learned and the open challenges that large language models face when adapted to the market research domain.
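The headline metric above is the macro-averaged F1 score, which computes F1 per code (class) and then takes the unweighted mean, so rare codes count as much as frequent ones. As a minimal illustration, here is a pure-Python sketch of macro F1 over a hypothetical survey-coding task; the example codes ("price", "quality", "service") and predictions are illustrative only and are not drawn from the paper's datasets:

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1, then the unweighted mean over classes."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    f1s = []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical gold codes and zero-shot LLM predictions for five responses
gold = ["price", "quality", "price", "service", "quality"]
pred = ["price", "quality", "service", "service", "price"]
print(round(macro_f1(gold, pred), 3))  # → 0.611
```

Because macro averaging weights each class equally, a model that ignores minority codes is penalized even if overall accuracy stays high, which is why a macro F1 above 0.5 in a zero-shot setting is a meaningful bar for imbalanced survey data.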


Published In

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2024, 3164 pages.
ISBN: 9798400704314
DOI: 10.1145/3626772

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. coding
    2. large language models
    3. market research surveys

    Qualifiers

    • Short-paper

Conference

SIGIR 2024

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
