Open access

Inquire: Large-scale Early Insight Discovery for Qualitative Research

Published: 25 February 2017


We introduce Inquire, a tool designed to enable qualitative exploration of utterances in social media and large-scale texts. As opposed to keyword search, Inquire lets researchers use whole sentences as queries and quickly retrieve semantically similar sentences from millions of documents. We apply Inquire to the LiveJournal (LJ) database, which contains millions of personal diaries, using semantic embeddings trained on either the LJ or the Google News (GN) dataset. We present the system design through iterative evaluations with qualitative researchers. We show how queries become a part of the inductive process, enabling researchers to try multiple ideas while gaining intuition and discovering less obvious insights. We discuss the choice of LJ as a rich source of public posts, the preference for GN embeddings, which link formal language (e.g., "reminiscence triggers") with colloquial expressions (e.g., "music brings back memories"), the interplay between tool and user, and potential qualitative and social research opportunities.
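The idea of querying by sentence rather than by keyword can be illustrated with a minimal sketch. The abstract does not say how sentence vectors are built, so the code below assumes one common approach: average the word vectors of a sentence and rank candidates by cosine similarity. The three-dimensional vectors, the `TOY_VECTORS` table, and the function names are all invented for illustration and are not taken from the paper; a real system would load trained embeddings and would need an index rather than a linear scan to handle millions of sentences.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real setup would load embeddings trained on a large corpus.
TOY_VECTORS = {
    "music": [0.9, 0.1, 0.0],
    "songs": [0.8, 0.2, 0.1],
    "memories": [0.1, 0.9, 0.0],
    "reminiscence": [0.2, 0.8, 0.1],
    "triggers": [0.6, 0.3, 0.1],
    "brings": [0.5, 0.4, 0.1],
    "back": [0.3, 0.5, 0.2],
    "homework": [0.0, 0.1, 0.9],
    "exam": [0.1, 0.0, 0.9],
    "tomorrow": [0.1, 0.1, 0.8],
}


def embed(sentence):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, corpus, top_k=2):
    """Return the top_k (score, sentence) pairs most similar to the query."""
    q = embed(query)
    if q is None:
        return []
    scored = []
    for sentence in corpus:
        v = embed(sentence)
        if v is not None:
            scored.append((cosine(q, v), sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


CORPUS = [
    "music brings back memories",
    "homework exam tomorrow",
    "songs memories",
]

# The query shares no words with any corpus sentence, so a keyword
# search would return nothing; the vector comparison still ranks them.
results = search("reminiscence triggers", CORPUS)
for score, sentence in results:
    print(f"{score:.3f}  {sentence}")
```

With these made-up vectors, the two sentences about music and memories rank above the unrelated one even though neither contains a query word, which is the behaviour the abstract contrasts with keyword search.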








    Published In

    CSCW '17: Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing
    February 2017
    2556 pages
    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.



    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. big data
    2. exploratory
    3. hypothesis formation
    4. insights
    5. keyword search
    6. large-scale data
    7. qualitative research
    8. semantic
    9. text data


    • Research-article


    CSCW '17
    CSCW '17: Computer Supported Cooperative Work and Social Computing
    February 25 - March 1, 2017
    Portland, Oregon, USA

    Acceptance Rates

    CSCW '17 Paper Acceptance Rate 183 of 530 submissions, 35%;
    Overall Acceptance Rate 2,235 of 8,521 submissions, 26%
