research-article

Identifying Informational vs. Conversational Questions on Community Question Answering Archives

Authors:

Victor Makarenkov,

Bracha Shapira

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

Pages 216 - 224

https://doi.org/10.1145/3159652.3159733

Published: 02 February 2018

Abstract

Questions on community question answering websites usually reflect one of two intents: learning information or starting a conversation. In this paper, we revisit the fundamental task of classifying questions as informational or conversational, which was originally introduced and studied in 2009. We use a substantially larger dataset of archived questions from Yahoo Answers, which includes the question's title, description, answers, and votes. We replicate the original experiments over this dataset, point out the commonalities and differences relative to the original results, and present a broad set of characteristics that distinguish the two question types. We also develop new classifiers that make use of additional data types, advanced machine learning, and a large dataset of unlabeled data, which achieve enhanced performance.

Index Terms

Identifying Informational vs. Conversational Questions on Community Question Answering Archives
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Question answering

Published In

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

February 2018

821 pages

ISBN:9781450355810

DOI:10.1145/3159652

General Chairs: Yi Chang (Jilin University, Huawei Inc.), Chengxiang Zhai (University of Illinois Urbana-Champaign)

Program Chairs: Yan Liu (University of Southern California), Yoelle Maarek (Amazon)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WSDM 2018

WSDM 2018: The Eleventh ACM International Conference on Web Search and Data Mining

February 5 - 9, 2018

Marina Del Rey, CA, USA

Acceptance Rates

WSDM '18 Paper Acceptance Rate 81 of 514 submissions, 16%;

Overall Acceptance Rate 498 of 2,863 submissions, 17%
