DOI: 10.1145/3159652.3159733
research-article

Identifying Informational vs. Conversational Questions on Community Question Answering Archives

Published: 02 February 2018

Abstract

Questions on community question answering websites usually reflect one of two intents: learning information or starting a conversation. In this paper, we revisit this fundamental classification task of informational versus conversational questions, which was originally introduced and studied in 2009. We use a substantially larger dataset of archived questions from Yahoo Answers, which includes each question's title, description, answers, and votes. We replicate the original experiments on this dataset, point out the similarities to and differences from the original results, and present a broad set of characteristics that distinguish the two question types. We also develop new classifiers that make use of additional data types, advanced machine learning, and a large dataset of unlabeled data, and that achieve enhanced performance.

Published In

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
February 2018
821 pages
ISBN: 9781450355810
DOI: 10.1145/3159652
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. community question answering
  2. label propagation
  3. long short term memory networks
  4. user intent
  5. yahoo answers

Qualifiers

  • Research-article

Conference

WSDM 2018

Acceptance Rates

WSDM '18 Paper Acceptance Rate: 81 of 514 submissions, 16%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 12
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 01 Mar 2025

Cited By

  • (2023) Improving Question Intent Identification by Exploiting Its Synergy With User Age. IEEE Access 11 (112044-112059). DOI: 10.1109/ACCESS.2023.3322457. Online publication date: 2023
  • (2023) Text-based neural networks for question intent recognition. Engineering Applications of Artificial Intelligence 121:C. DOI: 10.1016/j.engappai.2023.105933. Online publication date: 1-May-2023
  • (2023) How question type influences knowledge withholding in social Q&A community. Journal of the Association for Information Science and Technology 74:10 (1170-1184). DOI: 10.1002/asi.24813. Online publication date: 7-Sep-2023
  • (2022) A Non-Factoid Question-Answering Taxonomy. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (1196-1207). DOI: 10.1145/3477495.3531926. Online publication date: 6-Jul-2022
  • (2022) Analysis of community question-answering issues via machine learning and deep learning: State-of-the-art review. CAAI Transactions on Intelligence Technology 8:1 (95-117). DOI: 10.1049/cit2.12081. Online publication date: 4-May-2022
  • (2021) What identifies different age cohorts in Yahoo! Answers? Knowledge-Based Systems 228:C. DOI: 10.1016/j.knosys.2021.107278. Online publication date: 27-Sep-2021
  • (2020) Don't Let Me Be Misunderstood: Comparing Intentions and Perceptions in Online Discussions. Proceedings of The Web Conference 2020 (2066-2077). DOI: 10.1145/3366423.3380273. Online publication date: 20-Apr-2020
  • (2020) Knowledge map construction for question and answer archives. Expert Systems with Applications: An International Journal 141:C. DOI: 10.1016/j.eswa.2019.112923. Online publication date: 1-Mar-2020
  • (2019) Implicit dimension identification in user-generated text with LSTM networks. Information Processing and Management: an International Journal 56:5 (1880-1893). DOI: 10.1016/j.ipm.2019.02.007. Online publication date: 1-Sep-2019
  • (2018) Connecting sellers and buyers on the world's largest inventory. Proceedings of the 12th ACM Conference on Recommender Systems (490-491). DOI: 10.1145/3240323.3241733. Online publication date: 27-Sep-2018
