research-article

NCR: A Scalable Network-Based Approach to Co-Ranking in Question-and-Answer Sites

Authors:

Jingyuan Zhang,

Philip S. Yu

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

Pages 709 - 718

https://doi.org/10.1145/2661829.2661978

Published: 03 November 2014

Abstract

Question-and-answer (Q&A) websites, such as Yahoo! Answers, Stack Overflow and Quora, have become a popular and powerful platform for Web users to share knowledge on a wide range of subjects. This has led to a rapidly growing volume of information and the consequent challenge of readily identifying high-quality objects (questions, answers and users) in Q&A sites. Exploring the interdependent relationships among different types of objects can help find high-quality objects in Q&A sites more accurately. In this paper, we focus on the problem of co-ranking questions, answers and users in a Q&A website. By studying the tightly connected relationships between Q&A objects, we can gain useful insights toward solving the co-ranking problem. However, co-ranking multiple objects in Q&A sites is a challenging task: a) with the large volumes of data in Q&A sites, it is important to design a model that scales well; b) the large scale of Q&A data makes extracting supervised information very expensive. To address these issues, we propose an unsupervised Network-based Co-Ranking framework (NCR) to rank multiple objects in Q&A sites. Empirical studies on real-world Yahoo! Answers datasets demonstrate the effectiveness and the efficiency of the proposed NCR method.
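The abstract does not spell out the NCR model itself, so the following is only a minimal sketch of the general idea of co-ranking interdependent objects without supervision. It is not the authors' algorithm: it assumes a damped, PageRank-style random walk over an undirected graph that links users, questions and answers, and the function name `co_rank`, its parameters and the toy data are all hypothetical.

```python
from collections import defaultdict


def co_rank(asks, writes, answers_to, damping=0.85, iters=50):
    """Generic PageRank-style co-ranking sketch (NOT the NCR algorithm).

    asks:       list of (user, question) pairs
    writes:     list of (user, answer) pairs
    answers_to: list of (answer, question) pairs
    Returns three lists of (name, score), one per object type, best first.
    """
    # Build an undirected heterogeneous graph; each node is tagged by type
    # so that a user and a question may share the same name safely.
    neighbors = defaultdict(set)
    for kind_a, kind_b, pairs in (("user", "question", asks),
                                  ("user", "answer", writes),
                                  ("answer", "question", answers_to)):
        for a, b in pairs:
            na, nb = (kind_a, a), (kind_b, b)
            neighbors[na].add(nb)
            neighbors[nb].add(na)

    nodes = list(neighbors)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}

    for _ in range(iters):
        new = {}
        for v in nodes:
            # Each neighbor shares its current score equally among its links.
            inflow = sum(score[u] / len(neighbors[u]) for u in neighbors[v])
            new[v] = (1.0 - damping) / n + damping * inflow
        score = new

    def ranked(kind):
        items = [(name, s) for (k, name), s in score.items() if k == kind]
        # Sort by descending score, breaking ties by name for determinism.
        return sorted(items, key=lambda item: (-item[1], item[0]))

    return ranked("user"), ranked("question"), ranked("answer")


# Toy data (hypothetical): two questions, three answers, three users.
asks = [("alice", "q1"), ("bob", "q2")]
writes = [("carol", "a1"), ("carol", "a2"), ("bob", "a3")]
answers_to = [("a1", "q1"), ("a2", "q2"), ("a3", "q1")]

users, questions, answers = co_rank(asks, writes, answers_to)
for label, ranking in (("users", users), ("questions", questions), ("answers", answers)):
    print(label, [name for name, _ in ranking])
```

Because all three object types are scored in one pass over the same graph, a well-connected question raises the scores of its answers and their authors, and vice versa. A purely structural walk like this mostly rewards activity, so a realistic system would likely also need quality signals such as votes, which this sketch leaves out.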



Index Terms

NCR: A Scalable Network-Based Approach to Co-Ranking in Question-and-Answer Sites
1. Information systems
  1. Information systems applications
    1. Data mining


Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

November 2014

2152 pages

ISBN:9781450325981

DOI:10.1145/2661829

General Chairs: Jianzhong Li (Harbin Inst. of Technology), X. Sean Wang (Fudan University)

Program Chairs: Minos Garofalakis (Technical University of Crete, Greece), Ian Soboroff (National Institute of Standards, USA), Torsten Suel (New York University, USA), Min Wang (Google Research, USA)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Funding Sources

Division of Computer and Network Systems
U.S. Army
Office of International Science and Engineering
Yahoo! Labs Faculty Research and Engagement Program

Conference

CIKM '14: 2014 ACM Conference on Information and Knowledge Management

November 3 - 7, 2014

Shanghai, China

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Bibliometrics

Total Citations: 16
Total Downloads: 371
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 27 Feb 2025

Cited By

Yang Y, Tan Y, Yang Y, Huang Z (2023). Automatic Quality Evaluation for User Generated Contents in Online Q&A Community Based on Word2Vec-CNN. 2023 International Conference on Neuromorphic Computing (ICNC), pages 360-366. Online publication date: 15-Dec-2023.
https://doi.org/10.1109/ICNC59488.2023.10462736
Amancio L, Dorneles C, Dalip D (2021). Recency and quality-based ranking question in CQAs. Information Processing and Management: an International Journal, 58:4. Online publication date: 1-Jul-2021.
https://dl.acm.org/doi/10.1016/j.ipm.2021.102552
Hoogeveen D, Wang L, Baldwin T, Verspoor K (2018). Web Forum Retrieval and Text Analytics. Foundations and Trends in Information Retrieval, 12:1, pages 1-163. Online publication date: 3-Jan-2018.
https://dl.acm.org/doi/10.1561/1500000062
Li D, Zhang J, Li P (2018). Representation Learning for Question Classification via Topic Sparse Autoencoder and Entity Embedding. 2018 IEEE International Conference on Big Data (Big Data), pages 126-133. Online publication date: Dec-2018.
https://doi.org/10.1109/BigData.2018.8622331
Le L, Shah C (2018). Retrieving people: Identifying potential answerers in Community Question‐Answering. Journal of the Association for Information Science and Technology, 69:10, pages 1246-1258. Online publication date: 18-Jul-2018.
https://doi.org/10.1002/asi.24042
Amancio L, Dorneles C, Roesler V, Valdeni de Lima J, Saibel Santos C, Willrich R (2017). Towards Recency Ranking in Community Question Answering. Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web, pages 173-180. Online publication date: 17-Oct-2017.
https://dl.acm.org/doi/10.1145/3126858.3126892
Shi C, Li Y, Zhang J, Sun Y, Yu P (2017). A Survey of Heterogeneous Information Network Analysis. IEEE Transactions on Knowledge and Data Engineering, 29:1, pages 17-37. Online publication date: 1-Jan-2017.
https://dl.acm.org/doi/10.1109/TKDE.2016.2598561
Zhang J, Lu C, Cao B, Chang Y, Yu P (2017). Connecting emerging relationships from news via tensor factorization. 2017 IEEE International Conference on Big Data (Big Data), pages 1223-1232. Online publication date: Dec-2017.
https://doi.org/10.1109/BigData.2017.8258048
Shi C, Yu P, Shi C, Yu P (2017). Survey of Current Developments. Heterogeneous Information Network Analysis and Applications, pages 13-30. Online publication date: 26-May-2017.
https://doi.org/10.1007/978-3-319-56212-4_2
Zhang J, Lu C, Zhou M, Xie S, Chang Y, Yu P (2016). HEER: Heterogeneous graph embedding for emerging relation detection from news. 2016 IEEE International Conference on Big Data (Big Data), pages 803-812. Online publication date: Dec-2016.
https://doi.org/10.1109/BigData.2016.7840673
