DOI: 10.1145/2661829.2661978
research-article

NCR: A Scalable Network-Based Approach to Co-Ranking in Question-and-Answer Sites

Published: 03 November 2014

Abstract

Question-and-answer (Q&A) websites, such as Yahoo! Answers, Stack Overflow and Quora, have become popular and powerful platforms for Web users to share knowledge on a wide range of subjects. This has led to a rapidly growing volume of information and the consequent challenge of readily identifying high-quality objects (questions, answers and users) in Q&A sites. Exploring the interdependent relationships among different types of objects can help find high-quality objects more accurately. In this paper, we focus on the problem of co-ranking questions, answers and users in a Q&A website. Studying the tightly connected relationships between Q&A objects yields useful insights toward solving the co-ranking problem. However, co-ranking multiple objects in Q&A sites is challenging for two reasons: a) given the large volumes of data in Q&A sites, the model must scale well; b) at this scale, extracting supervised information is very expensive. To address these issues, we propose an unsupervised Network-based Co-Ranking framework (NCR) to rank multiple objects in Q&A sites. Empirical studies on real-world Yahoo! Answers datasets demonstrate the effectiveness and efficiency of the proposed NCR method.
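
To make the idea of co-ranking interdependent objects concrete, the following is a minimal sketch of a generic mutual-reinforcement loop over a toy Q&A network. It is not the NCR algorithm, which the abstract does not specify. The function names (`normalize`, `co_rank`), the input layout, and every update rule are assumptions chosen for illustration only.

```python
# Illustrative only: a generic mutual-reinforcement co-ranking loop.
# This is NOT the NCR method from the paper; all update rules are assumptions.

def normalize(scores):
    """Scale a score dict so that its values sum to 1."""
    if not scores:
        return {}
    total = sum(scores.values())
    if total == 0:
        return {k: 1.0 / len(scores) for k in scores}
    return {k: v / total for k, v in scores.items()}


def co_rank(asks, answers, iterations=50):
    """Jointly score users, questions and answers.

    asks:    {question_id: asker_id}
    answers: {answer_id: (question_id, answerer_id)}
    Every question_id used in `answers` is assumed to appear in `asks`.
    """
    users = set(asks.values()) | {uid for _, uid in answers.values()}
    q = normalize({qid: 1.0 for qid in asks})
    a = normalize({aid: 1.0 for aid in answers})
    u = normalize({uid: 1.0 for uid in users})

    for _ in range(iterations):
        # An answer draws its score from its question and its author.
        a = normalize({aid: q[qid] + u[uid]
                       for aid, (qid, uid) in answers.items()})
        # A question draws its score from its asker and its answers.
        q_new = {qid: u[asker] for qid, asker in asks.items()}
        for aid, (qid, _) in answers.items():
            q_new[qid] += a[aid]
        q = normalize(q_new)
        # A user draws their score from everything they posted.
        u_new = {uid: 0.0 for uid in users}
        for qid, asker in asks.items():
            u_new[asker] += q[qid]
        for aid, (_, uid) in answers.items():
            u_new[uid] += a[aid]
        u = normalize(u_new)
    return u, q, a


if __name__ == "__main__":
    # Hypothetical toy network: two questions, three answers, three users.
    toy_asks = {"q1": "alice", "q2": "bob"}
    toy_answers = {"a1": ("q1", "bob"), "a2": ("q1", "carol"), "a3": ("q2", "carol")}
    user_scores, question_scores, answer_scores = co_rank(toy_asks, toy_answers)
    for name, score in sorted(user_scores.items(), key=lambda kv: -kv[1]):
        print(name, round(score, 3))
```

Each pass touches every ask and answer edge once, so the cost per iteration is linear in the number of edges, and no labeled data is needed. Those two properties correspond to the scalability and supervision concerns raised above, although the paper's own formulation may differ substantially.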

    Published In

    CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
    November 2014
    2152 pages
    ISBN:9781450325981
    DOI:10.1145/2661829
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. co-ranking
    2. interrelationships
    3. q&a networks
    4. unsupervised

    Qualifiers

    • Research-article

    Conference

    CIKM '14
    Acceptance Rates

    CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2023) Automatic Quality Evaluation for User Generated Contents in Online Q&A Community Based on Word2Vec-CNN. 2023 International Conference on Neuromorphic Computing (ICNC), pages 360-366. DOI: 10.1109/ICNC59488.2023.10462736. Online publication date: 15-Dec-2023
    • (2021) Recency and quality-based ranking question in CQAs. Information Processing and Management: an International Journal, 58:4. DOI: 10.1016/j.ipm.2021.102552. Online publication date: 1-Jul-2021
    • (2018) Web Forum Retrieval and Text Analytics. Foundations and Trends in Information Retrieval, 12:1, pages 1-163. DOI: 10.1561/1500000062. Online publication date: 3-Jan-2018
    • (2018) Representation Learning for Question Classification via Topic Sparse Autoencoder and Entity Embedding. 2018 IEEE International Conference on Big Data (Big Data), pages 126-133. DOI: 10.1109/BigData.2018.8622331. Online publication date: Dec-2018
    • (2018) Retrieving people: Identifying potential answerers in Community Question‐Answering. Journal of the Association for Information Science and Technology, 69:10, pages 1246-1258. DOI: 10.1002/asi.24042. Online publication date: 18-Jul-2018
    • (2017) Towards Recency Ranking in Community Question Answering. Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web, pages 173-180. DOI: 10.1145/3126858.3126892. Online publication date: 17-Oct-2017
    • (2017) A Survey of Heterogeneous Information Network Analysis. IEEE Transactions on Knowledge and Data Engineering, 29:1, pages 17-37. DOI: 10.1109/TKDE.2016.2598561. Online publication date: 1-Jan-2017
    • (2017) Connecting emerging relationships from news via tensor factorization. 2017 IEEE International Conference on Big Data (Big Data), pages 1223-1232. DOI: 10.1109/BigData.2017.8258048. Online publication date: Dec-2017
    • (2017) Survey of Current Developments. Heterogeneous Information Network Analysis and Applications, pages 13-30. DOI: 10.1007/978-3-319-56212-4_2. Online publication date: 26-May-2017
    • (2016) HEER: Heterogeneous graph embedding for emerging relation detection from news. 2016 IEEE International Conference on Big Data (Big Data), pages 803-812. DOI: 10.1109/BigData.2016.7840673. Online publication date: Dec-2016
