
Metrics and Algorithms for Routing Questions to User Communities

Published: 09 March 2015

Abstract

An online community consists of a group of users who share a common interest, background, or experience and whose collective goal is to contribute toward the welfare of the community members. Several websites allow their users to create and manage niche communities, such as Yahoo! Groups, Facebook Groups, Google+ Circles, and WebMD Forums. These community services also exist within enterprises, such as IBM Connections. Question answering within these communities enables members to exchange knowledge and information with one another. However, the onus of finding the right community in which to ask a question lies with the individual user. The overwhelming number of communities calls for a good question routing strategy, so that new questions are routed to an appropriately focused community and are resolved in a reasonable time frame.
In this article, we consider the novel problem of routing a question to the right community and propose a framework for selecting and ranking the relevant communities for a question. We propose several novel features for modeling the three main entities of the system: questions, users, and communities. These features include language attributes, inclination to respond, user familiarity, and difficulty of a question; based on them, we propose similarity metrics between the routed question and the system entities. We introduce a Cutoff-Aggregation (CA) algorithm that aggregates the entity similarity within a community to compute that community's relevance. We introduce two computationally efficient k-nearest-neighbor (knn) algorithms that are natural instantiations of the CA algorithm, and we evaluate several ranking algorithms over the aggregate similarity scores computed by the two knn algorithms. We propose clustering techniques to speed up our recommendation framework and show how pipelining can improve the model performance. We demonstrate the effectiveness of our framework on two large real-world datasets.
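The abstract does not spell out the CA algorithm, so the following is only an illustrative sketch of the general idea of a knn-style cutoff followed by per-community aggregation. Everything in it is an assumption made for illustration: the sparse bag-of-words vectors, the cosine similarity, the global top-k cutoff, the sum as the aggregate, and the names `cosine` and `rank_communities`. It is not the authors' feature set or their exact algorithm.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts (assumed metric)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_communities(question, entities, k=3):
    """Rank communities for a question (hypothetical helper, not from the paper).

    entities: list of (community_id, feature_vector) pairs, e.g. one pair per
    past question or user belonging to that community.
    """
    # Score every entity against the routed question, most similar first.
    scored = sorted(
        ((cosine(question, vec), cid) for cid, vec in entities),
        reverse=True,
    )
    # Cutoff step: keep only the k most similar entities overall.
    neighbours = scored[:k]
    # Aggregation step: sum the surviving similarities per community.
    totals = defaultdict(float)
    for sim, cid in neighbours:
        totals[cid] += sim
    # Communities with the largest aggregate similarity come first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Called with a question vector and a list of labelled entity vectors, `rank_communities` returns `(community_id, aggregate_score)` pairs in descending order of score. Communities with no entity among the k nearest neighbours are omitted from the result.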


Published In

ACM Transactions on Information Systems, Volume 33, Issue 3
March 2015
184 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/2737814
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 March 2015
Accepted: 01 January 2015
Revised: 01 January 2015
Received: 01 March 2014
Published in TOIS Volume 33, Issue 3


Author Tags

  1. Question answering
  2. community question routing
  3. group recommendation

Qualifiers

  • Research-article
  • Research
  • Refereed
