Abstract
Community Question Answering (CQA) has emerged as a popular forum for users to ask and answer questions. Over the last few years, CQA portals such as Yahoo! Answers and Baidu Zhidao have exploded in popularity, and they now provide a viable alternative to general-purpose Web search. The many answers submitted to questions on CQA sites form a valuable knowledge repository, which could be a gold mine for information retrieval as well as text mining. Two important questions in CQA research concern the quality of the content and the reputation of the answerers. Approaches for retrieving relevant, high-quality content have been proposed, but little work has been done on an integrated framework that solves both problems. Moreover, no prior work has used both text and link information while leveraging existing ratings of answers and questions. In this paper, we present a novel approach to analyzing questions and answers based on the topic modeling framework with Dirichlet Forest priors (LDA-DF) [8]. We use information obtained from LDA-DF to construct a joint topical and link model that identifies authorities and reliable answers on a CQA site. We evaluate our methods on a dataset obtained from Yahoo! Answers. With the new representation of topical structures on CQA datasets, and using a limited amount of web resources, we show significant improvements over the state-of-the-art methods LDA-DF, LDA, and HLDA in authority identification and answer ranking.
References
Celikyilmaz, A., Hakkani-Tur, D., Tur, G.: LDA based similarity modeling for question answering. In: Proceedings of the NAACL HLT 2010 Workshop on Semantic Search (2010)
Hickl, A.: Answering questions with authority. In: Proceeding of the 17th ACM Conference on Information and Knowledge Management, pp. 1261–1270 (2008)
Pal, A., Counts, S.: Identifying topical authorities in microblogs. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 45–54 (2011)
McCallum, A., Corrada-Emmanuel, A., Wang, X.: Topic and role discovery in social networks. Journal of Artificial Intelligence Research, 786–791 (2005)
Rasmussen, C.E.: The infinite Gaussian mixture model. Advances in Neural Information Processing Systems 12, 554–560 (2000)
Shah, C., Pomerantz, J.: Evaluating and predicting answer quality in community QA. In: Proceeding of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 411–418 (2010)
Andrzejewski, D., Zhu, X., Craven, M.: Incorporating domain knowledge into topic modeling via Dirichlet Forest priors. In: Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Quebec, Canada, June 14-18, pp. 25–32 (2009)
Horowitz, D., Kamvar, S.D.: The anatomy of a large-scale social search engine. In: Proceedings of the 19th International Conference on World Wide Web, pp. 431–440 (2010)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. The Journal of Machine Learning Research 3, 993–1022 (2003)
Agichtein, E., Liu, Y., Bian, J.: Modeling information-seeker satisfaction in community question answering. ACM Transactions on Knowledge Discovery from Data (TKDD) 3, 10 (2009)
Agichtein, E., Castillo, C., Donato, D., Gionis, A., Mishne, G.: Finding high-quality content in social media. In: Proceedings of the International Conference on Web Search and Web Data Mining 2008, pp. 183–194 (2008)
Zhang, J., Ackerman, M.S., Adamic, L.: Expertise networks in online communities: structure and algorithms. In: Proceedings of the 16th International Conference on World Wide Web 2007, pp. 221–230 (2007)
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM) 46, 604–632 (1999)
Hong, L., Yang, Z.: Incorporating participant reputation in community-driven question answering systems. In: Symposium on Social Intelligence and Networking (2009)
Bian, J., Liu, Y., Zhou, D., Agichtein, E., Zha, H.: Learning to recognize reliable users and content in social media with coupled mutual reinforcement. In: Proceedings of the 18th International Conference on World Wide Web 2009, pp. 51–60 (2009)
Ko, J., Nyberg, E., Si, L.: A probabilistic graphical model for joint answer ranking in question answering. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007, pp. 343–350 (2007)
Sun, K., Cao, Y., Song, X., Song, Y.I., Wang, X., Lin, C.Y.: Learning to recommend questions based on user ratings. In: Proceeding of the 18th ACM Conference on Information and Knowledge Management 2009, pp. 751–758 (2009)
Adamic, L.A., Zhang, J., Bakshy, E., Ackerman, M.S.: Knowledge sharing and yahoo answers: everyone knows something. In: Proceeding of the 17th International Conference on World Wide Web 2008, pp. 665–674 (2008)
Page, L., Brin, S.: The PageRank citation ranking: bringing order to the Web. Technical report, Stanford Digital Library Technologies Project (1998)
Nie, L., Davison, B.D., Qi, X.: Topical link analysis for web search. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2006, pp. 91–98 (2006)
Bilotti, M.W., Ogilvie, P., Callan, J., Nyberg, E.: Structured retrieval for question answering. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007, pp. 351–358 (2007)
Suryanto, M.A., Lim, E.P., Sun, A., Chiang, R.H.L.: Quality-aware collaborative question answering: methods and evaluation. In: Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2009, pp. 142–151 (2009)
Bouguessa, M., Dumoulin, B., Wang, S.: Identifying authoritative actors in question-answering forums: the case of yahoo! answers. In: Proceeding of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2008, pp. 866–874 (2008)
Jurczyk, P., Agichtein, E.: Discovering authorities in question answer communities by using link analysis. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management 2007, pp. 919–922 (2007)
Han, K.S., Song, Y.I., Rim, H.C.: Probabilistic model for definitional question answering. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2006, pp. 212–219 (2006)
Cilibrasi, R., Vitanyi, P.: Automatic Meaning Discovery Using Google (2004), http://xxx.lanl.gov/abs/cs.CL/0412098
Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America 101, 5228 (2004)
Kao, W.C., Liu, D.R., Wang, S.W.: Expert finding in question-answering websites: a novel hybrid approach. In: Proceedings of the 2010 ACM Symposium on Applied Computing, pp. 867–871 (2010)
Noguchi, Y.: Web searches go low-tech: You ask, a person answers. Washington Post, page A 1 (2006)
Liu, Y., Niculescu-Mizil, A., Gryc, W.: Topic-link LDA: joint models of topic and author community. In: Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Quebec, Canada, June 14-18, pp. 665–672 (2009)
Gyongyi, Z., Koutrika, G., Pedersen, J., Garcia-Molina, H.: Questioning Yahoo! Answers. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (2007)
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, L., Hu, X. (2013). Identifying Authoritative and Reliable Contents in Community Question Answering with Domain Knowledge. In: Li, J., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science, vol 7867. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40319-4_12
DOI: https://doi.org/10.1007/978-3-642-40319-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40318-7
Online ISBN: 978-3-642-40319-4
eBook Packages: Computer Science (R0)