Abstract
With the boom of open source software, open source communities have formed and become involved in software development, deployment and application at an unprecedented level. However, the rapid expansion of these communities produces a large amount of redundant content within each community and, more importantly, across communities, since they overlap on shared issues. On the one hand, redundant content expressed in informal free text greatly increases the volume of material, making it hard for people to find exactly what they need; on the other hand, the communities are mutually complementary, so sharing knowledge across them can be very beneficial to users. It is therefore crucial to recommend content that meets users' needs by retrieving knowledge across communities. Current studies mainly focus on acquiring knowledge from one specific community, treating communities as isolated islands, and few of them have tackled the problem of content recommendation across multiple communities. In this paper, we first analyze five popular open source communities, and then propose an approach to cross-community content recommendation based on the LDA topic model, integrating and distilling information from multiple communities to make knowledge acquisition easier and more efficient. Taking Docker as the case study, extensive experiments show that after performing a cross-community recommendation, more than 34% of all unanswered questions find matched answers when the similarity threshold β is set to 0.85. When β is set to 0.6, almost 90% of unanswered questions can be answered with existing community content. The approach thus effectively leverages various communities to recommend valuable content to users.
This work is supported by the Shenzhen Municipal Science and Technology Program (Grant No. JSGG2014051616 2852628) and the VMware UR project.
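The role of the similarity threshold β in the abstract can be illustrated with a minimal sketch. It assumes that each question and each piece of community content has already been mapped to a topic distribution (for example by an LDA model), and that similarity is measured with cosine similarity over those distributions; the abstract does not specify the similarity measure, so this is an assumption rather than the authors' implementation. The function names, content identifiers, and topic vectors below are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_u * norm_v)


def recommend(question_topics, candidates, beta):
    """Return (content_id, similarity) pairs with similarity >= beta, best first.

    candidates maps a content id to its topic vector; the ids may come
    from different communities (hypothetical setup).
    """
    scored = [
        (content_id, cosine_similarity(question_topics, topics))
        for content_id, topics in candidates.items()
    ]
    matches = [(cid, sim) for cid, sim in scored if sim >= beta]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distributions over three topics.
question = [0.7, 0.2, 0.1]
candidates = {
    "forum/answer-1": [0.6, 0.3, 0.1],
    "issues/thread-2": [0.1, 0.1, 0.8],
    "qa/answer-3": [0.3, 0.6, 0.1],
}

strict = recommend(question, candidates, beta=0.85)  # high-precision setting
loose = recommend(question, candidates, beta=0.6)    # higher-coverage setting
```

With these made-up vectors, the stricter threshold keeps only the closest candidate, while the looser one also admits a second, less similar candidate. This mirrors the trade-off reported in the abstract: lowering β lets more unanswered questions find a match, at the cost of weaker matches.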
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Yong, Y., Ying, L., Hongyan, T., Tong, J., Wenlong, S. (2016). An Approach for Cross-Community Content Recommendation: A Case Study on Docker. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_15
DOI: https://doi.org/10.1007/978-3-319-45817-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45816-8
Online ISBN: 978-3-319-45817-5
eBook Packages: Computer Science (R0)