Abstract
Sina Weibo, one of the biggest social services in China, gives users the opportunity to share information and express their personal views, leading to explosive growth of information. How to recommend the right information to the right person amid massive data has received considerable attention in recent years. One of the main obstacles is obtaining users' topic interests. In this paper, we propose an algorithm based on tags and bidirectional interactions to mine user topic interests on Sina Weibo. The algorithm, formulated on a user interaction graph, takes full advantage of the discordance between user interactions. Forward spread and back spread are thus used to update tag spread weights. We also quantify the impact of these two kinds of spread by tuning parameters on three sub data sets. To demonstrate the advantage of the algorithm, we compare it with well-known methods on Sina Weibo. The results show that our new algorithm outperforms the other methods in both precision and recall, and that it mines user interests effectively with respect to tags and bidirectional interactions.
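The precision and recall rates mentioned above can be illustrated with a minimal sketch. It uses the standard set-based definitions for a single user's tags and is not taken from the paper's evaluation code; the function name `precision_recall` and the example tag lists are hypothetical.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for one user's predicted tag set.

    Standard definitions, assumed here for illustration:
      precision = |predicted ∩ actual| / |predicted|
      recall    = |predicted ∩ actual| / |actual|
    """
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical data: tags an algorithm predicts for a user versus
# the tags the user actually carries.
predicted_tags = ["music", "travel", "photography", "food"]
actual_tags = ["travel", "food", "movies"]

p, r = precision_recall(predicted_tags, actual_tags)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example two of the four predicted tags are correct and two of the three actual tags are recovered. Averaging these per-user values over a data set is one common way to report the two rates, though the paper's exact protocol may differ.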
Acknowledgements
The authors would like to thank their colleagues, past and present, who contributed to the research described in this paper. The work described in this paper is partially supported by the National Key Fundamental Research and Development Program (No. 2013CB329601, No. 2013CB329602, No. 2013CB329604), the National Natural Science Foundation of China (No. 61502517, No. 61372191, No. 61572492), the 863 Program of China (Grant No. 2012AA01A401, 2012AA01A402, 2012AA013002), and a project funded by the China Postdoctoral Science Foundation (2013M542560, 2015T81129).
Cite this article
Deng, L., Jia, Y., Zhou, B. et al. User interest mining via tags and bidirectional interactions on Sina Weibo. World Wide Web 21, 515–536 (2018). https://doi.org/10.1007/s11280-017-0469-6
DOI: https://doi.org/10.1007/s11280-017-0469-6