Abstract
A number of recent works have addressed the task of recommending to Twitter users whom they should follow; among them is WTF (Who To Follow), the service provided by Twitter itself. Recommenders are based on the user’s network structure, on some notion of topical similarity with other users, or on both. In this paper, we propose to accomplish the recommendation task in two steps. First, we profile users and classify them as belonging to a target community (depending, e.g., on their political affiliation, preferred football team, or favorite coffee shop). Then, we fine-tune recommendations for selected populations. We cast both problems, user classification and recommendation, as itemset mining, where items are either users’ authoritative friends or semantic categories associated with those friends, extracted from WiBi, the Wikipedia Bitaxonomy. In addition to evaluating our profiler and recommender on several populations, we show that semantic categories allow for very fine-grained population studies and make it possible to recommend not only whom to follow, but also topics of interest, users interested in the same topic, and more.












Notes
The interested reader is invited to read the WiBi paper (Flati et al. 2014) for additional details.
Hereafter, we omit the k index to simplify notation.
FP-Growth implementation: http://www.borgelt.net/fpgrowth.html.
Leave-one-out is the standard evaluation procedure in the user recommendation literature.
Remember that at level zero, items are Twitter user accounts.
This may seem obvious; however, quantitative data often contradict intuition.
References
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB ’94. Morgan Kaufmann Publishers Inc., San Francisco, pp 487–499
Armentano M, Godoy D, Amandi A (2011) A topology-based approach for followees recommendation in twitter. In: 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems. Barcelona, Spain
Aroyo L, Welty C (2013) Crowd Truth: Harnessing disagreement in crowdsourcing a relation extraction gold standard. In: Proceedings of ACM Web Science. Paris, France
Barbieri N, Manco G, Bonchi F (2014) Who to follow and why: Link prediction with explanations. In: Proceedings of The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2014). New York City
Burger JD, Henderson J, Kim G, Zarrella G (2011) Discriminating Gender on Twitter. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Edinburgh, Scotland, pp 1301–1309
Cagliero L, Fiori A, Grimaudo L (2013) Analyzing Twitter user-generated content changes, chap. 5. IGI Global, pp 87–109
Chen K, Chen T, Zheng G, Jin O, Yao E, Yu Y (2012) Collaborative personalized tweet recommendation. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’12. pp 661–670
Cheng Z, Caverlee J, Lee K (2010) You are where you tweet: A content-based approach to geo-locating twitter users. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10. Toronto, ON, Canada, pp 759–768
Ciot M, Sonderegger M, Ruths D (2013) Gender inference of Twitter users in non-English contexts. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. pp 1136–1145
Cui X, Shi H, Yi X (2014) Application of association rule mining theory in sina weibo. J Comput Commun 2(1):19–26. doi:10.4236/jcc.2014.21004
Dey R, Tang C, Ross K, Saxena N (2012) Estimating Age Privacy Leakage in Online Social Networks. In: Proceedings of IEEE INFOCOM. Orlando, pp 2836–2840
Dong Y, Tang J, Wu S, Tian J, Chawla N, Rao J, Cao H (2012) Link prediction and recommendation across heterogeneous social networks. In: IEEE 12th International Conference on Data Mining (ICDM). pp 181–190
Fink C, Kopecky J, Morawski M (2012) Inferring gender from the content of tweets: A region specific example. In: International AAAI Conference on Weblogs and Social Media
Flati T, Vannella D, Pasini T, Navigli R (2014) Two Is Bigger (and Better) Than One: the Wikipedia Bitaxonomy Project. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014). Association for Computational Linguistics, Baltimore, Maryland. pp 945–955
Garcia R, Amatriain X (2010) Weighted content based methods for recommending connections in online social networks. In: The 2nd ACM Workshop on Recommendation Systems and the Social Web. Barcelona, Spain
Gupta P, Goel A, Lin J, Sharma A, Wang D, Zadeh R (2013) WTF: The who to follow service at twitter. In: Proceedings of the 22nd International Conference on World Wide Web, WWW ’13. pp 505–514
Han J, Kamber M, Pei J (2011) Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann Publishers Inc., San Francisco
Han J, Pei J (2000) Mining frequent patterns by pattern-growth: methodology and implications. SIGKDD Explor 2(2):14–20
Hannigan J, Hernandez G, Medina RM, Roos P, Shakarian P (2013) Mining for spatially-near communities in geo-located social networks. CoRR abs/1309.2900
Hannon J, Bennett M, Smyth B (2010) Recommending twitter users to follow using content and collaborative filtering approaches. In: Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys ’10. pp 199–206
Hannon J, McCarthy K, Smyth B (2011) Finding useful users on twitter: Twittomender the followee recommender. In: Clough P, Foley C, Gurrin C, Jones G, Kraaij W, Lee H, Mudoch V (eds) Advances in Information Retrieval, Lecture Notes in Computer Science, vol 6611. Springer, Berlin Heidelberg, pp 784–787
He J, Chu WW, Liu ZV (2006) Inferring privacy information from social networks. In: Proceedings of the 4th IEEE International Conference on Intelligence and Security Informatics. ISI’06, pp 154–165
Hillard D, Ostendorf M, Shriberg E (2003) Detection of agreement vs. disagreement in meetings: Training with unlabeled data. In: Proceedings of HLT-NAACL, NAACL-Short ’03. pp 34–36
Hofmann T (2004) Latent semantic models for collaborative filtering. ACM Trans Inf Syst 22(1):89–115
Hong L, Davison BD (2010) Empirical study of topic modeling in twitter. In: Proceedings of the First Workshop on Social Media Analytics, SOMA ’10. pp 80–88
Ikeda K, Hattori G, Ono C, Asoh H, Higashino T (2013) Twitter user profiling based on text and community mining for market analysis. Knowl Based Syst 51:35–47
Joo PY, Alexander T (2008) The long tail of recommender systems and how to leverage it. In: Proceedings of the 2008 ACM Conference on Recommender Systems, RecSys ’08. ACM, New York, pp 11–18
Jun Li, Shuchao M, Shuang H (2012) Recommendation on Social Network Based on Graph Model. In: Proceedings of 31st Chinese Control Conference. Hefei, China, pp 7548–7551
Kapanipathi P, Jain P, Venkataramani C, Sheth A (2014) User interests identification on twitter using a hierarchical knowledge base. In: Presutti V, d’Amato C, Gandon F, d’Aquin M, Staab S, Tordai A (eds) The Semantic Web: Trends and Challenges, Lecture Notes in Computer Science, vol 8465. Springer International Publishing, pp 99–113
Kim D, Yum BJ (2005) Collaborative filtering based on iterative principal component analysis. Expert Syst Appl 28(4):823–830
Kosters WA, Pijls W, Popova V (2003) Complexity analysis of depth first and fp-growth implementations of apriori. In: Proceedings of the 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM’03. pp 284–292
Kywe S, Lim EP, Zhu F (2012) A survey of recommender systems in twitter. In: Aberer K, Flache A, Jager W, Liu L, Tang J, Guéret C (eds) Social Informatics, Lecture Notes in Computer Science, vol 7710. Springer, Berlin, Heidelberg, pp 420–433
Langhnoja SG, Barot MP, Mehta DB (2013) Web usage mining using association rule mining on clustered data for pattern discovery. International Journal of Data Mining Techniques and Applications, Integrated Intelligent Research (IIR)
Li J, Ritter A, Hovy EH (2014) Weakly supervised user profile extraction from twitter. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Long Papers, vol. 1. Baltimore, pp 165–174. URL http://aclweb.org/anthology/P/P14/P14-1016.pdf
Li Q, Wang J, Chen YP, Lin Z (2010) User comments for news recommendation in forum-based social media. Inf Sci 180(24):4929–4939
Lu C, Lam W, Zhang Y (2012) Twitter user modeling and tweets recommendation based on wikipedia concept graph. Tech. Rep. WS-12-09, AAAI Technical Report
Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88. doi:10.1145/963770.963773
Misra A, Walker M (2013) Topic independent identification of agreement and disagreement in social media dialogue. In: Proceedings of the SIGDIAL 2013 Conference. Metz, France, pp 41–50
Moro A, Raganato A, Navigli R (2014) Entity Linking meets Word Sense Disambiguation: a Unified Approach. Trans Assoc Comput Linguist 2:231–244
Myers SA, Leskovec J (2014) The bursty dynamics of the twitter information network. CoRR abs/1403.2732
Navigli R, Ponzetto SP (2012) BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif Intell 193:217–250
Peng J, Zeng DD, Zhao H, Wang Fy (2010) Collaborative filtering in social tagging systems based on joint item-tag recommendations. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10. ACM, New York, pp 809–818. doi:10.1145/1871437.1871541
Pennacchiotti M, Popescu AM (2011) A machine learning approach to twitter user classification. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media. Barcelona, Spain, pp 281–288
Siehndel P, Kawase R (2012) Twikime!—user profiles that make sense. In: International Semantic Web Conference (Posters & Demos)
Stilo G, Velardi P (2014) Time makes sense: Event Discovery in Twitter using Temporal Similarity. In: IEEE/WIC/ACM International Conference on Web Intelligence (WI-’14) (to appear). Warsaw, Poland
Yu X, Ma H, Hsu BJP, Han J (2014) On building entity recommender systems using user click log and freebase knowledge. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM ’14. pp 263–272. doi:10.1145/2556195.2556233
Zamal FA, Liu W, Ruths D (2012) Homophily and latent attribute inference: Inferring latent attributes of twitter users from neighbors. In: Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media. Dublin, Ireland, pp 387–390
Zhang W, Ansari S (2013) A framework for profiling and friend prediction on twitter. Master’s thesis
Appendix: FP-growth



A frequent-pattern tree (FP-tree) is a compact structure that stores quantitative information about frequent patterns in a database. Han et al. define the FP-tree structure as follows (see Fig. 13 for an example FP-tree instance from Han et al. (2011)):
- One root labeled “null”, with a set of item-prefix subtrees as children, and a frequent-item-header table (shown on the left side of Fig. 13);
- Each node in an item-prefix subtree consists of three fields:
  - Item-name: registers which item the node represents;
  - Count: the number of transactions represented by the portion of the path reaching the node;
  - Node-link: a link to the next node in the FP-tree carrying the same item-name, or null if there is none.
- Each entry in the frequent-item-header table consists of two fields:
  - Item-name: the same item-name as in the nodes;
  - Head of node-link: a pointer to the first node in the FP-tree carrying the item-name.
Additionally, the frequent-item-header table can store the support count of each item.
Algorithms 1 and 2 show how to create an FP-tree structure, and Algorithm 3 shows the steps to obtain the complete set of frequent patterns from a given FP-tree. Additional details can be found in Han and Pei (2000).
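To make the structure concrete, the following is a minimal Python sketch of FP-tree construction and recursive pattern mining. It is an illustration under stated assumptions, not the implementation used in this work and not a transcription of Algorithms 1–3. The names `FPNode`, `build_fp_tree` and `fp_growth` are hypothetical, transactions are assumed to be `(items, weight)` pairs so that conditional pattern bases can reuse the same builder, and the toy transactions stand in for users' friend lists.

```python
from collections import defaultdict


class FPNode:
    """One FP-tree node: the three fields described above, plus tree links."""

    def __init__(self, item, parent):
        self.item = item        # item-name (None for the "null" root)
        self.count = 0          # transactions sharing the path down to this node
        self.node_link = None   # next node carrying the same item-name, or None
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree; return (root, header), header[item] = [support, head]."""
    support = defaultdict(int)
    for items, weight in transactions:
        for item in set(items):
            support[item] += weight
    # Frequent-item-header table: support count and head of the node-link chain.
    header = {i: [s, None] for i, s in support.items() if s >= min_support}

    root = FPNode(None, None)
    for items, weight in transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in header),
                         key=lambda i: (-header[i][0], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                # Assumption: new nodes are prepended to the node-link chain.
                child.node_link = header[item][1]
                header[item][1] = child
            child.count += weight
            node = child
    return root, header


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(pattern): support} for every frequent pattern."""
    _, header = build_fp_tree(transactions, min_support)
    patterns = {}
    for item, (item_support, head) in header.items():
        pattern = suffix | {item}
        patterns[pattern] = item_support
        # Conditional pattern base: the prefix path of each node carrying `item`.
        conditional = []
        node = head
        while node is not None:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
            node = node.node_link
        patterns.update(fp_growth(conditional, min_support, pattern))
    return patterns


# Toy data (hypothetical): each transaction is one user's set of friends.
transactions = [
    (["a", "b"], 1),
    (["b", "c", "d"], 1),
    (["a", "c", "d", "e"], 1),
    (["a", "d", "e"], 1),
    (["a", "b", "c"], 1),
]
patterns = fp_growth(transactions, min_support=2)
```

Rebuilding a small conditional tree at each recursion step keeps the sketch short; optimized implementations avoid much of this work, for example by handling single-path trees directly.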
Faralli, S., Stilo, G. & Velardi, P. Recommendation of microblog users based on hierarchical interest profiles. Soc. Netw. Anal. Min. 5, 25 (2015). https://doi.org/10.1007/s13278-015-0264-2
DOI: https://doi.org/10.1007/s13278-015-0264-2