Recommendation of microblog users based on hierarchical interest profiles

  • Original Article
  • Social Network Analysis and Mining

Abstract

Several recent works have addressed the task of recommending to Twitter users whom they should follow, including the WTF (Who To Follow) service provided by Twitter. Recommenders are based on the user’s network structure, on some notion of topical similarity with other users, or on both. In this paper, we propose to accomplish the recommendation task in two steps. First, we profile users and classify them as belonging to a target community (depending, e.g., on their political affiliation, preferred football team, or favorite coffee shop). Then, we fine-tune recommendations for selected populations. We cast both user classification and recommendation as itemset mining problems, where items are either users’ authoritative friends or semantic categories associated with friends, extracted from WiBi, the Wikipedia Bitaxonomy. In addition to evaluating our profiler and recommender on several populations, we show that semantic categories allow for very fine-grained population studies and make it possible to recommend not only whom to follow, but also topics of interest, users interested in the same topic, and more.

Notes

  1. http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats/#.U5gOgS8Q6X0.

  2. http://www.scientificamerican.com/article/twitter-to-release-all-tweets-to-scientists-a-trove-of-billions-of-tweets-will-be-a-research-boon-and-an-ethical-dilemma/.

  3. http://babelnet.org/, http://babelfy.org/.

  4. The interested reader is invited to read the WiBi paper (Flati et al. 2014) for additional details.

  5. Hereafter, we omit the k index to simplify notation.

  6. FP-Growth implementation: http://www.borgelt.net/fpgrowth.html.

  7. Leave-one-out is the standard evaluation procedure in the user recommendation literature.

  8. Remember that at level zero, items are Twitter user accounts.

  9. http://babelnet.org/explore.jsp.

  10. This may seem obvious; however, quantitative data often contradict intuition.

References

  • Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB ’94. Morgan Kaufmann Publishers Inc., San Francisco, pp 487–499

  • Armentano M, Godoy D, Amandi A (2011) A topology-based approach for followees recommendation in twitter. In: 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems. Barcelona, Spain

  • Aroyo L, Welty C (2013) Crowd Truth: Harnessing disagreement in crowdsourcing a relation extraction gold standard. In: Proceedings of ACM Web Science. Paris, France

  • Barbieri N, Manco G, Bonchi F (2014) Who to follow and why: Link prediction with explanations. In: Proceedings of The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2014). New York City

  • Burger JD, Henderson J, Kim G, Zarrella G (2011) Discriminating Gender on Twitter. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Edinburgh, Scotland, pp 1301–1309

  • Cagliero L, Fiori A, Grimaudo L (2013) Analyzing Twitter user-generated content changes, chap. 5. IGI Global, pp 87–109

  • Chen K, Chen T, Zheng G, Jin O, Yao E, Yu Y (2012) Collaborative personalized tweet recommendation. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’12. pp 661–670

  • Cheng Z, Caverlee J, Lee K (2010) You are where you tweet: A content-based approach to geo-locating twitter users. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10. Toronto, ON, Canada, pp 759–768

  • Ciot M, Sonderegger M, Ruths D (2013) Gender inference of Twitter users in non-English contexts. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. pp 1136–1145

  • Cui X, Shi H, Yi X (2014) Application of association rule mining theory in sina weibo. J Comput Commun 2(1):19–26. doi:10.4236/jcc.2014.21004

  • Dey R, Tang C, Ross K, Saxena N (2012) Estimating Age Privacy Leakage in Online Social Networks. In: Proceedings of IEEE INFOCOM. Orlando, pp 2836–2840

  • Dong Y, Tang J, Wu S, Tian J, Chawla N, Rao J, Cao H (2012) Link prediction and recommendation across heterogeneous social networks. In: IEEE 12th International Conference on Data Mining (ICDM). pp 181–190

  • Fink C, Kopecky J, Morawski M (2012) Inferring gender from the content of tweets: A region specific example. In: International AAAI Conference on Weblogs and Social Media

  • Flati T, Vannella D, Pasini T, Navigli R (2014) Two Is Bigger (and Better) Than One: the Wikipedia Bitaxonomy Project. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014). Association for Computational Linguistics, Baltimore, Maryland. pp 945–955

  • Garcia R, Amatriain X (2010) Weighted content based methods for recommending connections in online social networks. In: The 2nd ACM Workshop on Recommendation Systems and the Social Web. Barcelona, Spain

  • Gupta P, Goel A, Lin J, Sharma A, Wang D, Zadeh R (2013) Wtf: The who to follow service at twitter. In: Proceedings of the 22nd International Conference on World Wide Web, WWW ’13. pp 505–514

  • Han J, Kamber M, Pei J (2011) Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann Publishers Inc., San Francisco

  • Han J, Pei J (2000) Mining frequent patterns by pattern-growth: methodology and implications. SIGKDD Explor 2(2):14–20

  • Hannigan J, Hernandez G, Medina RM, Roos P, Shakarian P (2013) Mining for spatially-near communities in geo-located social networks. CoRR abs/1309.2900

  • Hannon J, Bennett M, Smyth B (2010) Recommending twitter users to follow using content and collaborative filtering approaches. In: Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys ’10. pp 199–206

  • Hannon J, McCarthy K, Smyth B (2011) Finding useful users on twitter: Twittomender the followee recommender. In: Clough P, Foley C, Gurrin C, Jones G, Kraaij W, Lee H, Murdock V (eds) Advances in Information Retrieval, Lecture Notes in Computer Science, vol 6611. Springer, Berlin Heidelberg, pp 784–787

  • He J, Chu WW, Liu ZV (2006) Inferring privacy information from social networks. In: Proceedings of the 4th IEEE International Conference on Intelligence and Security Informatics. ISI’06, pp 154–165

  • Hillard D, Ostendorf M, Shriberg E (2003) Detection of agreement vs. disagreement in meetings: Training with unlabeled data. In: Proceedings of HLT-NAACL, NAACL-Short ’03. pp 34–36

  • Hofmann T (2004) Latent semantic models for collaborative filtering. ACM Trans Inf Syst 22(1):89–115

  • Hong L, Davison BD (2010) Empirical study of topic modeling in twitter. In: Proceedings of the First Workshop on Social Media Analytics, SOMA ’10. pp 80–88

  • Ikeda K, Hattori G, Ono C, Asoh H, Higashino T (2013) Twitter user profiling based on text and community mining for market analysis. Knowl Based Syst 51:35–47

  • Joo PY, Alexander T (2008) The long tail of recommender systems and how to leverage it. In: Proceedings of the 2008 ACM Conference on Recommender Systems, RecSys ’08. ACM, New York, pp 11–18

  • Jun Li, Shuchao M, Shuang H (2012) Recommendation on Social Network Based on Graph Model. In: Proceedings of 31st Chinese Control Conference. Hefei, China, pp 7548–7551

  • Kapanipathi P, Jain P, Venkataramani C, Sheth A (2014) User interests identification on twitter using a hierarchical knowledge base. In: Presutti V, d’Amato C, Gandon F, d’Aquin M, Staab S, Tordai A (eds) The Semantic Web: Trends and Challenges, Lecture Notes in Computer Science, vol 8465. Springer International Publishing, pp 99–113

  • Kim D, Yum BJ (2005) Collaborative filtering based on iterative principal component analysis. Expert Syst Appl 28(4):823–830

  • Kosters WA, Pijls W, Popova V (2003) Complexity analysis of depth first and fp-growth implementations of apriori. In: Proceedings of the 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM’03. pp 284–292

  • Kywe S, Lim EP, Zhu F (2012) A survey of recommender systems in twitter. In: Aberer K, Flache A, Jager W, Liu L, Tang J, Guéret C (eds) Social Informatics, Lecture Notes in Computer Science, vol 7710. Springer, Berlin, Heidelberg, pp 420–433

  • Langhnoja SG, Barot MP, Mehta DB (2013) Web usage mining using association rule mining on clustered data for pattern discovery. International Journal of Data Mining Techniques and Applications, Integrated Intelligent Research (IIR)

  • Li J, Ritter A, Hovy EH (2014) Weakly supervised user profile extraction from twitter. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Long Papers, vol. 1. Baltimore, pp 165–174. URL http://aclweb.org/anthology/P/P14/P14-1016.pdf

  • Li Q, Wang J, Chen YP, Lin Z (2010) User comments for news recommendation in forum-based social media. Inf Sci 180(24):4929–4939

  • Lu C, Lam W, Zhang Y (2012) Twitter user modeling and tweets recommendation based on wikipedia concept graph. Tech. Rep. WS-12-09, AAAI Technical Report

  • Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88. doi:10.1145/963770.963773

  • Misra A, Walker M (2013) Topic independent identification of agreement and disagreement in social media dialogue. In: Proceedings of the SIGDIAL 2013 Conference. Metz, France, pp 41–50

  • Moro A, Raganato A, Navigli R (2014) Entity Linking meets Word Sense Disambiguation: a Unified Approach. Trans Assoc Comput Linguist 2:231–244

  • Myers SA, Leskovec J (2014) The bursty dynamics of the twitter information network. CoRR abs/1403.2732

  • Navigli R, Ponzetto SP (2012) BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif Intell 193:217–250

  • Peng J, Zeng DD, Zhao H, Wang FY (2010) Collaborative filtering in social tagging systems based on joint item-tag recommendations. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10. ACM, New York, pp 809–818. doi:10.1145/1871437.1871541

  • Pennacchiotti M, Popescu AM (2011) A machine learning approach to twitter user classification. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media. Barcelona, Spain, pp 281–288

  • Siehndel P, Kawase R (2012) Twikime!—user profiles that make sense. In: International Semantic Web Conference (Posters & Demos)

  • Stilo G, Velardi P (2014) Time makes sense: Event Discovery in Twitter using Temporal Similarity. In: IEEE/WIC/ACM International Conference on Web Intelligence (WI-’14) (to appear). Warsaw, Poland

  • Yu X, Ma H, Hsu BJP, Han J (2014) On building entity recommender systems using user click log and freebase knowledge. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM ’14. pp 263–272. doi:10.1145/2556195.2556233

  • Zamal FA, Liu W, Ruths D (2012) Homophily and latent attribute inference: Inferring latent attributes of twitter users from neighbors. In: Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media. Dublin, Ireland, pp 387–390

  • Zhang W, Ansari S (2013) A framework for profiling and friend prediction on twitter. Master’s thesis

Author information

Corresponding author

Correspondence to Paola Velardi.

Appendix: FP-growth

Fig. 13 An FP-tree instance

A frequent-pattern tree (FP-tree) is a compact structure that stores quantitative information about frequent patterns in a database. Han et al. (2011) define the FP-tree structure as follows (Fig. 13 shows an example FP-tree instance from the same source):

  • One root labeled “null”, with a set of item-prefix subtrees as children, and a frequent-item-header table (shown on the left side of Fig. 13);

  • Each node in the item-prefix subtree consists of three fields:

    • Item-name: registers which item is represented by the node;

    • Count: the number of transactions represented by the portion of the path reaching the node;

    • Node-link: links to the next node in the FP-tree carrying the same item-name, or null if there is none.

  • Each entry in the frequent-item-header table consists of two fields:

    • Item-name: the same item-name used in the tree nodes;

    • Head of node-link: a pointer to the first node in the FP-tree carrying the item-name.

  • Additionally, the frequent-item-header table can store the support count of each item.

Algorithms 1 and 2 show how to create an FP-tree structure, and Algorithm 3 shows the steps to obtain the complete set of frequent patterns starting from a given FP-tree. Additional details can be found in Han and Pei (2000).
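As an illustration only, the structure described above can be sketched in Python. This is a minimal sketch and not a reproduction of Algorithms 1 to 3. The names `FPNode`, `build_fp_tree`, and `fp_growth` are hypothetical, and so is the choice to represent each transaction as an `(items, count)` pair and to rebuild a conditional tree at every recursion step. Items could be, for instance, the identifiers of a user's authoritative friends.

```python
from collections import defaultdict


class FPNode:
    """One node of an item-prefix subtree (field names are illustrative)."""

    def __init__(self, item, parent=None):
        self.item = item        # item-name (None for the "null" root)
        self.count = 0          # transactions sharing the path down to this node
        self.parent = parent
        self.children = {}      # item-name -> child FPNode
        self.node_link = None   # next node carrying the same item-name, or None


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from a list of (items, count) pairs.

    Returns (root, header), where header maps each frequent item to
    [support count, head of node-link chain].
    """
    # First pass: support count of every single item.
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = FPNode(None)
    header = {item: [s, None] for item, s in frequent.items()}

    # Second pass: insert each transaction, items sorted by descending support
    # (ties broken by name so that shared prefixes are merged consistently).
    for items, count in transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                # Prepend the new node to the node-link chain of its item.
                child.node_link = header[item][1]
                header[item][1] = child
            child.count += count
            node = child
    return root, header


def fp_growth(transactions, min_support, suffix=()):
    """Yield (frozenset(itemset), support) for every frequent itemset."""
    _, header = build_fp_tree(transactions, min_support)
    for item, (item_support, node) in header.items():
        pattern = (item,) + suffix
        yield frozenset(pattern), item_support
        # Conditional pattern base: the prefix path of each node carrying
        # `item`, weighted by that node's count, reached via the node-links.
        conditional = []
        while node is not None:
            path, ancestor = [], node.parent
            while ancestor is not None and ancestor.item is not None:
                path.append(ancestor.item)
                ancestor = ancestor.parent
            if path:
                conditional.append((path, node.count))
            node = node.node_link
        # Recurse on the conditional database with `pattern` as the new suffix.
        yield from fp_growth(conditional, min_support, pattern)


if __name__ == "__main__":
    # Toy transactions, invented for this example.
    toy = [(["a", "b"], 1), (["b", "c"], 1), (["a", "b", "c"], 1), (["a"], 1)]
    for itemset, count in fp_growth(toy, min_support=2):
        print(sorted(itemset), count)
```

Following the node-links from the header table is what lets the mining step collect all prefix paths of an item without scanning the whole tree. A production implementation, such as the one cited in the notes, would use more compact data structures.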

Cite this article

Faralli, S., Stilo, G. & Velardi, P. Recommendation of microblog users based on hierarchical interest profiles. Soc. Netw. Anal. Min. 5, 25 (2015). https://doi.org/10.1007/s13278-015-0264-2

  • DOI: https://doi.org/10.1007/s13278-015-0264-2
