On the impact of text similarity functions on hashtag recommendations in microblogging environments

  • Original Article
Social Network Analysis and Mining

Abstract

Microblogging applications such as Twitter are experiencing tremendous success. Microblog users employ hashtags to categorize the messages they post, with the aim of bringing order to the myriad of microblog messages. However, only a small percentage of messages incorporate hashtags, and the hashtags that are used are very heterogeneous, as they may be chosen freely and may consist of any arbitrary combination of characters. This heterogeneity and the sparse use of hashtags lead to significant drawbacks with regard to search functionality, as messages are not categorized in a homogeneous way. In this paper, we present an approach for recommending hashtags suitable for the message the user is currently entering, with the aim of creating a more homogeneous set of hashtags. Furthermore, we present a detailed study of how the similarity measures used to compute the recommendations influence the final set of recommended hashtags.
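
To make the idea of similarity-based hashtag recommendation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that recommendations are drawn from the hashtags of the previously posted messages most similar to the draft, and it uses Jaccard token overlap as one possible similarity measure. The function names (`tokenize`, `hashtags`, `jaccard`, `recommend`) and the parameters `k_messages` and `n_tags` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and drop hashtags (assumed preprocessing)."""
    return {tok for tok in text.lower().split() if not tok.startswith("#")}


def hashtags(text):
    """Return the hashtags contained in a message, lowercased."""
    return [tok for tok in text.lower().split() if tok.startswith("#")]


def jaccard(a, b):
    """Jaccard coefficient of two token sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(draft, corpus, k_messages=2, n_tags=3):
    """Rank hashtags by how often they occur in the k most similar messages."""
    draft_tokens = tokenize(draft)
    # Score every stored message against the draft, most similar first.
    scored = sorted(
        ((jaccard(draft_tokens, tokenize(msg)), msg) for msg in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for score, msg in scored[:k_messages]:
        if score > 0:  # ignore messages with no token overlap at all
            for tag in hashtags(msg):
                votes[tag] += 1
    return [tag for tag, _ in votes.most_common(n_tags)]
```

Swapping `jaccard` for another measure (for example a Dice coefficient or a term-weighted cosine similarity) changes which stored messages rank highest and therefore which hashtags are proposed, which is the kind of effect the study examines.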

Author information

Correspondence to Eva Zangerle.

About this article

Cite this article

Zangerle, E., Gassler, W. & Specht, G. On the impact of text similarity functions on hashtag recommendations in microblogging environments. Soc. Netw. Anal. Min. 3, 889–898 (2013). https://doi.org/10.1007/s13278-013-0108-x

  • DOI: https://doi.org/10.1007/s13278-013-0108-x
