On the impact of text similarity functions on hashtag recommendations in microblogging environments

Zangerle, Eva; Gassler, Wolfgang; Specht, Günther

doi:10.1007/s13278-013-0108-x

Original Article
Published: 26 March 2013

Social Network Analysis and Mining, Volume 3, pages 889–898 (2013)

Eva Zangerle¹,
Wolfgang Gassler¹ &
Günther Specht¹

Abstract

Microblogging applications such as Twitter are experiencing tremendous success. Microblog users employ hashtags to categorize their messages, with the aim of bringing order to the vast number of microblog messages. However, only a small percentage of messages incorporate hashtags, and the hashtags in use are very heterogeneous, since they may be chosen freely and may consist of any arbitrary combination of characters. This heterogeneity, together with the sparse use of hashtags, significantly impairs search functionality, because messages are not categorized in a homogeneous way. In this paper, we present an approach for recommending hashtags suited to the message the user is currently entering, with the aim of creating a more homogeneous set of hashtags. Furthermore, we present a detailed study of how the similarity measures used to compute recommendations influence the final set of recommended hashtags.
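To make the idea of similarity-driven hashtag recommendation concrete, the following is a minimal sketch and not the authors' implementation. It assumes messages are compared as sets of lowercased words and that each candidate hashtag is scored by the most similar stored message that contains it. The function names (`tokens`, `jaccard`, `dice`, `recommend`), the ranking rule, and the tiny corpus are all hypothetical and chosen only for illustration. Jaccard and Dice stand in here as two interchangeable set-based similarity measures.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#\w+")
TOKEN = re.compile(r"\w+")


def tokens(text):
    """Return the lowercased word set of a message with its hashtags removed."""
    return set(TOKEN.findall(HASHTAG.sub(" ", text.lower())))


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient 2 * |A & B| / (|A| + |B|) of two token sets."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def recommend(draft, corpus, similarity=jaccard, k=5):
    """Rank hashtags of stored messages by their best similarity to the draft.

    Assumption: a hashtag inherits the highest score of any message using it.
    """
    draft_tokens = tokens(draft)
    best = defaultdict(float)
    for message in corpus:
        score = similarity(draft_tokens, tokens(message))
        if score == 0.0:
            continue  # messages sharing no words contribute no candidates
        for tag in HASHTAG.findall(message.lower()):
            best[tag] = max(best[tag], score)
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


# Invented example data, for illustration only.
corpus = [
    "Great keynote on recommender systems today #recsys #research",
    "Coffee before the keynote #coffee",
    "Rainy morning in Innsbruck #weather",
]
draft = "Looking forward to the keynote on recommender systems"

print(recommend(draft, corpus, similarity=jaccard))
print(recommend(draft, corpus, similarity=dice))
```

Passing the similarity function as a parameter keeps the ranking logic fixed while the measure is swapped, which is one simple way to observe how the choice of measure changes the recommended set.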

Author information

Authors and Affiliations

Databases and Information Systems, Institute of Computer Science, University of Innsbruck, Innsbruck, Austria
Eva Zangerle, Wolfgang Gassler & Günther Specht

Corresponding author

Correspondence to Eva Zangerle.

About this article

Cite this article

Zangerle, E., Gassler, W. & Specht, G. On the impact of text similarity functions on hashtag recommendations in microblogging environments. Soc. Netw. Anal. Min. 3, 889–898 (2013). https://doi.org/10.1007/s13278-013-0108-x

Received: 22 May 2012
Revised: 03 February 2013
Accepted: 13 March 2013
Published: 26 March 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s13278-013-0108-x
