Temporal Semantics: Time-Varying Hashtag Sense Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8876)

Abstract

Hashtags are creative labels used in micro-blogs to characterize the topic of a message or discussion. However, since hashtags are created spontaneously and highly dynamically by users writing in multiple languages, the same topic can be associated with different hashtags and, conversely, the same hashtag may imply different topics in different time spans. Unlike common words, sense clustering for hashtags is complicated by the fact that no sense catalogues, such as Wikipedia or WordNet, are available; furthermore, hashtag labels are often obscure. In this paper we propose a sense clustering algorithm based on temporal mining. First, hashtag time series are converted into strings of symbols using Symbolic Aggregate ApproXimation (SAX); then, hashtags are clustered based on string similarity and temporal co-occurrence. Evaluation is performed on two reference datasets of semantically tagged hashtags. We also perform a complexity evaluation of our algorithm, since efficiency is a crucial performance factor when processing large-scale data streams such as Twitter.
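
To make the first step of the abstract concrete, the following is a minimal sketch of a generic SAX conversion, written from the standard definition of SAX (z-normalisation, piecewise aggregate approximation, Gaussian breakpoints). It is not the authors' implementation: the function name `sax`, the parameters `n_segments` and `alphabet`, and the toy hourly counts are all assumptions made for illustration, and the clustering step is not shown.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abc"):
    """Convert a numeric time series into a SAX string (illustrative sketch)."""
    if len(series) % n_segments != 0:
        # Simplifying assumption: the series splits into equal-width segments.
        raise ValueError("len(series) must be divisible by n_segments")

    # 1. z-normalise, so hashtags with different volumes become comparable.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        z = [0.0] * len(series)  # a flat series carries no shape information
    else:
        z = [(x - mu) / sigma for x in series]

    # 2. Piecewise aggregate approximation: average each equal-width segment.
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]

    # 3. Breakpoints that cut the standard normal into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment mean to the symbol of the region it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# Hypothetical hourly counts for two hashtags: same shape, different volume.
tag_a = [0, 0, 0, 0, 10, 10, 0, 0]
tag_b = [0, 0, 0, 0, 100, 100, 0, 0]
print(sax(tag_a, 4, "ab"), sax(tag_b, 4, "ab"))
```

Because of the z-normalisation, the two toy series map to the same string even though their volumes differ by a factor of ten. Identical or similar strings over the same time window are the kind of signal that a clustering step based on string similarity and temporal co-occurrence could then exploit.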

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Stilo, G., Velardi, P. (2014). Temporal Semantics: Time-Varying Hashtag Sense Clustering. In: Janowicz, K., Schlobach, S., Lambrix, P., Hyvönen, E. (eds) Knowledge Engineering and Knowledge Management. EKAW 2014. Lecture Notes in Computer Science, vol 8876. Springer, Cham. https://doi.org/10.1007/978-3-319-13704-9_42

  • DOI: https://doi.org/10.1007/978-3-319-13704-9_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13703-2

  • Online ISBN: 978-3-319-13704-9

  • eBook Packages: Computer Science, Computer Science (R0)
