Hashtag Recommendation Using Word Sequences’ Embeddings

  • Conference paper
Big Data, Cloud and Applications (BDCA 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 872)

Abstract

Nowadays, billions of people use social networks such as Twitter. Twitter users create and use hashtags in their tweets to classify them by topic or theme. Hashtags have evolved into a multifaceted instrument for tagging and tracking content, emphasising a standpoint, or galvanising communal support across posts published on social networks. However, because hashtags can be created freely, users have great difficulty choosing suitable hashtags for their posts. In this paper, we introduce an approach for hashtag recommendation on Twitter based on tweet embeddings. We first use multiple techniques to compute embeddings of the tweets in the corpus. Next, we use the k-means clustering algorithm to divide the heterogeneous tweets into clusters of similar tweets. Afterwards, we compute the similarity between the embedding of the entered tweet and the centroid embedding of each cluster in order to recommend the most appropriate hashtags to the user. Through a range of experiments, we present a detailed study of how the techniques used for tweet embeddings influence the final set of recommended hashtags.
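
The pipeline described in the abstract (embed tweets, cluster them with k-means, match an incoming tweet to the nearest centroid, recommend hashtags) can be sketched roughly as follows. The abstract does not say which embedding techniques, similarity measure, number of clusters, or hashtag-ranking rule the authors use, so averaged word vectors, cosine similarity, k = 2, and ranking by frequency are assumptions made here for illustration. The word vectors, the corpus, and all function names are invented toy values, not the paper's data or code.

```python
import math
import random
from collections import Counter

# Toy 2-d word vectors: hypothetical stand-ins for pretrained word embeddings.
WORD_VECS = {
    "goal": [1.0, 0.1], "match": [0.9, 0.2], "team": [0.8, 0.0],
    "vote": [0.1, 1.0], "election": [0.0, 0.9], "poll": [0.2, 0.8],
}
DIM = 2

# Toy corpus of (tweet text, hashtags) pairs, invented for illustration.
CORPUS = [
    ("goal match", ["#football"]),
    ("team match", ["#football", "#sports"]),
    ("team goal", ["#football"]),
    ("vote election", ["#politics"]),
    ("election poll", ["#politics", "#vote"]),
    ("vote poll", ["#politics"]),
]

def embed(tweet):
    """Embed a tweet as the mean of its known word vectors (assumed technique)."""
    vecs = [WORD_VECS[w] for w in tweet.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (squared Euclidean)."""
    def sqdist(p, c):
        return sum((x - y) ** 2 for x, y in zip(p, c))
    return [min(range(len(centroids)), key=lambda i: sqdist(p, centroids[i]))
            for p in points]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means with seeded random initialisation."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster empties
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)

def recommend(tweet, centroids, labels, top_n=2):
    """Return the most frequent hashtags of the cluster nearest to the tweet."""
    query = embed(tweet)
    if not any(query):
        return []  # no known words, so there is nothing to compare
    best = max(range(len(centroids)), key=lambda i: cosine(query, centroids[i]))
    counts = Counter(tag for (_, tags), lab in zip(CORPUS, labels)
                     if lab == best for tag in tags)
    return [tag for tag, _ in counts.most_common(top_n)]

points = [embed(text) for text, _ in CORPUS]
centroids, labels = kmeans(points, k=2)
print(recommend("big match today", centroids, labels))  # ['#football', '#sports']
```

In a realistic setting the toy vectors would be replaced by learned embeddings and the hand-written k-means by a library implementation; the choice of embedding technique is the variable the paper says it studies.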

Author information

Correspondence to Nada Ben-Lhachemi or El Habib Nfaoui.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Ben-Lhachemi, N., Nfaoui, E.H. (2018). Hashtag Recommendation Using Word Sequences’ Embeddings. In: Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N. (eds) Big Data, Cloud and Applications. BDCA 2018. Communications in Computer and Information Science, vol 872. Springer, Cham. https://doi.org/10.1007/978-3-319-96292-4_11

  • DOI: https://doi.org/10.1007/978-3-319-96292-4_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-96291-7

  • Online ISBN: 978-3-319-96292-4

  • eBook Packages: Computer Science, Computer Science (R0)
