Multi-interest User Profiling in Short Text Microblogs

  • Conference paper
Designing for Digital Transformation. Co-Creating Services with Citizens and Industry (DESRIST 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12388)

Abstract

Discourse on short text platforms like Twitter shapes the design of the underlying knowledge-based recommendation engines. The resulting recommendations are driven by user connections, modeled as nodes in a social network, as well as by shared interests. Twitter provides a complex mesh of user interest levels, in which users consume particular topical content to a lesser or greater extent. This consumption of content is usually considered a defining factor in the curation of their online identity. Our aim in this paper is to quantify the multiple interests of users based on the tweets they disseminate. We do this by (i) representing all tweets as vectors for computation, (ii) generating cluster centroids representative of the topics of interest, (iii) computing a responsibility matrix that depicts users’ interest levels in those topics, and (iv) aggregating intra-user interest levels to define each user’s multi-topic affinities. We use a Twitter dataset geolocated to Kenya to validate users’ intra-topical interests. Our experimental results demonstrate the effectiveness of our approach in capturing users’ multi-interests and, in turn, generating their multi-topic interest profiles.
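
The four steps (i) to (iv) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the embedding model, the clustering algorithm, or how responsibilities are computed, so the sketch assumes toy 2-D vectors in place of real tweet embeddings, plain k-means for the centroids, and a softmax over negative squared distances as the responsibility score. All function names, the `temperature` parameter, and the data are hypothetical. Only the Python standard library is used.

```python
import math
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Step (ii): plain Lloyd's k-means. Seeding with the first k vectors
    is a toy choice made only to keep this sketch deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            groups[nearest].append(v)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def responsibilities(vectors, centroids, temperature=1.0):
    """Step (iii): one row per tweet, one column per topic. The softmax
    over negative squared distances is an assumed stand-in for the
    paper's responsibility computation."""
    rows = []
    for v in vectors:
        scores = [-sq_dist(v, c) / temperature for c in centroids]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def user_profiles(user_ids, resp):
    """Step (iv): average each user's tweet-level rows into one profile."""
    by_user = defaultdict(list)
    for uid, row in zip(user_ids, resp):
        by_user[uid].append(row)
    return {uid: [sum(col) / len(rows) for col in zip(*rows)]
            for uid, rows in by_user.items()}

# Step (i): hypothetical tweet vectors (real ones would come from an
# embedding model), each paired with the user who posted the tweet.
user_ids = ["alice", "alice", "bob", "bob", "bob"]
vectors = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.1, 4.9], [4.9, 5.0]]

centroids = kmeans(vectors, k=2)
resp = responsibilities(vectors, centroids)
profiles = user_profiles(user_ids, resp)

for uid, profile in profiles.items():
    print(uid, [round(p, 3) for p in profile])
```

In this toy data, one user splits tweets evenly between the two topics while the other leans toward the second, so the resulting profiles differ accordingly. Averaging is only one possible aggregation; weighting by recency or tweet count would be an equally plausible choice under the same assumptions.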

Author information

Corresponding author

Correspondence to Herman Wandabwa.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wandabwa, H., Naeem, M.A., Mirza, F., Pears, R., Nguyen, A. (2020). Multi-interest User Profiling in Short Text Microblogs. In: Hofmann, S., Müller, O., Rossi, M. (eds) Designing for Digital Transformation. Co-Creating Services with Citizens and Industry. DESRIST 2020. Lecture Notes in Computer Science, vol 12388. Springer, Cham. https://doi.org/10.1007/978-3-030-64823-7_15

  • DOI: https://doi.org/10.1007/978-3-030-64823-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64822-0

  • Online ISBN: 978-3-030-64823-7

  • eBook Packages: Computer Science, Computer Science (R0)
