Abstract
Discourse on short-text platforms such as Twitter shapes the design of the underlying knowledge-based recommendation engines. The resulting recommendations are driven by users’ connections as nodes in a social network as well as by their shared interests. Twitter presents a complex mesh of user interest levels, in which users consume content on particular topics to a lesser or greater extent. This consumption of content is usually considered a defining factor in the curation of their online identity. Our aim in this paper is to quantify the multiple interests of users based on the tweets they disseminate. We do this by (i) representing all tweets as vectors for computation, (ii) generating cluster centroids representative of the topics of interest, (iii) computing a responsibility matrix to depict interest levels in the topics, and (iv) aggregating intra-user interest levels to define each user’s multi-topic affinities. We use a Twitter dataset geolocated to Kenya to validate users’ intra-topical interests. Our experimental results demonstrate the effectiveness of our approach in capturing users’ multi-interests and, in turn, generating their multi-topic interest profiles.
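The four steps listed in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything below is an assumption made for illustration: the tweet vectors are toy 2-D points rather than real embeddings, the centroids come from a plain Lloyd's k-means, the responsibility matrix is a softmax over negative squared distances to the centroids, and a user's profile is the mean of the responsibility rows of that user's tweets. The function names (`kmeans`, `responsibilities`, `user_profiles`) and the `temperature` parameter are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    # Start from k distinct tweet vectors (fancy indexing returns a copy).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every tweet vector to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def responsibilities(X, centroids, temperature=1.0):
    """Soft assignment of each tweet to each topic; every row sums to 1.

    Assumption: a softmax over negative squared distances stands in for
    whatever responsibility computation the paper actually uses.
    """
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def user_profiles(R, user_ids):
    """Average each user's tweet-level responsibilities into one profile."""
    ids = np.asarray(user_ids)
    return {u: R[ids == u].mean(axis=0) for u in sorted(set(user_ids))}


# (i) Toy tweet vectors: three near the origin, three near (5, 5).
X = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
)
# Hypothetical authors of the six tweets above.
users = ["a", "a", "b", "a", "b", "b"]

centroids = kmeans(X, k=2)            # (ii) topic centroids
R = responsibilities(X, centroids)    # (iii) tweet-by-topic responsibility matrix
profiles = user_profiles(R, users)    # (iv) per-user multi-topic affinities

for user, profile in profiles.items():
    print(user, np.round(profile, 3))
```

Each profile is a distribution over topics, so a user who tweets on two topics keeps weight on both instead of being forced into a single label. A Gaussian mixture fitted with EM would be another way to obtain the responsibility matrix; the softmax here is only a stand-in.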
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wandabwa, H., Naeem, M.A., Mirza, F., Pears, R., Nguyen, A. (2020). Multi-interest User Profiling in Short Text Microblogs. In: Hofmann, S., Müller, O., Rossi, M. (eds) Designing for Digital Transformation. Co-Creating Services with Citizens and Industry. DESRIST 2020. Lecture Notes in Computer Science, vol 12388. Springer, Cham. https://doi.org/10.1007/978-3-030-64823-7_15
DOI: https://doi.org/10.1007/978-3-030-64823-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64822-0
Online ISBN: 978-3-030-64823-7
eBook Packages: Computer Science, Computer Science (R0)