Abstract
With more than one billion monthly active users and nearly 100 million photos shared daily, Instagram has become one of the richest sources of information for detecting users’ interests and trends. However, research on this social network remains limited compared to its competitors, e.g., Facebook and Twitter. A major obstacle for researchers is the lack of a publicly available labeled dataset that summarizes the content of Instagram profiles. To overcome this issue, we present, for the first time, an annotated multidomain interests dataset for training and testing interest classifiers for users of online social networks (OSNs), together with the methodology used to create this dataset from Instagram profiles. In addition, we propose an approach for the automatic detection and classification of Instagram users’ interests. We rely on word embeddings and deep learning techniques to introduce two approaches: (i) a feature-based method and (ii) fine-tuning the BERT model. We observed that BERT fine-tuning performed much better than the feature-based method.
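To make the feature-based idea concrete, the following minimal sketch averages word vectors into a single profile vector and assigns the closest interest label. It is an illustration under stated assumptions, not the method of the paper: the tiny 3-dimensional `EMBEDDINGS` table, the labels, the training examples, and the nearest-centroid classifier (used here in place of a deep learning classifier) are all hypothetical. BERT fine-tuning is not shown.

```python
import math

# Hypothetical 3-dimensional word vectors. A real system would load
# pretrained embeddings instead of this toy table.
EMBEDDINGS = {
    "pizza":  [0.9, 0.1, 0.0],
    "recipe": [0.8, 0.2, 0.1],
    "goal":   [0.1, 0.9, 0.0],
    "match":  [0.0, 0.8, 0.2],
    "runway": [0.1, 0.0, 0.9],
    "outfit": [0.0, 0.2, 0.8],
}
DIM = 3

def embed(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0

# Hypothetical labeled profiles: (tokens from a profile, interest label).
TRAIN = [
    (["pizza", "recipe"], "food"),
    (["goal", "match"], "sports"),
    (["runway", "outfit"], "fashion"),
]

def fit_centroids(examples):
    """Compute one mean profile vector (centroid) per interest label."""
    grouped = {}
    for tokens, label in examples:
        grouped.setdefault(label, []).append(embed(tokens))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(tokens, centroids):
    """Return the label whose centroid is most similar to the profile vector."""
    profile = embed(tokens)
    return max(centroids, key=lambda label: cosine(profile, centroids[label]))

CENTROIDS = fit_centroids(TRAIN)
```

Calling `predict(["pizza", "unknownword"], CENTROIDS)` skips the unknown token and compares the remaining vector against each label centroid. Swapping the centroid step for a trained neural classifier would keep the same feature pipeline.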
Notes
- 4. https://interestexplorer.io/facebook-interests-list/. Last accessed March 2022.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hamdi, S., Hamdi, A., Ben Yahia, S. (2022). BERT and Word Embedding for Interest Mining of Instagram Users. In: Bădică, C., Treur, J., Benslimane, D., Hnatkowska, B., Krótkiewicz, M. (eds) Advances in Computational Collective Intelligence. ICCCI 2022. Communications in Computer and Information Science, vol 1653. Springer, Cham. https://doi.org/10.1007/978-3-031-16210-7_10
DOI: https://doi.org/10.1007/978-3-031-16210-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16209-1
Online ISBN: 978-3-031-16210-7
eBook Packages: Computer Science, Computer Science (R0)