Abstract
Images are an important part of collection items in any digital library. Mining information from social media networks, and especially Instagram, for image description has recently gained increased research interest. In the current study we extend previous work on the use of topic modelling for mining tags from Instagram hashtags for image content description. We examine whether the hashtags accompanying Instagram photos, collected via a common query hashtag (called ‘subject’ hereafter), vary in a statistically significant manner depending on the similarity of their visual content. In the experiment we use the topics mined from the hashtags of a set of Instagram images corresponding to 26 different query hashtags. The images are classified into two categories per subject, named ‘relevant’ and ‘irrelevant’, depending on the similarity of their visual content. Two different sets of users, namely trained students and the generic crowd, assess the topics, which are presented to them as word clouds. The aim is to investigate whether there is a significant difference between the word clouds of the images considered visually relevant to the query subject and those of the images considered visually irrelevant. At the same time we investigate whether the word cloud interpretations of trained students and the generic crowd differ. The data collected through this empirical study are analysed using the independent-samples t-test and the Pearson correlation coefficient. We conclude that the word clouds of the relevant Instagram images are much more easily interpretable by both the trained students and the crowd. The results also show some interesting variations across subjects, which are analysed and discussed in detail throughout the paper. The interpretations of trained students and the generic crowd are also highly correlated, indicating that no specific training is required to mine relevant tags from Instagram hashtags to describe the accompanying Instagram photos.
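The two statistical tools named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and computes the pooled-variance independent-samples t statistic and the Pearson correlation coefficient. The function names and all scores below are hypothetical and are not data or code from the study; a p-value would additionally require the t-distribution, for example from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's independent-samples t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance of the two groups
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Hypothetical interpretation scores per subject (illustrative values only)
relevant_scores = [0.82, 0.75, 0.90, 0.68, 0.79]
irrelevant_scores = [0.41, 0.55, 0.38, 0.60, 0.47]
student_scores = [0.82, 0.41, 0.90, 0.60, 0.79]
crowd_scores = [0.78, 0.45, 0.85, 0.52, 0.80]

t_stat, dof = independent_t(relevant_scores, irrelevant_scores)
r = pearson_r(student_scores, crowd_scores)
print(f"t = {t_stat:.3f} (df = {dof}), r = {r:.3f}")
```

In this sketch the t statistic compares the two word cloud categories, while the correlation compares the two groups of assessors over the same items; both roles are assumptions about how the tests map onto the design described above.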
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Giannoulakis, S., Tsapatsoulis, N. (2022). Topic Identification of Instagram Hashtag Sets for Image Tagging: An Empirical Assessment. In: Garoufallou, E., Ovalle-Perandones, MA., Vlachidis, A. (eds) Metadata and Semantic Research. MTSR 2021. Communications in Computer and Information Science, vol 1537. Springer, Cham. https://doi.org/10.1007/978-3-030-98876-0_14
DOI: https://doi.org/10.1007/978-3-030-98876-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98875-3
Online ISBN: 978-3-030-98876-0
eBook Packages: Computer Science (R0)