Topic Identification of Instagram Hashtag Sets for Image Tagging: An Empirical Assessment

Giannoulakis, Stamatios; Tsapatsoulis, Nicolas

doi:10.1007/978-3-030-98876-0_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1537)

Included in the following conference series:

Research Conference on Metadata and Semantics Research

Abstract

Images are an important part of collection items in any digital library. Mining information from social media networks, and especially Instagram, for image description has recently attracted increased research interest. In the current study we extend previous work on the use of topic modelling for mining tags from Instagram hashtags for image content description. We examine whether the hashtags accompanying Instagram photos, collected via a common query hashtag (called ‘subject’ hereafter), vary in a statistically significant manner depending on the similarity of their visual content. In the experiment we use the topics mined from the hashtags of a set of Instagram images corresponding to 26 different query hashtags and classified into two categories per subject, named ‘relevant’ and ‘irrelevant’, depending on the similarity of their visual content. Two different sets of users, namely trained students and the generic crowd, assess the topics presented to them as word clouds. We investigate whether there is a significant difference between the word clouds of the images considered visually relevant to the query subject and those considered visually irrelevant. At the same time we investigate whether the word cloud interpretations of trained students and the generic crowd differ. The data collected through this empirical study are analysed using the independent samples t-test and the Pearson correlation coefficient. We conclude that the word clouds of the relevant Instagram images are much more easily interpretable by both the trained students and the crowd. The results also show some interesting variations across subjects, which are analysed and discussed in detail throughout the paper. At the same time, the interpretations of trained students and the generic crowd are highly correlated, indicating that no specific training is required to mine relevant tags from Instagram hashtags to describe the Instagram photos they accompany.
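The two statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: the score lists below are invented placeholder values, the function names `independent_t` and `pearson_r` are hypothetical, and the pooled-variance (Student) form of the t-test is an assumption, since the abstract does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's independent-samples t statistic (pooled variance).

    Returns the t statistic and the degrees of freedom.
    Assumes equal population variances, which the abstract does not confirm.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Hypothetical interpretation scores, one value per query subject.
relevant = [0.82, 0.75, 0.90, 0.68, 0.79]     # word clouds of 'relevant' images
irrelevant = [0.41, 0.55, 0.38, 0.47, 0.52]   # word clouds of 'irrelevant' images

# Hypothetical scores given to the same word clouds by the two user groups.
students = [0.80, 0.62, 0.91, 0.45, 0.70]
crowd = [0.76, 0.60, 0.88, 0.50, 0.66]

t_stat, df = independent_t(relevant, irrelevant)
r = pearson_r(students, crowd)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print(f"Pearson r = {r:.3f}")
```

A large positive t would point to the relevant word clouds being easier to interpret, and an r close to 1 would point to agreement between the two groups. Turning t into a p-value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` and `scipy.stats.pearsonr` return p-values directly.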

Author information

Authors and Affiliations

Department of Communication and Internet Studies, Cyprus University of Technology, 30, Arch. Kyprianos Street, 3036, Limassol, Cyprus
Stamatios Giannoulakis & Nicolas Tsapatsoulis

Corresponding author

Correspondence to Stamatios Giannoulakis.

Editor information

Editors and Affiliations

International Hellenic University, Thessaloniki, Greece
Emmanouel Garoufallou
Complutense University of Madrid, Madrid, Spain
María-Antonia Ovalle-Perandones
University College London, London, UK
Andreas Vlachidis

About this paper

Cite this paper

Giannoulakis, S., Tsapatsoulis, N. (2022). Topic Identification of Instagram Hashtag Sets for Image Tagging: An Empirical Assessment. In: Garoufallou, E., Ovalle-Perandones, MA., Vlachidis, A. (eds) Metadata and Semantic Research. MTSR 2021. Communications in Computer and Information Science, vol 1537. Springer, Cham. https://doi.org/10.1007/978-3-030-98876-0_14

DOI: https://doi.org/10.1007/978-3-030-98876-0_14
Published: 01 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98875-3
Online ISBN: 978-3-030-98876-0
eBook Packages: Computer Science, Computer Science (R0)
