Mining exoticism from visual content with fusion-based deep neural networks

Ceroni, Andrea; Ma, Chenyang; Ewerth, Ralph

doi:10.1007/s13735-018-00165-4

Regular Paper
Published: 23 January 2019

International Journal of Multimedia Information Retrieval, Volume 8, pages 19–33 (2019)

Abstract

Exoticism is the charm of the unfamiliar or of something remote. It has received significant interest in various art forms. Yet although visual concept classification in images and videos for semantic multimedia retrieval has been researched for years, the visual concept of exoticism has not been investigated from a computational perspective. In this paper, we present the first approach to automatically classify images as exotic or non-exotic. We have gathered two large datasets that cover exoticism in a general as well as a concept-specific way. The datasets have been annotated through crowdsourcing. To limit the effect of cultural differences on the annotation, only North American crowdworkers were employed for this task. Two deep learning architectures for learning the concept of exoticism are evaluated. Besides deep learning features, we also investigate the usefulness of hand-crafted features, which are combined with deep features in our proposed fusion-based approach. Different machine learning models are compared with the fusion-based approach, which performs best, reaching accuracies above 83% and 91% on the two datasets. Comprehensive experimental results provide insights into which features contribute most to recognizing exoticism. The estimation of image exoticism could be applied in fields such as advertising and travel suggestions, as well as to increase the serendipity and diversity of recommendations and search results.
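The abstract does not specify how deep and hand-crafted features are combined. As a minimal sketch only, assuming a simple early-fusion scheme (normalize each feature block, then concatenate) followed by a linear classifier with a logistic output, the idea can be illustrated as follows. The function names, weights, and threshold are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit Euclidean length; zero vectors stay zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [v / norm for v in vec]


def fuse_features(deep: Sequence[float], handcrafted: Sequence[float]) -> List[float]:
    """Early fusion (an assumption, not the paper's stated method):
    normalize each block separately so neither dominates, then concatenate."""
    return l2_normalize(deep) + l2_normalize(handcrafted)


def predict_exotic(fused: Sequence[float], weights: Sequence[float],
                   bias: float = 0.0, threshold: float = 0.5) -> bool:
    """Hypothetical linear classifier: logistic score compared to a threshold."""
    score = sum(w * x for w, x in zip(weights, fused)) + bias
    prob = 1.0 / (1.0 + math.exp(-score))
    return prob >= threshold


# Toy vectors standing in for CNN activations and hand-crafted descriptors.
deep_features = [3.0, 4.0]
handcrafted_features = [0.0, 2.0]
fused = fuse_features(deep_features, handcrafted_features)
is_exotic = predict_exotic(fused, weights=[1.0, 1.0, 1.0, 1.0])
```

Normalizing each block before concatenation is one common design choice when the two feature types live on different scales; a trained model would learn `weights` and `bias` from the annotated datasets rather than use fixed values.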


Notes

1. https://github.com/xander7/MiningExoticism.
2. See footnote 1.


Author information

Authors and Affiliations

L3S Research Center, Leibniz Universität Hannover, Hannover, Germany
Andrea Ceroni, Chenyang Ma & Ralph Ewerth
Visual Analytics Research Group, Leibniz Information Centre for Science and Technology (TIB), Hannover, Germany
Ralph Ewerth

Corresponding author

Correspondence to Ralph Ewerth.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ceroni, A., Ma, C. & Ewerth, R. Mining exoticism from visual content with fusion-based deep neural networks. Int J Multimed Info Retr 8, 19–33 (2019). https://doi.org/10.1007/s13735-018-00165-4

Received: 16 September 2018
Revised: 10 December 2018
Accepted: 15 December 2018
Published: 23 January 2019
Issue Date: 07 March 2019
DOI: https://doi.org/10.1007/s13735-018-00165-4
