
Mining exoticism from visual content with fusion-based deep neural networks

  • Regular Paper
  • Published in: International Journal of Multimedia Information Retrieval

Abstract

Exoticism is the charm of the unfamiliar or the remote. It has received considerable attention in different kinds of art, but although visual concept classification in images and videos for semantic multimedia retrieval has been researched for years, the visual concept of exoticism has not yet been investigated from a computational perspective. In this paper, we present the first approach to automatically classify images as exotic or non-exotic. We have gathered two large datasets that cover exoticism both in a general and in a concept-specific way, and annotated them via crowdsourcing. To limit the influence of cultural differences on the annotations, only North American crowdworkers were employed for this task. Two deep learning architectures for learning the concept of exoticism are evaluated. Besides deep learning features, we also investigate the usefulness of hand-crafted features, which are combined with deep features in our proposed fusion-based approach. Several machine learning models are compared with the fusion-based approach, which performs best, reaching accuracies of over 83% and 91% on the two datasets. Comprehensive experimental results provide insights into which features contribute most to recognizing exoticism. The estimation of image exoticism could be applied in fields such as advertising and travel suggestions, as well as to increase the serendipity and diversity of recommendations and search results.
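
The abstract does not detail the fusion pipeline, so the following is only a minimal sketch of the general idea: concatenating deep CNN features with simple hand-crafted image features and training a conventional classifier on the fused vector. The use of a pretrained ResNet-50 as deep feature extractor, an HSV colour histogram as the hand-crafted descriptor, and an SVM as classifier are illustrative assumptions, not the authors' actual architecture (see the repository linked in the Notes below).

    # Hypothetical sketch of fusion-based exotic/non-exotic classification:
    # deep CNN features + hand-crafted colour features -> one fused vector
    # -> conventional classifier. The feature extractor, descriptor, and
    # classifier are illustrative choices, not the paper's exact pipeline.
    import numpy as np
    import torch
    from PIL import Image
    from torchvision import models, transforms  # requires torchvision >= 0.13
    from sklearn.svm import SVC

    # Pretrained CNN used as a fixed feature extractor (penultimate layer).
    cnn = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    cnn.fc = torch.nn.Identity()  # drop the final ImageNet classification layer
    cnn.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    def deep_features(img: Image.Image) -> np.ndarray:
        """2048-dim descriptor taken from the CNN's penultimate layer."""
        with torch.no_grad():
            x = preprocess(img).unsqueeze(0)   # shape (1, 3, 224, 224)
            return cnn(x).squeeze(0).numpy()

    def handcrafted_features(img: Image.Image) -> np.ndarray:
        """Normalized 8x4x4 HSV colour histogram (128-dim), a stand-in for
        the hand-crafted colour/texture descriptors mentioned in the paper."""
        hsv = np.asarray(img.convert("HSV")).reshape(-1, 3)
        hist, _ = np.histogramdd(hsv, bins=(8, 4, 4),
                                 range=((0, 256), (0, 256), (0, 256)))
        hist = hist.flatten()
        return hist / (hist.sum() + 1e-8)

    def fused_features(path: str) -> np.ndarray:
        """Early fusion: concatenate deep and hand-crafted feature vectors."""
        img = Image.open(path).convert("RGB")
        return np.concatenate([deep_features(img), handcrafted_features(img)])

    def train_classifier(samples):
        """samples: list of (image_path, label), label 1 = exotic, 0 = non-exotic."""
        X = np.stack([fused_features(p) for p, _ in samples])
        y = np.array([label for _, label in samples])
        return SVC(kernel="rbf", C=1.0).fit(X, y)

Note that, as the title indicates, the paper realizes the fusion within deep neural networks; the sketch uses an external SVM purely to keep the example short and self-contained.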

Notes

  1. https://github.com/xander7/MiningExoticism.

Corresponding author

Correspondence to Ralph Ewerth.

About this article

Cite this article

Ceroni, A., Ma, C. & Ewerth, R. Mining exoticism from visual content with fusion-based deep neural networks. Int J Multimed Info Retr 8, 19–33 (2019). https://doi.org/10.1007/s13735-018-00165-4
