Abstract
Scene classification has long attracted attention in computer vision because it is an important problem. Tourism scene classification, however, has received little attention in the field. In this paper, we introduce a new scenic-spot-centric database called tourism scene, which covers 25 tourism scenic areas with 750 tourism scene categories and about 440 thousand labeled images. For tourism scene classification, we propose a multi-stage transfer learning model that exploits the hierarchical structure of the categories and uses convolutional neural networks (e.g., AlexNet) as its basic building block. To demonstrate the effectiveness of the proposed model, we also build a baseline model and a one-stage transfer learning model for comparison. The results show that the proposed framework achieves a new level of performance on this task.
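The abstract does not spell out how the stages use the category hierarchy, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a two-level hierarchy (scenic area, then scene category) and one common way to pass knowledge between stages: the classifier row of each fine category is initialised from the row learned for its parent scenic area. The hierarchy entries, the function names `build_label_maps` and `init_fine_head`, and the toy weights are all hypothetical.

```python
import random

# Hypothetical two-level hierarchy: scenic area -> scene categories.
# The names are made up; the database in the abstract has 25 areas
# and 750 categories.
HIERARCHY = {
    "area_a": ["area_a/gate", "area_a/lake"],
    "area_b": ["area_b/temple", "area_b/peak", "area_b/trail"],
}


def build_label_maps(hierarchy):
    """Return (coarse_ids, fine_ids, fine_to_coarse) index tables."""
    coarse_ids = {area: i for i, area in enumerate(hierarchy)}
    fine_ids = {}
    fine_to_coarse = {}
    for area, scenes in hierarchy.items():
        for scene in scenes:
            fine_ids[scene] = len(fine_ids)
            fine_to_coarse[fine_ids[scene]] = coarse_ids[area]
    return coarse_ids, fine_ids, fine_to_coarse


def init_fine_head(coarse_head, fine_to_coarse, noise=0.01, seed=0):
    """Initialise each fine-category weight row from its parent area's row.

    coarse_head holds one weight row per scenic area (assumed to come from
    an earlier training stage). A small random perturbation keeps sibling
    categories from starting with identical weights.
    """
    rng = random.Random(seed)
    fine_head = []
    for fine_id in range(len(fine_to_coarse)):
        parent_row = coarse_head[fine_to_coarse[fine_id]]
        fine_head.append([w + rng.uniform(-noise, noise) for w in parent_row])
    return fine_head


coarse_ids, fine_ids, fine_to_coarse = build_label_maps(HIERARCHY)
# Toy stand-in for area-level classifier weights (feature dimension 3).
coarse_head = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fine_head = init_fine_head(coarse_head, fine_to_coarse)
print(len(fine_head), fine_to_coarse)
```

In a full pipeline the weight rows would sit on top of CNN features (for example from AlexNet), and each stage would fine-tune the network on its own label set. The sketch only shows the bookkeeping that links the coarse and fine label spaces.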





Acknowledgements
This research was partly supported by the National Natural Science Foundation of China (U1611461, 61672241 and 61528204), the Cultivation Project of Major Basic Research of NSF-Guangdong Province (2016A030308013), Guangdong Key Research Base of Technology and Finance (2014B030303005) and Guangdong Provincial Key Laboratory of Technology and Finance & Big Data Analysis (2017B030301010). Yong Xu is also a visiting researcher with Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen 518060, China.
Ethics declarations
Conflict of interest
Tangquan Qi, Yong Xu and Haibin Ling state that there are no conflicts of interest.
Cite this article
Qi, T., Xu, Y. & Ling, H. Tourism scene classification based on multi-stage transfer learning model. Neural Comput & Applic 31, 4341–4352 (2019). https://doi.org/10.1007/s00521-018-3351-2