Abstract
Deep neural networks (DNNs) are widely used for image classification. A major hurdle for deep learning approaches is their need for large sets of labeled data, which can be prohibitively costly to obtain, particularly for architectural style classification. Data augmentation can reduce this labeling effort. In this paper, we use data augmentation to enlarge architectural style datasets. To extract building elements, the input images are first preprocessed with a Deformable Part Model (DPM); the preprocessed images are then augmented to increase the number of images. Next, we design a deep neural network based on GoogLeNet that aims to learn robust feature embeddings, which are then used to classify architectural style. Experimental results show that our approach achieves promising performance and outperforms previous methods.
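The augmentation step can be illustrated with a minimal sketch. The abstract does not say which transformations are used, so the horizontal flips and random crops below are assumptions chosen because they are common label-preserving operations for building photographs. The function names (`horizontal_flip`, `crop`, `augment`) and the toy image are hypothetical and are not taken from the paper.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image, crop_size, n_crops, seed=0):
    """Return the original, its mirror, and n_crops random square crops with their mirrors.

    Assumed transformations only; the paper's actual augmentation may differ.
    """
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    variants = [image, horizontal_flip(image)]
    for _ in range(n_crops):
        top = rng.randint(0, rows - crop_size)
        left = rng.randint(0, cols - crop_size)
        window = crop(image, top, left, crop_size, crop_size)
        variants.append(window)
        variants.append(horizontal_flip(window))
    return variants


# Toy 4x4 single-channel "image" standing in for a preprocessed building photo.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
augmented = augment(image, crop_size=3, n_crops=2)
print(len(augmented))  # 6: original + mirror + 2 crops + 2 mirrored crops
```

Each labeled image here yields six training samples that share its style label, which is how augmentation enlarges a dataset without extra labeling.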
Notes
1. The first author is a student.
Acknowledgments
This work was jointly supported by the National Key R&D Program of China under Grant No. 2018YFC0807500, the National Key Research and Development Program of China No. 238, the National Natural Science Foundation of China under Grant Nos. 61772396, 61472302, and 61772392, the Fundamental Research Funds for the Central Universities under Grant No. JBF180301, and the Xi’an Key Laboratory of Big Data and Intelligent Vision under Grant No. 201805053ZD4CG37.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, P., Miao, Q., Liu, R., Song, J. (2019). Architectural Style Classification Based on DNN Model. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science, vol 11857. Springer, Cham. https://doi.org/10.1007/978-3-030-31654-9_43
DOI: https://doi.org/10.1007/978-3-030-31654-9_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31653-2
Online ISBN: 978-3-030-31654-9
eBook Packages: Computer Science, Computer Science (R0)