Tile2Vec with Predicting Noise for Land Cover Classification

Sinaga, Marshal Arijona; Ali, Fadel Muhammad; Arymurthy, Aniati Murni

doi:10.1007/978-3-030-92273-3_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13111)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Tile2Vec has proven to be an effective representation learning model in remote sensing. Its success depends on l2-norm regularization. However, l2-norm regularization has a major drawback that undermines the regularization. We propose replacing the l2-norm regularization with the predicting-noise framework, and we develop an algorithm that integrates this framework into Tile2Vec. We evaluate the model by using it as a feature extractor for the land cover classification task. The results show that the proposed model outperforms all baseline models.
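
As a rough illustration of the predicting-noise idea (Bojanowski and Joulin, "Unsupervised learning by predicting noise"), the sketch below samples fixed target vectors on the unit sphere and matches a small batch of features to them so that the mean squared distance is minimal. It is not the authors' algorithm: the function names, the brute-force search over permutations (practical only for tiny batches), and the toy data are all assumptions made for illustration.

```python
import itertools
import math
import random


def sample_unit_targets(n, dim, seed=0):
    """Draw n fixed noise targets uniformly on the unit sphere (hypothetical helper)."""
    rng = random.Random(seed)
    targets = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        targets.append([x / norm for x in v])
    return targets


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_assignment(features, targets):
    """Match each feature to one distinct target, minimizing the mean squared distance.

    Assumption: exhaustive search over permutations, which is only feasible
    for very small batches; a real implementation would use a proper
    assignment solver on each mini-batch.
    """
    n = len(features)
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(sq_dist(features[i], targets[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost / n


# Toy usage: features that are a shuffled copy of the targets.
targets = sample_unit_targets(4, 3)
features = [targets[2], targets[0], targets[3], targets[1]]
assignment, loss = best_assignment(features, targets)
```

In this toy case the search recovers the shuffle and the loss is zero. During training, the loss would instead be minimized with respect to the encoder so that features move toward their assigned targets.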

Author information

Authors and Affiliations

Faculty of Computer Science, University of Indonesia, Depok, Indonesia
Marshal Arijona Sinaga, Fadel Muhammad Ali & Aniati Murni Arymurthy

Corresponding author

Correspondence to Marshal Arijona Sinaga.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

Appendices

A Model’s Architecture

The model adopts the ResNet-18 architecture with slight modifications. Except for the first row, each row describes a residual block with the stated number of kernels. All blocks use a padding of 1.

Encoder	Note
Conv(kernels=64, size=3, stride=1), Batch Norm, ReLU	1 Block
Conv(kernels=64, size=3, stride=1), Batch Norm, ReLU	2 Blocks
Conv(kernels=64, size=3, stride=1), Batch Norm, ReLU	2 Blocks
Conv(kernels=128, size=3, stride=2), Batch Norm, ReLU	2 Blocks
Conv(kernels=128, size=3, stride=1), Batch Norm, ReLU	2 Blocks
Conv(kernels=256, size=3, stride=2), Batch Norm, ReLU	2 Blocks
Conv(kernels=256, size=3, stride=1), Batch Norm, ReLU	2 Blocks
Conv(kernels=512, size=3, stride=2), Batch Norm, ReLU	2 Blocks
Conv(kernels=512, size=3, stride=1), Batch Norm, ReLU	2 Blocks
Conv(kernels=z, size=3, stride=2), Batch Norm, ReLU	2 Blocks
Conv(kernels=z, size=3, stride=1), Batch Norm, ReLU	2 Blocks
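
To make the effect of the strides concrete, the following sketch traces the spatial size of a feature map through the rows of the table, using the standard convolution output formula with kernel size 3 and padding 1. It assumes that each row applies its stride exactly once, which the table does not state explicitly, and the input size of 64 is only an illustrative choice. The names `conv_out`, `ROW_STRIDES`, and `trace_sizes` are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Strides of the eleven table rows, top to bottom.
# Assumption: each row downsamples at most once.
ROW_STRIDES = [1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1]


def trace_sizes(input_size):
    """Return the spatial size before the first row and after every row."""
    sizes = [input_size]
    for stride in ROW_STRIDES:
        sizes.append(conv_out(sizes[-1], stride=stride))
    return sizes


print(trace_sizes(64))
```

Under these assumptions, stride-1 rows preserve the spatial size and each stride-2 row roughly halves it, so the four stride-2 rows reduce a 64 x 64 input to a 4 x 4 map with z channels.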

B Hyperparameters

Parameter	Value
Learning rate	0.02
\(\alpha \)	1.0
Representation dimension	512
m	0.1
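
For orientation, the sketch below shows how a margin such as m = 0.1 enters the triplet loss used by the original Tile2Vec model, which pulls an anchor tile toward a neighboring tile and pushes it away from a distant one. It assumes that m in the table is this triplet margin. The role of \(\alpha\) is not described in this excerpt, so it is left out, and the regularization term is omitted as well. The function names are hypothetical.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, neighbor, distant, margin=0.1):
    """Hinge-style triplet loss: max(0, d(a, n) - d(a, d) + margin).

    Assumption: margin corresponds to the hyperparameter m = 0.1.
    """
    return max(
        0.0,
        l2_distance(anchor, neighbor) - l2_distance(anchor, distant) + margin,
    )


# The loss is zero once the distant tile is at least `margin` farther away
# from the anchor than the neighbor tile is.
easy = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

Here `easy` is zero because the distant embedding is already far from the anchor, while `hard` is positive because the neighbor is farther from the anchor than the distant tile.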

About this paper

Cite this paper

Sinaga, M.A., Ali, F.M., Arymurthy, A.M. (2021). Tile2Vec with Predicting Noise for Land Cover Classification. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_8

DOI: https://doi.org/10.1007/978-3-030-92273-3_8
Published: 05 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92272-6
Online ISBN: 978-3-030-92273-3
eBook Packages: Computer Science, Computer Science (R0)
