Tile2Vec with Predicting Noise for Land Cover Classification

  • Conference paper
Neural Information Processing (ICONIP 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13111)

Abstract

Tile2Vec has proven to be an effective representation learning model in remote sensing. The success of the model depends on l2-norm regularization. However, l2-norm regularization has a major drawback that affects the quality of the regularization. We propose replacing the l2-norm regularization with the predicting-noise framework, and we develop an algorithm that integrates this framework into the model. We evaluate the model by using it as a feature extractor for land cover classification. The results show that the proposed model outperforms all baseline models.

Author information

Corresponding author

Correspondence to Marshal Arijona Sinaga.

Appendices

A Model’s Architecture

The model adopts the ResNet-18 architecture with slight differences. Except for the first row, each row of the table describes a stage of residual blocks with the given kernels. All blocks use a padding of 1.

Encoder                                              Note

Conv(kernels=64, size=3, stride=1), B-Norm, ReLU     1 Block

Conv(kernels=64, size=3, stride=1), B-Norm, ReLU     2 Blocks
Conv(kernels=64, size=3, stride=1), B-Norm, ReLU

Conv(kernels=128, size=3, stride=2), B-Norm, ReLU    2 Blocks
Conv(kernels=128, size=3, stride=1), B-Norm, ReLU

Conv(kernels=256, size=3, stride=2), B-Norm, ReLU    2 Blocks
Conv(kernels=256, size=3, stride=1), B-Norm, ReLU

Conv(kernels=512, size=3, stride=2), B-Norm, ReLU    2 Blocks
Conv(kernels=512, size=3, stride=1), B-Norm, ReLU

Conv(kernels=z, size=3, stride=2), B-Norm, ReLU      2 Blocks
Conv(kernels=z, size=3, stride=1), B-Norm, ReLU
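
To make the table easier to check, the following minimal sketch traces the feature-map shape through the encoder stages. It is an illustration under stated assumptions, not the authors' implementation: the names `Stage`, `conv_out`, `encoder_stages`, and `trace_shapes` are hypothetical, the 64×64 input size is chosen only for the example, and "2 Blocks" is read as two residual blocks per stage with the stride applied in the first convolution of the stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    out_channels: int  # number of kernels in the stage
    stride: int        # stride of the first convolution in the stage
    blocks: int        # value from the "Note" column


def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_stages(z_dim):
    """Stages as listed in the table; z_dim is the representation dimension z."""
    return [
        Stage(64, 1, 1),     # first row: plain Conv, B-Norm, ReLU
        Stage(64, 1, 2),
        Stage(128, 2, 2),
        Stage(256, 2, 2),
        Stage(512, 2, 2),
        Stage(z_dim, 2, 2),
    ]


def trace_shapes(input_size, z_dim):
    """Return (channels, height, width) after each stage.

    Only the first convolution of a stage can change the spatial size;
    the remaining stride-1, padding-1, size-3 convolutions preserve it.
    """
    shapes = []
    size = input_size
    for stage in encoder_stages(z_dim):
        size = conv_out(size, stride=stage.stride)
        shapes.append((stage.out_channels, size, size))
    return shapes


if __name__ == "__main__":
    # Assumed 64x64 input tile and z = 512 (the representation dimension
    # listed in Appendix B).
    for shape in trace_shapes(64, 512):
        print(shape)
```

Under these assumptions, each of the four stride-2 stages halves the spatial resolution, so a 64×64 input ends as a z-channel 4×4 feature map before any pooling or flattening step that the table does not describe.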

B Hyperparameters

Parameter                   Value

Learning rate               0.02
\(\alpha\)                  1.0
Representation dimension    512
m                           0.1
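
For reuse in experiment scripts, the values above could be collected in a single immutable configuration object. The sketch below is only one possible arrangement: the class name `Hyperparameters` and its field names are hypothetical, and the fields `alpha` and `m` simply mirror the symbols in the table without assuming how they enter the loss.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Hyperparameters:
    learning_rate: float = 0.02     # "Learning rate" row
    alpha: float = 1.0              # "alpha" row (symbol kept as in the table)
    representation_dim: int = 512   # "Representation dimension" row
    m: float = 0.1                  # "m" row (symbol kept as in the table)


config = Hyperparameters()

if __name__ == "__main__":
    print(asdict(config))
```

Freezing the dataclass prevents accidental changes to the values during a run; an altered setting then has to be created explicitly, for example with `dataclasses.replace(config, learning_rate=0.01)`.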

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Sinaga, M.A., Ali, F.M., Arymurthy, A.M. (2021). Tile2Vec with Predicting Noise for Land Cover Classification. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_8

  • DOI: https://doi.org/10.1007/978-3-030-92273-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92272-6

  • Online ISBN: 978-3-030-92273-3

  • eBook Packages: Computer Science (R0)
