CDZoom: a human-like sequential zoom agent for efficient change detection in large scenes

Lin, Yijun; Wu, Fengge; Zhao, Junsuo

doi:10.1007/s00521-022-08096-2

Original Article
Published: 08 December 2022

Volume 35, pages 8227–8241 (2023)
Neural Computing and Applications

Abstract

High-resolution (HR) remote sensing images provide rich information about human activities. However, processing entire HR images is time-consuming, and much of the computation is wasted for change detection because objects often cluster in local regions. To relieve the load on downstream detectors, previous studies introduce a regional attention process that coarsely samples candidate patches, but most solutions are tailored to particular tasks and datasets. Motivated by these limitations, we develop a novel reinforcement learning sampling framework and train a human-like agent, named CDZoom, to locate regions of interest by simulating human zooming behavior. Specifically, the proposed network consists of an encoder block, multiple context blocks, and a decision block. It speeds up sequential sampling by gradually narrowing the scope of the observed scene while increasing the resolution. To avoid the sparse reward problem when learning complex sampling tasks, we introduce a novel training paradigm based on curriculum learning and policy distillation. CDZoom can sample multi-size patches from multi-scale scenes and thus generalizes well to different requirements. Experiments on public change detection datasets demonstrate the effectiveness of our method: CDZoom reduces the computational cost by over 50% while maintaining detection accuracy similar to that of models that use full HR images.
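The coarse-to-fine zooming idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the learned encoder, context, and decision blocks are replaced here by a hand-written pixel-difference score on a sparse grid, and the names `coarse_score`, `zoom_sample`, and `covered_fraction` are hypothetical. The sketch also assumes square images whose side is a power of two. It only shows how narrowing the observed window step by step can skip most of a scene before any fine-grained processing happens.

```python
from typing import List, Tuple

Image = List[List[float]]
Window = Tuple[int, int, int]  # (top, left, size) of a square window


def coarse_score(before: Image, after: Image, win: Window, samples: int = 4) -> float:
    """Mean absolute difference on a sparse grid inside the window.

    Assumption: this heuristic stands in for a learned decision block.
    Larger windows are read at a coarser stride, mimicking a low-resolution glance.
    """
    top, left, size = win
    step = max(1, size // samples)
    total, count = 0.0, 0
    for r in range(top, top + size, step):
        for c in range(left, left + size, step):
            total += abs(after[r][c] - before[r][c])
            count += 1
    return total / count


def zoom_sample(before: Image, after: Image, patch_size: int,
                threshold: float = 0.0) -> List[Window]:
    """Zoom from the whole scene down to patches that look changed."""
    frontier: List[Window] = [(0, 0, len(before))]
    patches: List[Window] = []
    while frontier:
        top, left, size = frontier.pop()
        if coarse_score(before, after, (top, left, size)) <= threshold:
            continue  # looks unchanged at this scale: stop zooming here
        if size <= patch_size:
            patches.append((top, left, size))  # hand this patch to the detector
            continue
        half = size // 2
        for dr in (0, half):
            for dc in (0, half):
                frontier.append((top + dr, left + dc, half))
    return sorted(patches)


def covered_fraction(patches: List[Window], side: int) -> float:
    """Share of the scene's pixels that the sampled patches cover."""
    return sum(size * size for _, _, size in patches) / (side * side)


# Toy scene: a 16x16 image pair with one changed 4x4 block.
before = [[0.0] * 16 for _ in range(16)]
after = [row[:] for row in before]
for r in range(8, 12):
    for c in range(8, 12):
        after[r][c] = 1.0

patches = zoom_sample(before, after, patch_size=4)
print(patches)                        # [(8, 8, 4)]
print(covered_fraction(patches, 16))  # 0.0625
```

In this toy scene only one of the sixteen 4x4 patches reaches the downstream detector. A sparse-grid heuristic like this can miss changes that fall between sampled pixels, which is one reason a trained policy is attractive; the savings reported in the abstract come from the learned agent, not from this heuristic.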

Data availability

All data and materials in this article support the published claims and comply with field standards.

Code Availability

All software application and custom code in this article support the published claims and comply with field standards.

Funding

This work was supported by the National Natural Science Foundation of China (91938301).

Author information

Authors and Affiliations

Institute of Software Chinese Academy of Sciences, Beijing, 100190, China
Yijun Lin, Fengge Wu & Junsuo Zhao
University of Chinese Academy of Sciences, Beijing, 100049, China
Yijun Lin, Fengge Wu & Junsuo Zhao

Authors

Yijun Lin
Fengge Wu
Junsuo Zhao

Contributions

All authors contributed to the study conception and design.

Corresponding author

Correspondence to Yijun Lin.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethical approval

The authors confirm that all experimental protocols were approved by the Institute of Software Chinese Academy of Sciences. The methods were carried out in accordance with the relevant guidelines and regulations, and informed consent has been obtained from all authors.

Consent to participate

The consent to participate has been obtained from all authors.

Consent for publication

The consent for publication has been obtained from all authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, Y., Wu, F. & Zhao, J. CDZoom: a human-like sequential zoom agent for efficient change detection in large scenes. Neural Comput & Applic 35, 8227–8241 (2023). https://doi.org/10.1007/s00521-022-08096-2

Received: 17 June 2022
Accepted: 22 November 2022
Published: 08 December 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s00521-022-08096-2
