CDZoom: a human-like sequential zoom agent for efficient change detection in large scenes

  • Original Article
Neural Computing and Applications

Abstract

High-resolution (HR) remote sensing images provide rich information about human activities. However, processing entire HR images is time-consuming, and much of the computation is wasted in change detection tasks because objects often cluster in local regions. To relieve the load on downstream detectors, previous studies introduce a regional attention process that roughly samples candidate patches, but most solutions are tailored to particular tasks and datasets. Motivated by these limitations, we develop a novel reinforcement learning sampling framework and train a human-like agent, named CDZoom, to locate regions of interest by simulating human zooming behaviors. Specifically, the proposed network consists of an encoder block, multiple context blocks and a decision block. It speeds up sequential sampling by gradually narrowing the scope of the observed scene while increasing the resolution. To avoid the sparse reward problem when learning complex sampling tasks, we introduce a novel training paradigm based on curriculum learning and policy distillation. The proposed CDZoom can sample multi-size patches from multi-scale scenes, and thus generalizes well to different requirements. Experiments on public change detection datasets demonstrate the effectiveness of our method. CDZoom reduces the computational cost by over 50% while maintaining detection accuracy similar to that of models that use full HR images.
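
To make the coarse-to-fine idea concrete, the following is a minimal sketch of sequential zoom sampling, not the authors' implementation. It assumes a square grid of per-cell change scores whose side is a power-of-two multiple of the patch size, and it replaces the learned encoder, context and decision blocks with a hypothetical hand-written rule (`peak_score` compared against a threshold). All function and parameter names are illustrative.

```python
from typing import Callable, List, Tuple

Patch = Tuple[int, int, int]  # (row, col, size) of a square region
Grid = List[List[float]]


def peak_score(grid: Grid, row: int, col: int, size: int) -> float:
    """Highest score inside a square region.

    Hypothetical stand-in for a learned decision: a real agent would
    look at a low-resolution view of the region, not at every cell.
    """
    return max(
        grid[r][c]
        for r in range(row, row + size)
        for c in range(col, col + size)
    )


def zoom_sample(
    grid: Grid,
    min_size: int,
    threshold: float,
    score: Callable[[Grid, int, int, int], float] = peak_score,
) -> List[Patch]:
    """Start from the whole scene and zoom into promising quadrants.

    A region whose score does not exceed `threshold` is discarded.
    A promising region is split into four quadrants until it reaches
    `min_size`, at which point it is emitted as a candidate patch.
    """
    selected: List[Patch] = []
    stack: List[Patch] = [(0, 0, len(grid))]
    while stack:
        row, col, size = stack.pop()
        if score(grid, row, col, size) <= threshold:
            continue  # nothing of interest here, stop zooming
        if size <= min_size:
            selected.append((row, col, size))
            continue
        half = size // 2
        for d_row in (0, half):
            for d_col in (0, half):
                stack.append((row + d_row, col + d_col, half))
    return sorted(selected)


def covered_fraction(patches: List[Patch], side: int) -> float:
    """Share of the scene area that a downstream detector must process."""
    return sum(size * size for _, _, size in patches) / (side * side)
```

On an 8×8 grid with two isolated high-score cells, `zoom_sample(grid, min_size=2, threshold=0.5)` keeps only the two 2×2 patches that contain them, so `covered_fraction` reports that one eighth of the scene is passed on. This toy figure only illustrates where savings could come from when changes are spatially sparse; it says nothing about the results reported in the article.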

Data availability

All data and materials in this article support the published claims and comply with field standards.

Code Availability

All software applications and custom code in this article support the published claims and comply with field standards.

Funding

This work was supported by the National Natural Science Foundation of China (91938301).

Author information

Contributions

All authors contributed to the study conception and design.

Corresponding author

Correspondence to Yijun Lin.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethical approval

The authors confirm that all experimental protocols were approved by the Institute of Software, Chinese Academy of Sciences. The methods were carried out in accordance with the relevant guidelines and regulations, and informed consent has been obtained from all authors.

Consent to participate

The consent to participate has been obtained from all authors.

Consent for publication

The consent for publication has been obtained from all authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, Y., Wu, F. & Zhao, J. CDZoom: a human-like sequential zoom agent for efficient change detection in large scenes. Neural Comput & Applic 35, 8227–8241 (2023). https://doi.org/10.1007/s00521-022-08096-2

  • DOI: https://doi.org/10.1007/s00521-022-08096-2
