Multi-Scale Cross Distillation for Object Detection in Aerial Images

Wang, Kun; Wang, Zi; Li, Zhang; Teng, Xichao; Li, Yang

doi:10.1007/978-3-031-72967-6_25

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15107)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Object detection in aerial images is a longstanding yet challenging task. Despite significant advances in recent years, most works still show unsatisfactory performance due to the scale variation of objects. A standard strategy to address this problem is multi-scale training, which aims to learn scale-invariant feature representations. Although it yields notable improvements, such a multi-scale strategy is impractical for real applications because inference time increases considerably. Moreover, the original images are resized to different scales and then trained on separately, so no information is exchanged across scales. This paper presents a novel method called multi-scale cross distillation (MSCD) to address these issues. MSCD combines the merits of multi-scale training and knowledge distillation, enabling single-scale inference to achieve performance comparable or superior to that of multi-scale inference. Specifically, we first construct a parallel multi-branch architecture in which all branches share the same parameters yet take images at different scales as input. Furthermore, we design an adaptive cross-scale distillation module that adaptively integrates the knowledge of the different branches into one. Thus, detectors trained with MSCD require only single-scale inference. Extensive experiments demonstrate the effectiveness of MSCD. Without bells and whistles, MSCD enables prevalent two-stage detectors to outperform their single-scale counterparts by $\sim$5 and $\sim$7 mAP on the DOTA and DIOR-R datasets, respectively.
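
The abstract does not spell out how the branches are fused, so the following is only a toy sketch of the general idea, not the authors' module. It assumes each scale branch emits a class-probability vector for the same object, weights branches by confidence (negative entropy, an assumed choice), and distills every branch toward the fused target with a KL term. All function names here are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)


def fuse_branches(branch_probs):
    """Fuse per-scale predictions with confidence-based weights.

    Assumption: a branch is weighted by softmax(-entropy), so more
    confident branches contribute more to the fused target.
    """
    weights = softmax([-entropy(p) for p in branch_probs])
    n_classes = len(branch_probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, branch_probs))
        for c in range(n_classes)
    ]
    return weights, fused


def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) for two probability vectors."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, pred)
        if t > 0
    )


def cross_scale_distillation_loss(branch_probs):
    """Mean KL from the fused target to each branch.

    In a real training loop the fused target would be treated as a
    constant (no gradient), which this pure-Python toy cannot show.
    """
    _, fused = fuse_branches(branch_probs)
    return sum(kl_divergence(fused, p) for p in branch_probs) / len(branch_probs)


# Three hypothetical branches (scales 0.5, 1.0, 1.5) scoring one object.
branches = [[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]
weights, fused = fuse_branches(branches)
loss = cross_scale_distillation_loss(branches)
```

Because the branches share parameters, pulling each branch toward the fused target is one way a single set of weights could absorb knowledge from every scale, which is what would make single-scale inference sufficient.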

Notes

1. It is worth noting that the RPN only predicts foreground and background rather than making predictions for all categories. Thus, for each prediction of the RPN, $y_i^{m, R}=[c_i^{m, R}, b_i^{m, R}]\in \mathbb{R}^{2+5}$.
2. In this paper, gray font indicates variables that do not participate in gradient back-propagation.
3. When conducting multi-scale training and testing, the original images are first resized to three scales, i.e., (0.5, 1.0, 1.5), and then cropped into 1,024 $\times$ 1,024 patches with a stride of 524.
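
The resize-and-crop procedure in the last note can be sketched as follows. The scales, patch size, and stride come from the note; the edge handling (shifting the final window so it ends at the image border) and the function names are assumptions made for illustration, not details confirmed by the paper.

```python
def window_starts(length, patch=1024, stride=524):
    """Start offsets of sliding windows along one image axis."""
    if length <= patch:
        return [0]  # image is no larger than one patch
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        # Assumed edge rule: add a final window flush with the border.
        starts.append(length - patch)
    return starts


def patch_grid(width, height, scales=(0.5, 1.0, 1.5), patch=1024, stride=524):
    """Top-left corners of all patches for each resize scale."""
    grid = {}
    for s in scales:
        w, h = round(width * s), round(height * s)
        grid[s] = [
            (x, y)
            for y in window_starts(h, patch, stride)
            for x in window_starts(w, patch, stride)
        ]
    return grid


# Example: a hypothetical 2048 x 2048 source image.
grid = patch_grid(2048, 2048)
counts = {s: len(corners) for s, corners in grid.items()}
```

With a patch size of 1,024 and a stride of 524, neighboring patches overlap by 500 pixels, so objects cut by one patch boundary are likely to appear whole in an adjacent patch.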

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 12302252.

Author information

Authors and Affiliations

College of Aerospace Science and Engineering, National University of Defense Technology, Changsha, 410000, China
Kun Wang, Zi Wang, Zhang Li, Xichao Teng & Yang Li
Hunan Provincial Key Laboratory of Image Measurement and Vision Navigation, Changsha, 410000, China
Kun Wang, Zi Wang, Zhang Li, Xichao Teng & Yang Li

Corresponding author

Correspondence to Zhang Li.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

About this paper

Cite this paper

Wang, K., Wang, Z., Li, Z., Teng, X., Li, Y. (2025). Multi-Scale Cross Distillation for Object Detection in Aerial Images. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15107. Springer, Cham. https://doi.org/10.1007/978-3-031-72967-6_25

DOI: https://doi.org/10.1007/978-3-031-72967-6_25
Published: 03 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72966-9
Online ISBN: 978-3-031-72967-6
eBook Packages: Computer Science, Computer Science (R0)
