research-article

IGG: Improved Graph Generation for Domain Adaptive Object Detection

Authors:

Guang Zhou

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 1314 - 1324

https://doi.org/10.1145/3581783.3613116

Published: 27 October 2023

Abstract

Domain Adaptive Object Detection (DAOD) transfers an object detector from a labeled source domain to a novel unlabeled target domain. Recent works bridge the domain gap by aligning cross-domain pixel pairs in a non-Euclidean graph space and minimizing the domain discrepancy to adapt the semantic distribution. Despite their success, these methods model graphs roughly, with coarse semantic sampling, because they ignore non-informative noise and fail to concentrate on precise semantic alignment. Moreover, coarse graph generation inevitably introduces abnormal nodes. These problems result in biased domain adaptation. We therefore propose an Improved Graph Generation (IGG) framework that performs high-quality graph generation for DAOD. Specifically, we design an Intensive Node Refinement (INR) module that reconstructs the noisy sampled nodes with a memory bank and contrastively regularizes the noisy features. For better semantic alignment, we decouple the domain-specific style and the category-invariant content encoded in the graph covariance and selectively eliminate only the domain-specific style. We then propose a Precision Graph Optimization (PGO) adaptor that uses variational inference to down-weight abnormal nodes. Comprehensive experiments on three adaptation benchmarks demonstrate that IGG achieves state-of-the-art results in unsupervised domain adaptation.
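
The abstract does not give the formulation used to separate style from content in the graph covariance. As a rough illustration only, the sketch below shows one way such a selective penalty could look: it computes the covariance of node features in each domain, treats entries that differ strongly across domains as domain-specific "style", and penalizes only those entries. The function names, the thresholding rule, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
# Hypothetical sketch: the names, the threshold, and the masking rule are
# assumptions for illustration, not the IGG paper's implementation.

def covariance(nodes):
    """Return the d x d covariance matrix of a list of d-dimensional node features."""
    n, d = len(nodes), len(nodes[0])
    mean = [sum(row[j] for row in nodes) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in nodes]
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
            for i in range(d)]


def style_mask(cov_src, cov_tgt, threshold):
    """Mark covariance entries that differ strongly across domains as 'style'."""
    d = len(cov_src)
    return [[abs(cov_src[i][j] - cov_tgt[i][j]) > threshold for j in range(d)]
            for i in range(d)]


def selective_style_loss(cov, mask):
    """Penalize only the masked (assumed domain-specific) covariance entries."""
    d = len(cov)
    return sum(abs(cov[i][j]) for i in range(d) for j in range(d) if mask[i][j])


# Toy node features: the two domains share per-dimension variance but
# differ in how the two dimensions co-vary.
source_nodes = [[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]]
target_nodes = [[1.0, 0.0], [3.0, 2.0], [1.0, 0.0], [3.0, 2.0]]

cov_s = covariance(source_nodes)
cov_t = covariance(target_nodes)
mask = style_mask(cov_s, cov_t, threshold=0.5)
loss = selective_style_loss(cov_t, mask)
```

In this toy case the diagonal entries agree across domains and are left untouched, while the off-diagonal entries differ and are the only ones penalized. A real system would operate on learned graph node embeddings and would need a principled way to choose which entries count as style.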


Cited By

Wang X, Ren W, Chen X, Fan H, Tang Y, Han Z, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). Uni-YOLO: Vision-Language Model-Guided YOLO for Robust and Fast Universal Detection in the Open World. Proceedings of the 32nd ACM International Conference on Multimedia, 1991-2000. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681212
Cui Y, Li L, Zhang J, Yan C, Wang H, Wang S, Jin H, Wu L, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 1331-1340. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3680899
Zhang J, Wang Y, Yang X, Wang S, Feng Y, Shi Y, Ren R, Zhu E, Liu X, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). Test-Time Training on Graphs with Large Language Models (LLMs). Proceedings of the 32nd ACM International Conference on Multimedia, 2089-2098. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3680865

Index Terms

IGG: Improved Graph Generation for Domain Adaptive Object Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection

Information & Contributors

Information

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE), Tao Mei (HiDream.ai, China), Rita Cucchiara (University of Modena and Reggio Emilia, Italy)
Program Chairs:
Marco Bertini (University of Florence, Italy), Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia), Pradeep K. Atrey (University at Albany, State University of New York, USA), M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

the Open Research Fund from Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) under Grant
the National Natural Science Foundation of China (NSFC) under Grants
Shenzhen Science and Technology Program under Grant
the Guangdong Pearl River Talent Recruitment Program under Grant

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Article Metrics

Total Citations: 7
Total Downloads: 250
Downloads (Last 12 months): 114
Downloads (Last 6 weeks): 4

Reflects downloads up to 05 Mar 2025

Citations

Ke J, He L, Han B, Li J, Wang D, Gao X. (2024). VLDadaptor: Domain Adaptive Object Detection With Vision-Language Model Distillation. IEEE Transactions on Multimedia 26, 11316-11331. Online publication date: 2024.
https://doi.org/10.1109/TMM.2024.3453061
Zhang S, Zhang L, Li G, Li P, Liu Z. (2024). Multi-Prototype Guided Source-Free Domain Adaptive Object Detection for Autonomous Driving. IEEE Transactions on Intelligent Vehicles 9:1, 1589-1601. Online publication date: Jan-2024.
https://doi.org/10.1109/TIV.2023.3337795
Imteaj A, Hossain M, Zaman S, Shahid A. (2024). TriplePlay: Enhancing Federated Learning with CLIP for Non-IID Data and Resource Efficiency. 2024 International Conference on Machine Learning and Applications (ICMLA), 1474-1480. Online publication date: 18-Dec-2024.
https://doi.org/10.1109/ICMLA61862.2024.00228
Qiao Z, Shi D, Jin S, Shi Y, Wang Z, Qiu C. (2024). JFDI: Joint Feature Differentiation and Interaction for domain adaptive object detection. Neural Networks 180, 106682. Online publication date: Dec-2024.
https://doi.org/10.1016/j.neunet.2024.106682
