research-article

Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection

Authors:

Li Wu

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 1331 - 1340

https://doi.org/10.1145/3664647.3680899

Published: 28 October 2024

Abstract

Domain Adaptive Object Detection (DAOD) aims to adapt a detector trained on a labeled source domain to an unlabeled target domain. Recent advances leverage a self-training framework in which a student model learns target-domain knowledge from pseudo labels generated by a teacher model. Despite their success, such category-level consistency supervision suffers from low-quality pseudo labels and therefore cannot fully exploit the contextual knowledge of the target domain. To mitigate this problem, we propose a stochastic context consistency reasoning network built on the self-training framework. First, we introduce a stochastic complementary masking (SCM) module that generates complementary masked images, preventing the network from over-relying on specific visual clues. Second, we design an inter-changeable context consistency reasoning (Inter-CCR) module, which constructs an inter-context consistency paradigm to capture texture and contour details in the target domain by aligning the student model's predictions for the complementary masked images. Meanwhile, we develop an intra-changeable context consistency reasoning (Intra-CCR) module, which constructs an intra-context consistency paradigm to strengthen the use of context relations by using pseudo labels to supervise the student model's predictions. Experimental results on three DAOD benchmarks demonstrate that our method outperforms current state-of-the-art methods by a large margin. Code is released at https://github.com/HDUyiming/SOCCER.
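
To make the complementary masking idea concrete, the following is a minimal NumPy sketch and not the authors' implementation (see the released repository for that). It assumes patch-wise random masking; the patch size, the mask ratio, and the function names `complementary_masks`, `apply_mask`, and `symmetric_consistency` are all hypothetical. The abstract says only that the student's predictions for the two masked images are aligned, so the symmetric KL divergence used here is a stand-in for whatever alignment loss the paper defines.

```python
import numpy as np


def complementary_masks(height, width, patch=32, ratio=0.5, rng=None):
    """Return two boolean pixel masks of shape (H, W) that are exact complements.

    `patch` and `ratio` are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    rows = -(-height // patch)  # ceiling division: patches needed vertically
    cols = -(-width // patch)   # ceiling division: patches needed horizontally
    # Keep each patch independently with probability `ratio`.
    keep = rng.random((rows, cols)) < ratio
    # Expand the patch grid to pixel resolution, then crop to the image size.
    mask = np.repeat(np.repeat(keep, patch, axis=0), patch, axis=1)
    mask = mask[:height, :width]
    return mask, ~mask


def apply_mask(image, mask):
    """Zero out the pixels of an (H, W, C) image where `mask` is False."""
    return image * mask[..., None]


def symmetric_consistency(p, q, eps=1e-8):
    """Mean symmetric KL divergence between two (N, K) class-probability arrays.

    Assumed stand-in for a consistency term between the student's predictions
    on the two complementary views.
    """
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    kl_pq = np.sum(p * np.log(p / q), axis=-1)
    kl_qp = np.sum(q * np.log(q / p), axis=-1)
    return float(np.mean(0.5 * (kl_pq + kl_qp)))
```

In this sketch every pixel is visible in exactly one of the two views, so the two masked images together cover the full image while neither view alone exposes all visual clues. A training loop would feed both views to the student and penalize disagreement between the resulting predictions.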

Cited By

Chen L, Han J, Wang Y. (2025). DATR: Unsupervised Domain Adaptive Detection Transformer With Dataset-Level Adaptation and Prototypical Alignment. IEEE Transactions on Image Processing, Vol. 34, 982-994. Online publication date: 2025. https://doi.org/10.1109/TIP.2025.3527370

Zhang Z, Li L, Cong G, Yin H, Gao Y, Yan C, Hengel A, Qi Y. (2024). From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning. In Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (eds.), Proceedings of the 32nd ACM International Conference on Multimedia, 7523-7532. Online publication date: 28-Oct-2024. https://dl.acm.org/doi/10.1145/3664647.3680777

Index Terms

Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection

Information & Contributors

Information

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
