Multimedia Content Understanding in Harsh Environments

Published: 10 October 2022

Abstract

Multimedia content understanding methods often suffer severe performance degradation in harsh environments. This tutorial covers several important components of multimedia content understanding in such environments. It introduces multimedia enhancement methods, presents recent advances in 2D and 3D visual scene understanding, describes strategies for estimating prediction uncertainty, provides a brief summary, and shows some typical applications.

Cited By

  • (2024) MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities. Proceedings of the 2024 International Conference on Multimedia Retrieval, 1336-1338. DOI: 10.1145/3652583.3658892. Online publication date: 30 May 2024.

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. domain adaptation
2. multimedia content enhancement
3. multimedia content understanding

Qualifiers

• Abstract

Acceptance Rates

Overall acceptance rate: 2,145 of 8,556 submissions (25%)
