Multimedia Content Understanding in Harsh Environments

Authors:

Kui Jiang

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 7372 - 7373

https://doi.org/10.1145/3503161.3546969

Published: 10 October 2022

Abstract

Multimedia content understanding methods often suffer severe performance degradation in harsh environments. This tutorial covers several key components of multimedia content understanding under such conditions: it introduces multimedia enhancement methods, presents recent advances in 2D and 3D visual scene understanding, shows strategies for estimating prediction uncertainty, provides a brief summary, and shows typical applications.

Index Terms

Multimedia Content Understanding in Harsh Environments
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. Information retrieval

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

October 2022

7537 pages

ISBN: 9781450392037

DOI: 10.1145/3503161

General Chairs:
João Magalhães, NOVA University of Lisbon, Portugal
Alberto del Bimbo, University of Florence, Italy
Shin'ichi Satoh, National Institute of Informatics, Japan
Nicu Sebe, University of Trento, Italy

Program Chairs:
Xavier Alameda-Pineda, Inria, Grenoble, France
Qin Jin, Renmin University of China, China
Vincent Oria, New Jersey Institute of Technology, USA
Laura Toni, University College London, UK

Copyright © 2022 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Funding Sources

National Natural Science Foundation of China
National Key R&D Project

Conference

MM '22

Sponsor:

SIGMM

MM '22: The 30th ACM International Conference on Multimedia

October 10 - 14, 2022

Lisboa, Portugal

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
