Deep Learning for Multimedia: Science or Technology?

Panel
DOI: 10.1145/3240508.3243931
Published: 15 October 2018

Abstract

Deep learning has been successfully applied to a range of multimedia topics in recent years, from object detection, semantic classification, and entity annotation to multimedia captioning, multimedia question answering, and storytelling. Open-source libraries and platforms such as TensorFlow, Caffe, and MXNet have significantly promoted the wide deployment of deep learning in real-world applications. On the one hand, deep learning practitioners can set up and use a complex deep network without necessarily understanding the underlying mathematics. One recent deep learning tool based on Keras even provides a graphical interface that enables straightforward 'drag and drop' deep learning programming. On the other hand, general theoretical problems of learning, such as interpretation and generalization, have seen only limited progress. Most deep learning papers published these days follow the same pipeline: design or modify a network structure, tune parameters, and report a performance improvement in a specific application. We have even seen many deep learning application papers without a single equation. Theoretical interpretation and the science behind the study are largely ignored. While we are excited about the successful application of deep learning to classical and novel problems, we multimedia researchers have a responsibility to think about and address the fundamental topics of deep learning science. Prof. Guanrong Chen recently wrote an editorial note titled 'Science and Technology, not SciTech' [1]. This panel joins that discussion and invites prestigious multimedia researchers and active deep learning practitioners to discuss the positioning of deep learning research now and in the future.
Specifically, each panelist is asked to present their opinions on the following five questions:

  1. What do you think of the current phenomenon that deep learning applications are growing explosively while progress on general theoretical problems remains slow?
  2. Do you agree that deploying deep learning techniques is getting easy (a low barrier), while deep learning research is difficult (a high barrier)?
  3. What do you think are the core problems for deep learning techniques?
  4. What do you think are the core problems for deep learning science?
  5. What is your suggestion for multimedia research in the post-deep-learning era?

Reference

[1] Chen, Guanrong. "Science and technology, not SciTech." National Science Review (2017).

Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. application
  2. deep learning
  3. theory

Qualifiers

  • Panel

Conference

MM '18
MM '18: ACM Multimedia Conference
October 22 - 26, 2018
Seoul, Republic of Korea

Acceptance Rates

MM '18 Paper Acceptance Rate: 209 of 757 submissions, 28%
Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 12 Feb 2025

Cited By

  • (2022) CNN based recognition of handwritten multilingual city names. Multimedia Tools and Applications 81:8 (11501-11517). DOI: 10.1007/s11042-022-12193-8. Online publication date: 18-Feb-2022
  • (2021) The Transition From White Box to Black Box: Challenges and Opportunities in Signal Processing Education. IEEE Signal Processing Magazine 38:3 (163-173). DOI: 10.1109/MSP.2021.3050996. Online publication date: May-2021
  • (2019) A Modern C++ Parallel Task Programming Library. Proceedings of the 27th ACM International Conference on Multimedia (2284-2287). DOI: 10.1145/3343031.3350537. Online publication date: 15-Oct-2019
  • (2019) Deep learning for spoken language identification: Can we visualize speech signal patterns? Neural Computing and Applications. DOI: 10.1007/s00521-019-04468-3. Online publication date: 5-Sep-2019
