DOI:
10.1145/3422852
HuMA'20: Proceedings of the 1st International Workshop on Human-centric Multimedia Analysis
ACM 2020 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
MM '20: The 28th ACM International Conference on Multimedia Seattle WA USA 12 October 2020
ISBN:
978-1-4503-8151-2
Published:
12 October 2020

Abstract

It is our great pleasure to welcome you to the 1st International Workshop on Human-Centric Multimedia Analysis (HuMA'20). The workshop is co-located with ACM Multimedia 2020 in Seattle, United States. It addresses a timely topic, human-centric multimedia analysis, which is one of the fundamental problems of multimedia understanding. The problem is challenging and involves multiple tasks, such as face detection and recognition, human body pattern analysis, person re-identification, human action detection, person tracking, and human-object interaction. Today, multimedia sensing technologies and large-scale computing infrastructures are rapidly producing a wide variety of big multi-modality data for human-centric analysis, which provides rich knowledge to help tackle these challenges. Researchers have strived to push the limits of human-centric multimedia analysis in a wide variety of applications, such as intelligent surveillance, retailing, fashion design, and services. Therefore, the purpose of this workshop is to: 1) bring together state-of-the-art research on human-centric multimedia analysis; 2) call for a coordinated effort to understand the opportunities and challenges emerging in human-centric multimedia analysis; 3) identify key tasks and evaluate the state-of-the-art methods; 4) showcase innovative methodologies and ideas; 5) introduce interesting real-world human-centric multimedia analysis systems or applications; and 6) propose new real-world datasets and discuss future directions. We solicit original contributions in all fields of human-centric multimedia analysis that explore multi-modality data to help us understand human behavior and promote multimodal human-machine interaction. We believe the workshop will offer a timely collection of research updates to benefit researchers and practitioners working in the broad multimedia communities.
The call for papers for the workshop attracted 18 high-quality submissions from around the world, of which 10 were accepted (55.6%).

SESSION: Keynote Talks I
keynote
Human-Centric Object Interactions - A Fine-Grained Perspective from Egocentric Videos

This talk aims to argue for a fine(r)-grained perspective on human-object interactions. Motivation: Observe a person chopping some parsley. Can you detect the moment at which the parsley was first chopped? Whether the parsley was chopped coarsely or ...

keynote
Sensing, Understanding and Synthesizing Humans in an Open World

Sensing, understanding and synthesizing humans in images and videos has been a long-pursued goal of computer vision and graphics, with extensive real-life applications. It is at the core of embodied intelligence. In this talk, I will discuss our work ...

SESSION: Session 1: Multimedia Event Detection
research-article
Intra and Inter-modality Interactions for Audio-visual Event Detection

The presence of auditory and visual sensory streams enables human beings to obtain a profound understanding of a scene. While audio and visual signals are able to provide relevant information separately, the combination of both modalities offers more ...

research-article
Open Access
Personalized User Modelling for Sleep Insight

Sleep is critical to leading a healthy lifestyle. Each day, most people go to sleep without any idea about how their night's rest is going to be. For an activity that humans spend around a third of their life doing, there is a surprising amount of ...

research-article
AI at the Disco: Low Sample Frequency Human Activity Recognition for Night Club Experiences

Human activity recognition (HAR) has grown in popularity as sensors have become more ubiquitous. Beyond standard health applications, there exists a need for embedded low cost, low power, accurate activity sensing for entertainment experiences. We ...

SESSION: Keynote Talk II
keynote
Unseen Activity Recognition in Space and Time

Progress in video understanding has been astonishing in the past decade. Classifying, localizing, tracking and even segmenting actor instances at the pixel level is now commonplace, thanks to label-supervised machine learning. Yet, it is becoming ...

SESSION: Session 2: Face, Gesture, and Body Pose
research-article
Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation

There has been a fair amount of research interest in exploring the disentanglement of appearance and shape from human images. Most existing endeavours pursue this goal by either using training images with annotations or regulating the training process ...

research-article
R-FENet: A Region-based Facial Expression Recognition Method Inspired by Semantic Information of Action Units

Facial expression recognition is a challenging problem in real-world scenarios owing to obstacles of illumination, occlusion, pose variations, and low-quality images. Recent works have paid attention to the concept of the region of interest (RoI) to ...

research-article
StarGAN-EgVA: Emotion Guided Continuous Affect Synthesis

Recent advancement of Generative Adversarial Network (GAN) based architectures has achieved impressive performance on static facial expression synthesis. Continuous affect synthesis, which has applications in generating videos and movies, is ...

SESSION: Session 3: Human Object Interaction
research-article
Human-Object Interaction Detection: A Quick Survey and Examination of Methods

Human-object interaction detection is a relatively new task in the world of computer vision and visual semantic information extraction. With the goal of machines identifying interactions that humans perform on objects, there are many real-world use ...

research-article
Online Video Object Detection via Local and Mid-Range Feature Propagation

This work proposes a new Local and Mid-range feature Propagation (LMP) method for video object detection that captures feature correlations effectively and reduces redundant computation. Specifically, the proposed LMP model contains two modules with two ...

research-article
iWink: Exploring Eyelid Gestures on Mobile Devices

Although gaze has been widely studied for mobile interactions, eyelid-based gestures are relatively understudied and limited to a few basic gestures (e.g., blink). In this work, we propose a gesture grammar to construct both basic and compound eyelid ...

research-article
Commonsense Learning: An Indispensable Path towards Human-centric Multimedia

Learning commonsense knowledge and conducting commonsense reasoning are basic human abilities for making presumptions about the type and essence of ordinary situations in daily life, and they serve as very important goals in human-centric Artificial Intelligence ...

Contributors
  • University of Massachusetts Amherst
  • IBM Thomas J. Watson Research Center
  • University of Electronic Science and Technology of China
  • Northwestern Polytechnical University
  • Renmin University of China
