ABSTRACT
Human action recognition from motion videos plays an important role in multimedia analysis. Besides the temporal cues carried by action sequences in motion videos, motion tendency can also be inferred from still images or key frames. Thus, if the action knowledge in related still images can be adapted well to the target motion videos, the performance of video action recognition stands to improve. In this paper, we propose a Still-to-Motion Adaptation (SMA) framework for human action recognition. Common visual features are extracted from both the related images and the key frames of the target videos, which bridges the gap between still images and videos. Meanwhile, to exploit the unlabeled training videos in the target domain, we incorporate a semi-supervised process into the framework. By minimizing the difference between the action predictions made from still features and those made from motion features, we formulate still-to-motion adaptation as a joint optimization problem. Experiments demonstrate the effectiveness of the proposed framework and show better action recognition performance than state-of-the-art methods. We also analyze how knowledge adapted from still images affects the recognition results on the target videos.
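The idea of coupling a still-feature predictor and a motion-feature predictor through an agreement term can be illustrated with a minimal sketch. This is not the formulation used in the paper, which the abstract does not spell out. It assumes linear predictors, squared losses, ridge regularization, and an alternating closed-form solver, and every name in it (`fit_still_motion_sketch`, `lam`, `reg`, and the array names) is hypothetical. Labeled videos are assumed to occupy the first rows of the video feature matrices; the remaining rows are unlabeled and enter only through the agreement term.

```python
import numpy as np


def objective(Ws, Wm, Xi, Yi, Vs, Vm, Yv, lam, reg):
    """Illustrative joint objective (an assumption, not the paper's)."""
    n_l = Yv.shape[0]  # number of labeled videos (first rows of Vs and Vm)
    fit_img = np.sum((Xi @ Ws - Yi) ** 2)          # labeled still images
    fit_still = np.sum((Vs[:n_l] @ Ws - Yv) ** 2)  # key frames of labeled videos
    fit_motion = np.sum((Vm[:n_l] @ Wm - Yv) ** 2) # motion features of labeled videos
    # Agreement between still-based and motion-based predictions on ALL
    # videos, labeled or not; this is where unlabeled videos contribute.
    agree = np.sum((Vs @ Ws - Vm @ Wm) ** 2)
    ridge = np.sum(Ws ** 2) + np.sum(Wm ** 2)
    return fit_img + fit_still + fit_motion + lam * agree + reg * ridge


def fit_still_motion_sketch(Xi, Yi, Vs, Vm, Yv, lam=1.0, reg=0.1, n_iter=20):
    """Alternate exact ridge-style solves for Ws (still) and Wm (motion).

    Xi: (n_img, d_s) still features of labeled images, Yi: (n_img, c) one-hot
    Vs: (n_vid, d_s) key-frame still features, Vm: (n_vid, d_m) motion features
    Yv: (n_l, c) one-hot labels of the first n_l videos
    """
    n_l = Yv.shape[0]
    d_s, d_m, c = Xi.shape[1], Vm.shape[1], Yi.shape[1]
    Ws = np.zeros((d_s, c))
    Wm = np.zeros((d_m, c))
    # Normal-equation matrices; they do not change across iterations.
    A_s = Xi.T @ Xi + Vs[:n_l].T @ Vs[:n_l] + lam * (Vs.T @ Vs) + reg * np.eye(d_s)
    A_m = Vm[:n_l].T @ Vm[:n_l] + lam * (Vm.T @ Vm) + reg * np.eye(d_m)
    history = []
    for _ in range(n_iter):
        # Minimize over Ws with Wm fixed.
        b_s = Xi.T @ Yi + Vs[:n_l].T @ Yv + lam * (Vs.T @ (Vm @ Wm))
        Ws = np.linalg.solve(A_s, b_s)
        # Minimize over Wm with Ws fixed.
        b_m = Vm[:n_l].T @ Yv + lam * (Vm.T @ (Vs @ Ws))
        Wm = np.linalg.solve(A_m, b_m)
        history.append(objective(Ws, Wm, Xi, Yi, Vs, Vm, Yv, lam, reg))
    return Ws, Wm, history
```

In this sketch, `lam` controls how strongly the two predictors are pushed to agree on the videos. With `lam = 0` the problem decouples into two independent ridge regressions and the unlabeled videos have no effect.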
Index Terms
- What Can We Learn about Motion Videos from Still Images?