Learning to Represent Spatio-Temporal Features for Fine Grained Action Recognition | IEEE Conference Publication | IEEE Xplore

Abstract:

Convolutional neural networks have pushed the boundaries of action recognition in videos, especially with the introduction of 3D convolutions. How efficiently a 3D CNN can model temporal information, however, remains an open question. We investigate this question and introduce a new optical flow representation to improve the motion stream. Starting from baseline inflated 3D CNNs, we separate the convolutional filters into spatial and temporal components, which reduces the number of parameters with minimal loss of accuracy. We evaluate our approach on the NTU RGB+D dataset, which is the largest human action dataset, and outperform the state of the art by a large margin.
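As a rough illustration of why separating a 3D filter into spatial and temporal parts reduces the parameter count, the sketch below counts weights for a dense 3D convolution and for a spatial convolution followed by a temporal one. The abstract does not give kernel sizes, channel widths, or the order of the two convolutions, so the 3x3x3 kernel, the 64 channels, the intermediate width `c_mid`, and both function names are assumptions made for this example only.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=False):
    """Number of weights in a dense 3D convolution with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def separated_params(c_in, c_out, kt, kh, kw, c_mid=None, bias=False):
    """Weights in a spatial (1 x kh x kw) conv followed by a temporal (kt x 1 x 1) conv."""
    # Assumption: the intermediate width equals c_out unless stated otherwise.
    c_mid = c_out if c_mid is None else c_mid
    spatial = conv3d_params(c_in, c_mid, 1, kh, kw, bias)
    temporal = conv3d_params(c_mid, c_out, kt, 1, 1, bias)
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3x3x3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)      # 64 * 64 * 27
split = separated_params(64, 64, 3, 3, 3)  # 64 * 64 * 9 + 64 * 64 * 3
print(full, split, round(split / full, 3))
```

Under these assumptions the separated layer uses 12/27 of the weights of the dense one. The saving in an actual network depends on its real layer shapes and on how the intermediate width is chosen.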
Date of Conference: 12-14 December 2018
Date Added to IEEE Xplore: 09 May 2019
Conference Location: Sophia Antipolis, France
