Abstract:
Human action recognition from still images has recently drawn increasing attention in vision-based human behavior analysis, and it poses great challenges due to large inter-class ambiguity and intra-class variability. The vector of locally aggregated descriptors (VLAD) has achieved state-of-the-art performance in many image classification tasks based on local features. The success of VLAD is largely due to its high descriptive ability and computational efficiency. In this paper, towards optimal VLAD representations for human action recognition from still images, we improve VLAD by tackling two important issues: empty cavity and assignment ambiguity. The empty cavity issue severely compromises the performance of VLAD and has long been overlooked. We investigate the empty cavity issue and provide an effective solution that largely improves the performance of VLAD. We also propose middle-level assignments to overcome assignment ambiguity; these are more reliable and provide more useful information for realistic activity. We have conducted extensive experiments on two widely used benchmarks to validate the proposed method for human action recognition from still images. Our method achieves performance competitive with state-of-the-art algorithms.
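For readers unfamiliar with VLAD, the following is a minimal sketch of the standard baseline encoding with hard assignment, not the improved method proposed in the paper. It assumes that "empty cavity" refers to codebook cells that receive no local descriptors and therefore contribute an all-zero block to the encoding; that reading is an assumption, and the abstract does not describe the authors' remedy. The function name `vlad_encode` and its return values are hypothetical, chosen for illustration only.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """Baseline VLAD with hard assignment (illustrative sketch only).

    descriptors: (n, d) array of local features from one image.
    centers:     (k, d) array of codebook centers (e.g. from k-means).
    Returns the L2-normalized (k * d,) encoding and the indices of
    centers that received no descriptors.
    """
    k, d = centers.shape

    # Squared Euclidean distance from every descriptor to every center.
    diffs = descriptors[:, None, :] - centers[None, :, :]
    dists = (diffs ** 2).sum(axis=2)

    # Hard assignment: each descriptor goes to its nearest center.
    nearest = dists.argmin(axis=1)

    # Accumulate residuals (descriptor minus center) per center.
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members) > 0:
            vlad[i] = (members - centers[i]).sum(axis=0)

    # Centers with no members keep an all-zero block. Assumption: this
    # is the situation the abstract calls an "empty cavity".
    counts = np.bincount(nearest, minlength=k)
    empty = np.flatnonzero(counts == 0)

    # Global L2 normalization, skipped if the encoding is all zeros.
    flat = vlad.ravel()
    norm = np.linalg.norm(flat)
    if norm > 0:
        flat = flat / norm
    return flat, empty
```

In this sketch, an image whose descriptors all fall near a few centers yields an encoding that is zero in every other block, which is one way such empty cells could weaken the representation. Hard assignment also ignores descriptors that lie nearly equidistant from several centers, which is the kind of assignment ambiguity the abstract mentions.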
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 20-25 March 2016
Date Added to IEEE Xplore: 19 May 2016
ISBN Information:
Electronic ISSN: 2379-190X