Effective Action Detection Using Temporal Context and Posterior Probability of Length

Liu, Xinran; Song, Yan; Tang, Jinhui

doi:10.1007/978-3-319-73600-6_10

Conference paper
First Online: 13 January 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10705)

Abstract

In this paper, we focus on human action detection in untrimmed long videos. We propose an effective action detection system that addresses two difficulties in existing work. First, we take temporal context information into account in model learning to tackle the problem of high-quality proposal generation. Second, we use the posterior probability of proposal length to adjust the selection criterion for action proposals. This effectively encourages proposals with reasonable lengths and suppresses proposals that have high classification scores but unreasonable lengths. We test our method on the THUMOS14 dataset, and the experimental results show that our action detection system improves performance by about 4% compared with state-of-the-art methods.
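
The abstract does not give the exact formula for the length-based adjustment, so the following is only a minimal sketch of one possible reading: estimate a smoothed per-class distribution over segment lengths from training annotations, then multiply each proposal's classification score by the probability of its length. The function names, the fixed-width binning, the add-alpha smoothing, and the multiplicative combination are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict


def to_bin(duration, bin_width, num_bins):
    """Map a duration in seconds to a histogram bin; the last bin catches overflow."""
    return min(int(duration // bin_width), num_bins - 1)


def fit_length_counts(train_segments, bin_width=2.0, num_bins=10):
    """Count ground-truth segment lengths per class.

    train_segments: iterable of (label, duration_in_seconds) pairs (hypothetical format).
    """
    counts = defaultdict(Counter)
    for label, duration in train_segments:
        counts[label][to_bin(duration, bin_width, num_bins)] += 1
    return counts


def length_probability(counts, label, duration, bin_width=2.0, num_bins=10, alpha=1.0):
    """Add-alpha smoothed estimate of P(length bin | class)."""
    class_counts = counts.get(label, Counter())
    total = sum(class_counts.values())
    hits = class_counts[to_bin(duration, bin_width, num_bins)]
    return (hits + alpha) / (total + alpha * num_bins)


def rescore(proposals, counts, **kwargs):
    """Weight each classification score by the length probability (assumed rule).

    proposals: list of (label, start, end, classification_score) tuples.
    Returns the proposals sorted by adjusted score, best first.
    """
    adjusted = []
    for label, start, end, score in proposals:
        p_len = length_probability(counts, label, end - start, **kwargs)
        adjusted.append((label, start, end, score * p_len))
    return sorted(adjusted, key=lambda item: item[3], reverse=True)


# Toy data: every training "dive" lasts about 3 seconds.
train = [("dive", 3.0), ("dive", 3.5), ("dive", 2.5), ("dive", 3.2)]
counts = fit_length_counts(train)
proposals = [("dive", 0.0, 15.0, 0.9), ("dive", 0.0, 3.0, 0.6)]
ranked = rescore(proposals, counts)
```

In this toy case the 15-second proposal has the higher classification score, but its length is rare for the class, so the 3-second proposal ranks first after adjustment. That matches the behaviour the abstract describes: reasonable lengths are encouraged and high-scoring proposals with unreasonable lengths are suppressed.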

Acknowledgments

This work was supported in part by the 973 Program under Grant 2014CB347600 and in part by the National Natural Science Foundation of China under Grant 61672285.

Author information

Authors and Affiliations

Nanjing University of Science and Technology, Nanjing, China
Xinran Liu, Yan Song & Jinhui Tang

Corresponding author

Correspondence to Yan Song.

Editor information

Editors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Klaus Schoeffmann
Chulalongkorn University, Bangkok, Thailand
Thanarat H. Chalidabhongse
City University of Hong Kong, Hong Kong, China
Chong Wah Ngo
Chulalongkorn University, Bangkok, Thailand
Supavadee Aramvith
Dublin City University, Dublin, Ireland
Noel E. O’Connor
Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj
Rutgers University, Piscataway, New Jersey, USA
Ahmed Elgammal

About this paper

Cite this paper

Liu, X., Song, Y., Tang, J. (2018). Effective Action Detection Using Temporal Context and Posterior Probability of Length. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10705. Springer, Cham. https://doi.org/10.1007/978-3-319-73600-6_10

DOI: https://doi.org/10.1007/978-3-319-73600-6_10
Published: 13 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73599-3
Online ISBN: 978-3-319-73600-6
eBook Packages: Computer Science, Computer Science (R0)
