Abstract
In recent years, action recognition from egocentric videos has emerged as an important research problem. The availability of affordable wearable camera devices has produced a huge amount of first-person (egocentric) video. Recognizing actions in this extensive unstructured data, in the presence of camera motion, is extremely difficult. Existing solutions to this problem are mostly supervised in nature and require a large number of training samples. In sharp contrast, we propose a weakly supervised solution based on random walks that requires only a few training samples (seeds). Overall, the proposed method consists of three major components: feature extraction using PHOG (Pyramidal Histogram of Oriented Gradients) and a center-surround model, construction of a Video Similarity Graph (VSG), and execution of a random walk on the VSG. Experimental results on five standard ADL egocentric video datasets clearly indicate the advantage of our solution.
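The weakly supervised step described above, propagating the labels of a few seed videos to all other nodes of a similarity graph via a random walk, can be sketched as follows. This is a minimal illustration in the style of Grady's random-walker formulation (solving the combinatorial Dirichlet problem on the graph Laplacian), not the authors' exact algorithm; the affinity matrix `W`, the seed dictionary, and all function names are assumptions for the sketch.

```python
import numpy as np

def random_walk_labels(W, seeds, n_classes):
    """Grady-style random-walker label propagation on a similarity graph.

    W         : (n, n) symmetric affinity matrix (e.g. a Video Similarity Graph)
    seeds     : dict {node_index: class_label} for the few labelled videos
    n_classes : number of action classes
    Returns a hard class label for every node.
    """
    n = W.shape[0]
    # Combinatorial graph Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W
    marked = np.array(sorted(seeds))
    unmarked = np.array([i for i in range(n) if i not in seeds])

    # Partition L and solve the Dirichlet problem  L_uu x_u = -B^T x_m
    L_uu = L[np.ix_(unmarked, unmarked)]
    B = L[np.ix_(marked, unmarked)]

    # One indicator column per class for the seed nodes
    M = np.zeros((len(marked), n_classes))
    for row, node in enumerate(marked):
        M[row, seeds[node]] = 1.0

    # Class-membership probabilities for the unlabelled nodes
    probs = np.linalg.solve(L_uu, -B.T @ M)

    labels = np.empty(n, dtype=int)
    labels[marked] = [seeds[i] for i in marked]
    labels[unmarked] = probs.argmax(axis=1)
    return labels
```

On a toy graph with two tightly connected clusters and one seed per class, every node is assigned the label of the cluster it belongs to; in the paper's setting the nodes would be egocentric video segments and edge weights would come from PHOG/center-surround feature similarity.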
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sahu, A., Bhattacharya, R., Bhura, P., Chowdhury, A.S. (2020). Action Recognition from Egocentric Videos Using Random Walks. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1024. Springer, Singapore. https://doi.org/10.1007/978-981-32-9291-8_31
DOI: https://doi.org/10.1007/978-981-32-9291-8_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9290-1
Online ISBN: 978-981-32-9291-8