Abstract
The widespread adoption of low-cost wearable devices calls for novel paradigms for analysing human behaviour. In particular, when a first-person camera continuously records several hours of the user's life, activity recognition becomes especially challenging: a huge amount of unlabeled data is generated automatically and, despite recent notable attempts, more scalable algorithms and more effective feature representations are still required. In this paper, we address the problem of everyday activity recognition from visual data gathered with a wearable camera, proposing a novel multi-task learning framework. We argue that, even when label information is not available, we can exploit the fact that the tasks of recognizing the daily activities of multiple individuals are related: people typically perform the same actions in the same environment (e.g. at home in the morning, people tend to have breakfast and brush their teeth). To exploit this information we propose a novel multi-task clustering approach: rather than clustering data from different users separately, we look for data partitions that are similar across related tasks. Thorough experiments on two publicly available first-person vision datasets demonstrate that the proposed approach consistently and significantly outperforms several state-of-the-art methods.
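The abstract describes the core idea only at a high level. To make it concrete, the following is a minimal NumPy sketch of one common way to instantiate multi-task clustering: a k-means variant in which each user (task) keeps its own centroids, but a quadratic coupling term pulls corresponding centroids of related tasks towards each other, so the resulting partitions agree across users. The function name `multitask_kmeans`, the coupling penalty, and the parameter `lam` are illustrative assumptions for this sketch; the paper's actual formulation is its own novel algorithm and is not reproduced here.

```python
import numpy as np

def multitask_kmeans(tasks, k, lam=1.0, n_iter=50, seed=0):
    """Cluster several related datasets (one per user) jointly.

    Minimises the per-task k-means distortion plus
    lam * sum_{t != s} ||C_t - C_s||_F^2, a penalty that pulls the
    corresponding centroids of related tasks towards each other.
    """
    rng = np.random.default_rng(seed)
    T = len(tasks)
    # Shared initialisation keeps centroid index j in correspondence
    # across tasks, which the coupling penalty relies on.
    pool = np.vstack(tasks)
    init = pool[rng.choice(len(pool), size=k, replace=False)]
    C = np.stack([init.copy() for _ in range(T)])  # shape (T, k, d)

    labels = [None] * T
    for _ in range(n_iter):
        # Assignment step: each point goes to its task's nearest centroid.
        for t in range(T):
            d2 = ((tasks[t][:, None, :] - C[t][None, :, :]) ** 2).sum(-1)
            labels[t] = d2.argmin(axis=1)
        # Update step: closed-form minimiser of the coupled objective
        # w.r.t. C_t, holding the other tasks' centroids fixed:
        #   c_j^t = (sum of assigned points + lam * sum_{s != t} c_j^s)
        #           / (n_j^t + lam * (T - 1))
        for t in range(T):
            others = C.sum(axis=0) - C[t]  # (k, d), sum over other tasks
            for j in range(k):
                members = tasks[t][labels[t] == j]
                denom = len(members) + lam * (T - 1)
                if denom > 0:
                    C[t, j] = (members.sum(axis=0) + lam * others[j]) / denom
    return C, labels
```

With `lam = 0` this reduces to running k-means on each user independently; larger values of `lam` trade per-user fit for agreement between the users' partitions, which is the behaviour the abstract motivates (related users performing the same activities in the same environments).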
Acknowledgement
This work has been supported by the project Cluster Active Ageing at Home.