Abstract
Detection and recognition of Activities of Daily Living (ADLs) from visual data is a useful tool for unobtrusive home environment monitoring. ADLs are detected spatio-temporally in long videos, while activity recognition is applied for the purposes of human behaviour analysis and life logging. We propose a novel ADL detection scheme for the fast and accurate temporal localization of ADLs. In addition, a novel representation based on trajectories extracted in Activity Areas, together with a hybrid feature descriptor, is proposed to improve ADL recognition results. A temporal sliding window approach classifies overlapping time intervals with a multi-class SVM classifier, and a voting procedure accumulates the recognition results in order to detect ambiguous time intervals. The proposed framework is tested on realistic scenarios of daily living in videos recorded in home and lab environments, and accuracy rates on benchmark datasets are also provided.
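The sliding-window and voting steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the window length, the stride, the 0.6 agreement threshold, and the per-frame voting rule are all assumptions made for illustration, and `toy_classifier` is a hypothetical stand-in for the trained multi-class SVM.

```python
from collections import Counter


def sliding_windows(n_frames, window, stride):
    """Return (start, end) frame intervals; they overlap when stride < window."""
    if n_frames < window:
        return []
    return [(s, s + window) for s in range(0, n_frames - window + 1, stride)]


def vote_per_frame(n_frames, intervals, labels, min_agreement=0.6):
    """Accumulate window labels per frame.

    A frame is marked None (ambiguous) when no window covers it or when
    the winning label holds less than min_agreement of the votes.
    The threshold is an assumed value, not one taken from the chapter.
    """
    votes = [Counter() for _ in range(n_frames)]
    for (start, end), label in zip(intervals, labels):
        for t in range(start, end):
            votes[t][label] += 1

    decided = []
    for counter in votes:
        total = sum(counter.values())
        if total == 0:
            decided.append(None)
            continue
        label, count = counter.most_common(1)[0]
        decided.append(label if count / total >= min_agreement else None)
    return decided


def toy_classifier(interval):
    """Hypothetical stand-in for a multi-class SVM applied to one interval."""
    start, end = interval
    midpoint = (start + end) / 2
    return "drink" if midpoint < 6 else "eat"


intervals = sliding_windows(n_frames=12, window=4, stride=2)
labels = [toy_classifier(iv) for iv in intervals]
frame_labels = vote_per_frame(12, intervals, labels)
print(frame_labels)
```

In this toy run, frames 4 and 5 are covered by one "drink" window and one "eat" window, so they fall below the agreement threshold and are flagged as ambiguous, which is the kind of interval a voting step is meant to expose.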
Acknowledgements
This work was funded by the European Commission under the 7th Framework Program (FP7 2007–2013), grant agreement 288199 Dem@Care.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Avgerinakis, K., Briassouli, A., Kompatsiaris, I. (2015). Activity Detection and Recognition of Daily Living Events. In: Briassouli, A., Benois-Pineau, J., Hauptmann, A. (eds) Health Monitoring and Personalized Feedback using Multimedia Data. Springer, Cham. https://doi.org/10.1007/978-3-319-17963-6_8
DOI: https://doi.org/10.1007/978-3-319-17963-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17962-9
Online ISBN: 978-3-319-17963-6
eBook Packages: Computer Science, Computer Science (R0)