Activity Detection and Recognition of Daily Living Events

Abstract

Detection and recognition of Activities of Daily Living (ADLs) from visual data is a useful tool for unobtrusive monitoring of the home environment. ADLs are localized spatio-temporally in long videos, and activity recognition is then applied for human behaviour analysis and life logging. We propose a novel ADL detection scheme for fast and accurate temporal localization of activities of daily living. In addition, a novel representation based on trajectories extracted in Activity Areas, together with a hybrid feature descriptor, is proposed to improve ADL recognition results. A temporal sliding window approach classifies overlapping time intervals using a multi-class SVM classifier, while a voting procedure accumulates the recognition results in order to detect ambiguous time intervals. The proposed framework is tested on realistic scenarios of daily living in videos recorded in home and lab environments, and accuracy rates on benchmark datasets are also provided.



Acknowledgements

This work was funded by the European Commission under the 7th Framework Program (FP7 2007–2013), grant agreement 288199 Dem@Care.

Author information

Corresponding author

Correspondence to Konstantinos Avgerinakis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Avgerinakis, K., Briassouli, A., Kompatsiaris, I. (2015). Activity Detection and Recognition of Daily Living Events. In: Briassouli, A., Benois-Pineau, J., Hauptmann, A. (eds) Health Monitoring and Personalized Feedback using Multimedia Data. Springer, Cham. https://doi.org/10.1007/978-3-319-17963-6_8

  • DOI: https://doi.org/10.1007/978-3-319-17963-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17962-9

  • Online ISBN: 978-3-319-17963-6

  • eBook Packages: Computer Science (R0)
