
A Dynamic Approach and a New Dataset for Hand-detection in First Person Vision

  • Conference paper
  • First Online:
Computer Analysis of Images and Patterns (CAIP 2015)

Abstract

Hand detection and hand segmentation stand as two of the most prominent objectives in First Person Vision. Their popularity is mainly explained by the importance of reliably detecting and locating the hands when developing human-machine interfaces for emerging wearable cameras. Current developments have focused on the hand-segmentation problem, implicitly assuming that hands are always in the user's field of view. Existing methods are commonly presented together with new datasets; however, given this implicit assumption, none of those datasets ensures a proper composition of frames with and without hands, as the hand-detection problem requires. This paper presents a new dataset for hand detection, carefully designed to guarantee a good balance between positive and negative frames, as well as challenging conditions such as illumination changes, hand occlusions and realistic locations. Additionally, this paper extends a state-of-the-art method with a dynamic filter to improve its detection rate. The improved performance is proposed as a baseline to be used with the dataset.

This work was supported in part by the Erasmus Mundus joint Doctorate in Interactive and Cognitive Environments, which is funded by the EACEA, Agency of the European Commission under EMJD ICE.
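The dynamic filter mentioned in the abstract operates on the output of a frame-by-frame hand/no-hand classifier, stabilising its decisions over time. The paper defines the actual filter; the sketch below only illustrates the general idea with a minimal one-dimensional Kalman filter applied to hypothetical per-frame SVM decision scores. The function name, parameters and score values are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def smooth_scores(scores, process_var=1e-3, meas_var=1e-1):
    """Smooth a sequence of per-frame classifier scores with a 1-D Kalman filter.

    scores      : iterable of raw per-frame decision values (e.g. SVM margins)
    process_var : assumed variance of the underlying hand-presence signal
    meas_var    : assumed variance of the per-frame measurement noise
    (All names and default values are illustrative assumptions.)
    """
    x, p = 0.0, 1.0              # state estimate and its variance
    filtered = []
    for z in scores:
        p += process_var          # predict: state drifts slowly between frames
        k = p / (p + meas_var)    # Kalman gain
        x += k * (z - x)          # update with the new frame-level score
        p *= (1.0 - k)
        filtered.append(x)
    return np.asarray(filtered)

# Usage with hypothetical raw SVM scores: smooth, then threshold at zero.
raw = [0.8, 0.7, -0.9, 0.6, 0.9, 0.5, -0.8, -0.7, -0.9]
hand_present = smooth_scores(raw) > 0.0
```

Smoothing the score sequence before thresholding suppresses isolated per-frame misclassifications, which is the kind of temporal stabilisation a dynamic filter over a frame-by-frame detector aims to provide.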



Author information

Correspondence to Alejandro Betancourt.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Betancourt, A., Morerio, P., Barakova, E.I., Marcenaro, L., Rauterberg, M., Regazzoni, C.S. (2015). A Dynamic Approach and a New Dataset for Hand-detection in First Person Vision. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-23192-1_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23191-4

  • Online ISBN: 978-3-319-23192-1

  • eBook Packages: Computer Science, Computer Science (R0)
