
Object Discovery Using CNN Features in Egocentric Videos

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9117)

Abstract

Lifelogging devices based on photo/video are spreading faster every day. This growth offers great opportunities to develop methods for extracting meaningful information about the user wearing the device and his/her environment. In this paper, we propose a semi-supervised strategy for easily discovering objects relevant to the person wearing a first-person camera. Given the egocentric video sequence acquired by the camera, our method uses both appearance features extracted by means of a deep convolutional neural network and an object refill methodology, which allows discovering objects even when they appear only a few times in the image collection. We validate our method on a sequence of 1,000 egocentric daily images and obtain an F-measure of 0.5, which is 0.17 higher than the state-of-the-art approach.
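The semi-supervised discovery strategy summarized above can be illustrated with a toy sketch: stand-in feature vectors take the place of CNN features, a small labeled seed set initializes per-class centroids, and the remaining samples are assigned iteratively by nearest centroid. Function and variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def discover_labels(features, seed_labels, n_iter=10):
    """Propagate labels from a small seeded subset to all samples
    by iterative nearest-centroid assignment (a k-means-style step).
    `seed_labels` holds -1 for unlabeled samples."""
    labels = seed_labels.copy()
    classes = np.unique(seed_labels[seed_labels >= 0])
    for _ in range(n_iter):
        # Centroid of each class, computed from currently labeled samples.
        centroids = np.stack(
            [features[labels == c].mean(axis=0) for c in classes])
        # Assign every sample to its nearest class centroid.
        dists = np.linalg.norm(
            features[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = classes[dists.argmin(axis=1)]
        # Seed samples always keep their given labels.
        new_labels[seed_labels >= 0] = seed_labels[seed_labels >= 0]
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy example: two well-separated 2-D "feature" clusters,
# one labeled seed per class, the rest unlabeled (-1).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 0.1, (5, 2)),
                   rng.normal(5.0, 0.1, (5, 2))])
seeds = np.full(10, -1)
seeds[0], seeds[5] = 0, 1
print(discover_labels(feats, seeds))  # → [0 0 0 0 0 1 1 1 1 1]
```

In the paper's setting the feature vectors would come from a deep CNN rather than a random generator; the sketch only shows the label-propagation step.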


Notes

  1. Refilling the space with more samples of the same class can form a more compact and clear cluster.

  2. In any case, the refilled samples, which were already labeled, can only get their labels changed if they did not belong to the initial selection set (40%).
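The relabeling rule in note 2 can be sketched as a small function; the data layout (dicts and sets keyed by sample id) and all names are assumptions for illustration only:

```python
def update_refilled_labels(labels, proposed, initial_selection):
    """Apply the refill relabeling rule: a refilled (already labeled)
    sample accepts a proposed new label only if it was NOT part of the
    initial selection set; initial-selection labels stay frozen."""
    updated = dict(labels)
    for sample_id, new_label in proposed.items():
        if sample_id not in initial_selection:
            updated[sample_id] = new_label
    return updated

labels = {"a": "cup", "b": "cup", "c": "phone"}
proposed = {"a": "phone", "c": "cup"}
# "a" is in the initial selection, so its label is frozen;
# "c" was refilled later and may be relabeled.
updated = update_refilled_labels(labels, proposed, initial_selection={"a"})
print(updated)  # → {'a': 'cup', 'b': 'cup', 'c': 'cup'}
```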



Author information

Correspondence to Marc Bolaños.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bolaños, M., Garolera, M., Radeva, P. (2015). Object Discovery Using CNN Features in Egocentric Videos. In: Paredes, R., Cardoso, J., Pardo, X. (eds.) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol. 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_8


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19389-2

  • Online ISBN: 978-3-319-19390-8

  • eBook Packages: Computer Science (R0)
