Fine-grained object recognition in underwater visual data

Spampinato, C.; Palazzo, S.; Joalland, P. H.; Paris, S.; Glotin, H.; Blanc, K.; Lingrand, D.; Precioso, F.

doi:10.1007/s11042-015-2601-x

Published: 24 May 2015

Volume 75, pages 1701–1720, (2016)

Multimedia Tools and Applications

C. Spampinato¹, S. Palazzo¹, P. H. Joalland²,³, S. Paris²,³,⁴, H. Glotin²,³, K. Blanc⁵, D. Lingrand⁵ & F. Precioso⁵

Abstract

In this paper we investigate the fine-grained object categorization problem of determining fish species in low-quality visual data (images and videos) recorded in real-life settings. We first describe a new annotated dataset of about 35,000 fish images (the MA-35K dataset), derived from the Fish4Knowledge project and covering 10 fish species from the Eastern Indo-Pacific bio-geographic zone. We then use a label propagation method to transfer the labels from MA-35K to a set of 20 million fish images, in order to capture the variability in fish appearance. The resulting annotated dataset, containing over one million annotations (AA-1M), was then manually checked to remove false positives as well as images in which fish occlude one another or are only partially visible. Finally, we randomly picked more than 30,000 fish images, distributed among ten fish species and extracted from about 400 10-minute videos, and used this data (both images and videos) for the fish task of the LifeCLEF 2014 contest. Together with the release of this fine-grained visual dataset, we also present two approaches for fish species classification in still images and in videos, respectively. Both approaches showed high classification performance (for some fish species, precision and recall were close to one) and outperformed state-of-the-art methods. In addition, although the dataset is unbalanced in the number of images per species, both methods (especially the one operating on still images) appear to be rather robust against the long-tail curse of data, showing the best performance on the less populated object classes.
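The abstract reports precision and recall per species on an unbalanced dataset. As a minimal sketch of how such per-class scores can be computed, the Python snippet below counts true positives per label and divides by the number of predictions (precision) and by the number of ground-truth items (recall). The function name `per_class_precision_recall` and the toy labels are assumptions made for illustration; they are not the authors' evaluation code or data.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    true_count = Counter(y_true)   # ground-truth items per label
    pred_count = Counter(y_pred)   # predicted items per label
    true_pos = Counter()           # correct predictions per label
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_pos[t] += 1

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = true_pos[label]
        # Define the score as 0.0 when the denominator is zero.
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy, made-up labels (not data from the paper): one frequent and one rare class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 7 + ["rare"] * 3
scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f}")
```

Reporting the scores per class, rather than as a single overall accuracy, is what makes behaviour on the less populated classes visible: on an unbalanced dataset, a high overall accuracy can hide poor recall on rare species.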

Notes

http://www.fish4knowledge.eu
If the main image transformations are captured during the stacked/deep feature extraction pipeline, non-linear classification does not improve results in practice.

Acknowledgments

We thank the Ministère du Redressement Productif (DGCIS) for its support of the RAPID PHRASE project, and the BPI, PACA, and TPM for the FUI14 SYCIE project.

Author information

Authors and Affiliations

Department of Electrical, Electronics and Computer Engineering, University of Catania, Catania, Italy
C. Spampinato & S. Palazzo
Aix-Marseille Université, CNRS, ENSAM, LSIS UMR 7296, 13397, Marseille, France
P. H. Joalland, S. Paris & H. Glotin
Université de Toulon, CNRS, LSIS UMR 7296, 83957, La Garde, France
P. H. Joalland, S. Paris & H. Glotin
Institut Universitaire de France (IUF), 75005, Paris, France
S. Paris
I3S, UMR UNS-CNRS 7271, University of Nice Sophia Antipolis, Nice, France
K. Blanc, D. Lingrand & F. Precioso

Corresponding author

Correspondence to C. Spampinato.

About this article

Cite this article

Spampinato, C., Palazzo, S., Joalland, P.H. et al. Fine-grained object recognition in underwater visual data. Multimed Tools Appl 75, 1701–1720 (2016). https://doi.org/10.1007/s11042-015-2601-x

Received: 01 August 2014
Revised: 23 March 2015
Accepted: 01 April 2015
Published: 24 May 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s11042-015-2601-x
