Abstract
The current rate of biodiversity decline calls for urgent ecological conservation. In response, camera traps are increasingly deployed for wildlife monitoring. Analysing camera trap data can help curb species extinction. However, manual review consumes a substantial amount of time, which limits the use of camera traps for prompt decision-making. Severe visual challenges, together with the tendency of camera traps to record empty frames (frames showing natural backdrops with no wildlife present), make wildlife detection and species recognition demanding tasks. We therefore propose a pipeline for wildlife detection and species recognition to expedite the processing of camera trap sequences. The proposed pipeline consists of three stages: (i) empty frame removal, (ii) wildlife detection, and (iii) species recognition and classification. For these stages, we leverage the vision transformer (ViT), DEtection TRansformer (DETR), vision and detection transformer (ViDT), faster region-based convolutional neural network (Faster R-CNN), Inception v3, and ResNet-50. We examine how well these algorithms perform at new and unseen locations, that is, against the challenges of domain generalisation. We demonstrate the effectiveness of the proposed pipeline using the Caltech Camera Traps (CCT) dataset.
This work is partially supported by the National Mission for Himalayan Studies (NMHS) grant GBPNI/NMHS-2019-20/SG/314.
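The three-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Detection`, `run_pipeline`, `is_empty`, `detect`, and `classify` are hypothetical, and each stage is passed in as a plain callable so that any model (for example a ViT-based filter, a DETR- or Faster R-CNN-style detector, or a ResNet-style classifier) could in principle be plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

# All names here are hypothetical and chosen for illustration only.
Frame = Any                              # e.g. an image array
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Detection:
    frame_index: int  # position of the frame in the input sequence
    box: Box          # bounding box of the detected animal
    species: str      # predicted species label


def run_pipeline(
    frames: Sequence[Frame],
    is_empty: Callable[[Frame], bool],
    detect: Callable[[Frame], List[Box]],
    classify: Callable[[Frame, Box], str],
) -> List[Detection]:
    """Apply the three stages, in order, to a camera trap sequence."""
    results: List[Detection] = []
    for index, frame in enumerate(frames):
        # Stage (i): empty frame removal.
        if is_empty(frame):
            continue
        # Stage (ii): wildlife detection (zero or more boxes per frame).
        for box in detect(frame):
            # Stage (iii): species recognition for each detected animal.
            results.append(Detection(index, box, classify(frame, box)))
    return results
```

Keeping the stages as separate callables means the empty-frame filter runs before the more expensive detector, and each stage can be swapped or evaluated independently, for instance when testing on locations not seen during training.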
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Banerjee, A., Dinesh, D.A., Bhavsar, A. (2023). Perusal of Camera Trap Sequences Across Locations. In: De Marsico, M., Sanniti di Baja, G., Fred, A. (eds) Pattern Recognition Applications and Methods. ICPRAM 2021, ICPRAM 2022. Lecture Notes in Computer Science, vol 13822. Springer, Cham. https://doi.org/10.1007/978-3-031-24538-1_8
DOI: https://doi.org/10.1007/978-3-031-24538-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24537-4
Online ISBN: 978-3-031-24538-1
eBook Packages: Computer Science, Computer Science (R0)