Deep MAnTra: deep learning-based multi-animal tracking for Japanese macaques

  • Original Article
  • Artificial Life and Robotics

Abstract

Multi-instance object tracking is an active research problem in computer vision. Most novel methods analyze and locate targets in videos taken from static camera set-ups, and such systems have proved efficient and effective in established monitoring applications worldwide, such as animal behavior studies and human and road traffic monitoring. However, despite the growing success of computer vision in animal monitoring and behavior analysis, no such system has yet been developed for free-ranging Japanese macaques. Our study therefore aims to establish a tracking system for Japanese macaques in their natural habitat. We begin by training a monkey detector using You Only Look Once (YOLOv4) and investigating the effects of different transfer learning techniques, curriculum learning, and dataset heterogeneity on the model’s accuracy. Using the box detections from our monkey detection model, we apply SuperGlue and Murty’s algorithm to re-identify individual monkeys across succeeding frames. Our Japanese macaque detection model, trained using a YOLOv4 architecture with a spatial attention module and the Mish activation function under a 3-stage training curriculum, yielded the best performance, with a mean \(AP^{50}\) of 96.59%, a precision of 93%, a recall of 96%, and a mean \(IOU_{AP@50}\) of 77.2%. With a MOTA of 91.35% even on our heterogeneous dataset, our tracking system can prove effective and reliable for animal behavior studies.
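The re-identification step described in the abstract (associating detections across succeeding frames) reduces to an assignment problem over pairwise similarity scores. The following is a minimal sketch, not the paper's implementation: it substitutes a plain IoU similarity and the Hungarian method (SciPy's `linear_sum_assignment`) for the SuperGlue matching scores and Murty's k-best assignment algorithm that the paper uses, and assumes axis-aligned `[x1, y1, x2, y2]` boxes.

```python
# Sketch of frame-to-frame re-identification as a linear assignment problem.
# Hypothetical simplification: IoU similarity + Hungarian method stand in for
# the SuperGlue scores and Murty's algorithm used in the actual system.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_detections(prev_boxes, curr_boxes, min_iou=0.3):
    """Assign current detections to previous tracks, maximizing total IoU.

    Returns (prev_idx, curr_idx) pairs; detections left unmatched would
    start new tracks in a full tracker.
    """
    # Cost = 1 - similarity, so minimizing cost maximizes total overlap.
    cost = np.array([[1.0 - iou(p, c) for c in curr_boxes] for p in prev_boxes])
    rows, cols = linear_sum_assignment(cost)
    # Reject matches whose overlap is too small to be the same individual.
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= min_iou]
```

A stronger appearance cue (such as the keypoint-match scores SuperGlue produces) can be dropped into the same cost matrix unchanged; only the similarity function differs.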


References

  1. Meyer JS, Hamel AF (2014) Models of stress in nonhuman primates and their relevance for human psychopathology and endocrine dysfunction. ILAR J 55(2):347–360

  2. Willard SL, Shively CA (2012) Modeling depression in adult female cynomolgus monkeys (Macaca fascicularis). Am J Primatol 74(6):528–542

  3. Matsuzawa T (2018) Hot-spring bathing of wild monkeys in Shiga-Heights: origin and propagation of a cultural behavior. Primates 59(3):209–213

  4. Kawai M (1965) Newly-acquired pre-cultural behavior of the natural troop of Japanese monkeys on Koshima islet. Primates 6(1):1–30

  5. Kawamura S (1959) The process of sub-culture propagation among Japanese macaques. Primates 2(1):43–60

  6. Matsuzawa T (2015) Sweet-potato washing revisited: 50th anniversary of the Primates article. Primates 56(4):285–287

  7. Girshick RB (2015) Fast R-CNN. In: 2015 IEEE international conference on computer vision (ICCV). IEEE, pp 1440–1448

  8. Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: Optimal speed and accuracy of object detection. CoRR (abs/2004.10934). https://doi.org/10.48550/arXiv.2004.10934

  9. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 779–788

  10. Redmon J, Farhadi A (2018) YOLOv3: An incremental improvement. CoRR (abs/1804.02767). https://doi.org/10.48550/arXiv.1804.02767

  11. Wang CY, Bochkovskiy A, Liao HYM (2021) Scaled-YOLOv4: Scaling cross stage partial network. In: 2021 IEEE/CVF conference on computer vision and pattern recognition (CVPR). IEEE, pp 13024–13033

  12. Tan M, Pang R, Le QV (2020) EfficientDet: scalable and efficient object detection. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR). IEEE, pp 10778–10787

  13. Lin TY et al (2014) Microsoft COCO: common objects in context. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T (eds) Computer vision – ECCV. Lecture notes in computer science, vol 8693. Springer, Cham, pp 740–755

  14. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 248–255

  15. Everingham M, Gool L, Williams CK, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88:303–338

  16. Bozinovski S (2020) Reminder of the first paper on transfer learning in neural networks, 1976. Informatica (Slovenia) 44. https://doi.org/10.31449/inf.v44i3.2828

  17. Zhuang F et al (2020) A comprehensive survey on transfer learning. Proc IEEE Inst Electr Electron Eng 109(1):43–76

  18. Bengio Y, Louradour J, Collobert R, Weston J (2009) Curriculum learning. In: 26th annual international conference on machine learning (ICML’09). ACM, pp. 41–48

  19. Soviany P, Ionescu RT, Rota P et al (2022) Curriculum learning: a survey. Int J Comput Vis 130:1526–1565

  20. Clapham M, Miller E, Nguyen M, Darimont CT (2020) Automated facial recognition for wildlife that lack unique markings: a deep learning approach for brown bears. Ecol Evol 10(23):12883–12892

  21. McIntosh D, Marques TP, Albu AB, Rountree R, Leo FD (2020) Movement tracks for the automatic detection of fish behavior in videos. CoRR (abs/2011.14070). https://doi.org/10.48550/arXiv.2011.14070

  22. Sarfati R, Hayes J, Sarfati E, Peleg O (2020) Spatio-temporal reconstruction of emergent flash synchronization in firefly swarms via stereoscopic 360-degree cameras. J R Soc Interface 17:20200179

  23. Labuguen R, Matsumoto J, Negrete SB, Nishimaru H, Nishijo H, Takada M, Go Y, Inoue K-i, Shibata T (2021) MacaquePose: a novel “in the wild” macaque monkey pose dataset for markerless motion capture. Front Behav Neurosci 14:268

  24. Schofield D, Nagrani A, Zisserman A, Hayashi M, Matsuzawa T, Biro D, Carvalho S (2019) Chimpanzee face recognition from videos in the wild using deep learning. Sci Adv 5(9):eaaw0736

  25. Sarlin PE, DeTone D, Malisiewicz T, Rabinovich A (2020) SuperGlue: learning feature matching with graph neural networks. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR). pp 4937–4946

  26. Crouse DF (2016) On implementing 2D rectangular assignment algorithms. IEEE Trans Aeros Electron Syst 52(4):1679–1696

  27. He K, Zhang X, Ren S, Sun J (2014) Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T (eds) Computer vision – ECCV. Lecture notes in computer science, vol 8691. Springer, Cham, pp 346–361

  28. Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition (CVPR). pp 8759–8768

  29. Misra D (2020) Mish: a self regularized non-monotonic neural activation function. In: 2020 British machine vision conference (BMVC). https://doi.org/10.48550/arXiv.1908.08681

  30. Ramachandran P, Zoph B, Le Q (2018) Searching for activation functions. In: 2018 International conference on learning representations (ICLR) workshop. https://doi.org/10.48550/arXiv.1710.05941

  31. Xu B, Wang N, Chen T, Li M (2015) Empirical evaluation of rectified activations in convolutional network. In: 2015 International conference on machine learning (ICML) workshop. https://doi.org/10.48550/arXiv.1505.00853

  32. Woo S, Park J, Lee J, Kweon IS (2018) CBAM: convolutional block attention module. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y (eds) Computer vision – ECCV. Lecture notes in computer science, vol 11211. Springer, Cham, pp 3–19

  33. Krueger KA, Dayan P (2009) Flexible shaping: How learning in small steps helps. Cognition 110(3):380–394

  34. Shimada M, Sueur C (2018) Social play among juvenile wild Japanese macaques (Macaca fuscata) strengthens their social bonds. Am J Primatol 80(1):e22728

  35. Shimada M, Uno T, Nakagawa N, Fujita S, Izawa K (2009) Case study of a one-sided attack by multiple troop members on a nontroop adolescent male and the death of Japanese macaques (Macaca fuscata). Aggress Behav 35(4):334–341

  36. Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics. J Image Video Process 2008. https://doi.org/10.1155/2008/246309

Acknowledgements

This work was partially supported by the joint project of Kyoto University and Toyota Motor Corporation, titled “Advanced Mathematical Science for Mobility Society”, JSPS KAKENHI Grant Numbers 17H05863, and 18K19821. The first author, R. R. Pineda, is supported by a postgraduate scholarship from the Engineering Research and Development for Technology (ERDT), Philippines.

Author information

Corresponding author

Correspondence to Riza Rae Pineda.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was submitted and accepted for the Journal Track of the joint symposium of the 28th International Symposium on Artificial Life and Robotics, the 8th International Symposium on BioComplexity, and the 6th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Beppu, Oita, January 25–27, 2023).

About this article

Cite this article

Pineda, R.R., Kubo, T., Shimada, M. et al. Deep MAnTra: deep learning-based multi-animal tracking for Japanese macaques. Artif Life Robotics 28, 127–138 (2023). https://doi.org/10.1007/s10015-022-00837-9
