Temporal Repetition Counting Based on Multi-stride Collaboration

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14119)

Abstract

Visual repetition occurs in various forms in our world, such as human activities, animal behaviors, and even natural phenomena. Visual repetition counting remains a challenging task, especially in long videos, where repetitions are often discontinuous and their cycles inconsistent. Existing methods that focus on counting repetitive actions in short videos struggle to count repetitions accurately in long videos because of these characteristics. To tackle this challenge, we propose a multi-stride collaborative counting framework based on adaptive temporal correlation to estimate repetitions in both short and long videos. Our framework predicts the final count from the counting results of the same video sampled with different strides. Additionally, since existing repetition counting datasets do not adequately cover all the challenging scenarios considered in our work, we have collected and labeled a new dataset called ActCount, which includes 172 videos with approximately 1,870 annotated repetitive actions. Our dataset includes repetitions that are non-human-centric, making it more realistic and challenging. Our model outperforms all previous models on the RepCount dataset, achieving an MAE of 0.3053 and an OBO of 0.3708, setting a new state of the art.
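
The abstract reports MAE and OBO without defining them. As a minimal sketch, assuming the definitions commonly used in the repetition counting literature (MAE as the mean absolute count error normalized by the ground-truth count, OBO as the fraction of videos whose predicted count is within one of the ground truth), the two metrics could be computed as follows. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mae(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean of |pred - gt| / gt over videos (assumed definition; lower is better)."""
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    if any(g <= 0 for g in gt):
        # Assumption: every video contains at least one repetition.
        raise ValueError("ground-truth counts must be positive")
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def obo(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Fraction of videos with |pred - gt| <= 1 (assumed definition; higher is better)."""
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    hits = sum(1 for p, g in zip(pred, gt) if abs(p - g) <= 1)
    return hits / len(gt)


# Hypothetical predicted and ground-truth counts for three videos.
predicted = [10, 4, 7]
ground_truth = [10, 5, 10]
print(f"MAE = {mae(predicted, ground_truth):.4f}")
print(f"OBO = {obo(predicted, ground_truth):.4f}")
```

Under these assumed definitions, the two numbers move in opposite directions as a model improves: a better model has a lower MAE and a higher OBO.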

Author information

Corresponding author

Correspondence to Jia Su.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gan, G., Su, J., Wen, Z., Zhang, S. (2023). Temporal Repetition Counting Based on Multi-stride Collaboration. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, AM., Ma, W. (eds) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science, vol 14119. Springer, Cham. https://doi.org/10.1007/978-3-031-40289-0_24

  • DOI: https://doi.org/10.1007/978-3-031-40289-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40288-3

  • Online ISBN: 978-3-031-40289-0

  • eBook Packages: Computer Science, Computer Science (R0)
