Detection of complex video events through visual rhythm

Torres, Berthin S.; Pedrini, Helio

doi:10.1007/s00371-016-1321-1

Original Article
Published: 19 October 2016

Volume 34, pages 145–165 (2018)

The Visual Computer

Berthin S. Torres¹ & Helio Pedrini¹

Abstract

The recognition of complex events in videos currently has several important applications, particularly due to the wide availability of digital cameras in environments such as airports, train and bus stations, shopping centers, stadiums, hospitals, schools, buildings, and roads. Advances in digital technology have enhanced the capabilities for detecting video events through the development of devices with high resolution, small physical size, and high sampling rates. This work presents and evaluates the use of feature descriptors extracted from visual rhythms of video sequences in three computer vision problems: abnormal event detection, human action classification, and gesture recognition. Experiments conducted on well-known public datasets demonstrate that the method produces promising results.
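
As a rough illustration of the representation named in the abstract (not the authors' implementation), a visual rhythm is commonly built by sampling each frame along a fixed line and stacking those samples over time into a single 2D image. The sketch below assumes grayscale frames given as nested lists; the function name `visual_rhythm` and the choice of the central row as the sampling line are illustrative assumptions.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # one grayscale frame: rows of pixel intensities


def visual_rhythm(frames: Sequence[Frame]) -> List[List[int]]:
    """Build a horizontal visual rhythm from equally sized frames.

    Each frame contributes its central row, so output[t][x] is the pixel
    at column x of the central row of frame t.
    """
    if not frames:
        return []
    height = len(frames[0])
    width = len(frames[0][0])
    center = height // 2  # sampled row (an illustrative choice)
    rhythm = []
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same dimensions")
        rhythm.append(list(frame[center]))
    return rhythm


# Toy video: 4 frames of 3x5 pixels, with one bright pixel moving right
# along the central row by one column per frame.
frames = [
    [[0] * 5, [255 if x == t else 0 for x in range(5)], [0] * 5]
    for t in range(4)
]
rhythm = visual_rhythm(frames)
```

In this toy video the moving pixel traces a diagonal streak across the rhythm image, which is the kind of motion pattern a descriptor computed on that image could pick up.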

Notes

Although some efforts have been made to differentiate among these terms, in this work, abnormal events will be assumed to be similar to unusual, rare, suspicious, anomalous, irregular, outlying or atypical events.
Normalcy or normality is the state of being normal or usual.
An object that fits the principle of distance conservation [54].
Many works from the literature consider the first k frames for training, where k varies from 200 to 300.

References

Aggarwal, J.K., Ryoo, M.S.: Human activity analysis: a review. ACM Comput. Surv. 43(3), 16 (2011)
Alcantara, M., Moreira, T., Pedrini, H.: Real-time action recognition based on cumulative motion shapes. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2917–2921 (2014)
Alcantara, M., Moreira, T., Pedrini, H.: Real-time action recognition using a multilayer descriptor with variable size. J. Electron. Imaging 25(1), 013020 (2016)
Almotairi, S.M.: Using variations of shape and appearance in alignment methods for classifying human actions. Florida Institute of Technology, Melbourne (2014)
Antonucci, A., De Rosa, R., Giusti, A., Cuzzolin, F.: Robust classification of multivariate time series by imprecise hidden Markov models. Int. J. Approx. Reason. 56, 249–263 (2015)
Berent, J., Dragotti, P.: Segmentation of epipolar-plane image volumes with occlusion and disocclusion competition. In: IEEE 8th Workshop on Multimedia Signal Processing, pp. 182–185 (2006)
Biswas, S., Babu, R.V.: Real time anomaly detection in H.264 compressed videos. In: Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, pp. 1–4. IEEE (2013)
Blackburn, J., Ribeiro, E.: Human motion recognition using isomap and dynamic time warping. In: Human Motion: Understanding, Modeling, Capture and Animation, pp. 285–298. Springer (2007)
Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: International Conference on Computer Vision, Beijing, pp. 1395–1402 (2005)
Bolles, R.C., Baker, H.H.: Epipolar-plane image analysis: a technique for analyzing motion sequences. In: 3rd IEEE Workshop on Computer Vision, Representation, and Control, pp. 168–178. IEEE (1985)
Boughorbel, S., Tarel, J.P., Boujemaa, N.: Generalized histogram intersection kernel for image recognition. In: IEEE International Conference on Image Processing, vol. 3, pp. III–161. IEEE (2005)
Bourke, A., O’Brien, J., Lyons, G.: Evaluation of a threshold-based tri-axial accelerometer fall detection algorithm. Gait Posture 26(2), 194–199 (2007)
Buch, N., Velastin, S., Orwell, J.: A review of computer vision techniques for the analysis of urban traffic. IEEE Trans. Intell. Transp. Syst. 12(3), 920–939 (2011)
Calderara, S., Heinemann, U., Prati, A., Cucchiara, R., Tishby, N.: Detecting anomalies in people’s trajectories using spectral graph analysis. Comput. Vis. Image Underst. 115(8), 1099–1111 (2011)
Candamo, J., Shreve, M., Goldgof, D., Sapper, D., Kasturi, R.: Understanding transit scenes: a survey on human behavior-recognition algorithms. IEEE Trans. Intell. Transp. Syst. 11(1), 206–224 (2010)
Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8(6), 679–698 (1986)
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. 41(3), 1–58 (2009)
Chen, D.Y., Huang, P.C.: Motion-based unusual event detection in human crowds. J. Vis. Commun. Image Represent. 22(2), 178–186 (2011)
Cong, Y., Yuan, J., Liu, J.: Abnormal event detection in crowded scenes using sparse representation. Pattern Recognit. 46(7), 1851–1864 (2013)
Cui, L., Li, K., Chen, J., Li, Z.: Abnormal event detection in traffic video surveillance based on local features. In: 4th International Congress on Image and Signal Processing, vol. 1, pp. 362–366. IEEE (2011)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886–893 (2005)
De Rosa, R., Cesa-Bianchi, N., Gori, I., Cuzzolin, F.: Online action recognition via nonparametric incremental learning. In: British Machine Vision Conference. BMVA Press (2014)
Dee, H.M., Velastin, S.A.: How close are we to solving the problem of automated visual surveillance? Mach. Vis. Appl. 19(5–6), 329–343 (2007)
Devanne, M., Wannous, H., Berretti, S., Pala, P., Daoudi, M., Del Bimbo, A.: Space-time pose representation for 3D human action recognition. In: International Conference on Image Analysis and Processing, pp. 456–464. Springer (2013)
Dollár, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatio-temporal features. In: IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 65–72. IEEE (2005)
Duda, R.O., Hart, P.E.: Use of the Hough transformation to detect lines and curves in pictures. Commun. ACM 15(1), 11–15 (1972)
Fanello, S.R., Gori, I., Metta, G., Odone, F.: Keep it simple and sparse: real-time action recognition. J. Mach. Learn. Res. 14(1), 2617–2640 (2013)
Farneback, G.: Two-frame motion estimation based on polynomial expansion. In: Image Analysis, pp. 363–370. Springer (2003)
Fathi, A., Mori, G.: Action recognition by learning mid-level motion features. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
Fawzy, F., Abdelwahab, M., Mikhael, W.: 2DHOOF-2DPCA contour based optical flow algorithm for human activity recognition. In: IEEE 56th International Midwest Symposium on Circuits and Systems, pp. 1310–1313 (2013)
Feng, J., Zhang, C., Hao, P.: Online learning with self-organizing maps for anomaly detection in crowd scenes. In: 20th International Conference on Pattern Recognition, vol. 1, pp. 3599–3602 (2010)
Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: a survey. Comput. Vis. Image Underst. 134, 1–21 (2015)
Gong, S., Xiang, T.: Person re-identification. In: Visual Analysis of Behaviour, pp. 301–313. Springer (2011)
Gorelick, L., Blank, M., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. IEEE Trans. Pattern Anal. Mach. Intell. 29(12), 2247–2253 (2007)
Guimarães, S., de A.-Araújo, A., Couprie, M., Leite, N.: An approach to detect video transitions based on mathematical morphology. In: International Conference on Image Processing, vol. 3, pp. II–1021–4 (2003)
Guo, K., Ishwar, P., Konrad, J.: Action recognition from video using feature covariance matrices. IEEE Trans. Image Process. 22(6), 2479–2494 (2013)
Horn, B.K., Schunck, B.G.: Determining optical flow. In: 1981 Technical Symposium East, pp. 319–331. International Society for Optics and Photonics (1981)
Hung, T.Y., Lu, J., Tan, Y.P.: Cross-scene abnormal event detection. In: IEEE International Symposium on Circuits and Systems, pp. 2844–2847 (2013)
Ji, S., Xu, W., Yang, M., Yu, K.: 3D Convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)
Jiang, F., Yuan, J., Tsaftaris, S.A., Katsaggelos, A.K.: Anomalous video event detection using spatiotemporal context. Comput. Vis. Image Underst. 115(3), 323–333 (2011)
Jiang, X., Zhong, F., Peng, Q., Qin, X.: Online robust action recognition based on a hierarchical model. Vis. Comput. 30(9), 1021–1033 (2014)
Ke, Y., Sukthankar, R., Hebert, M.: Efficient visual event detection using volumetric features. In: Tenth IEEE International Conference on Computer Vision, vol. 1, pp. 166–173. IEEE (2005)
Kliper-Gross, O., Hassner, T., Wolf, L.: The action similarity labeling challenge. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 615–621 (2012)
Laptev, I.: On space-time interest points. Int. J. Comput. Vis. 64(2–3), 107–123 (2005)
Laptev, I., Marszałek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
Li, W., Mahadevan, V., Vasconcelos, N.: Anomaly detection and localization in crowded scenes. IEEE Trans. Pattern Anal. Mach. Intell. 36(1), 18–32 (2014)
Liu, L., Shao, L.: Learning discriminative representations from RGB-D video data. In: International Joint Conference on Artificial Intelligence, vol. 1, p. 3 (2013)
Lowe, D.: Object recognition from local scale-invariant features. In: Computer Vision, vol. 2, pp. 1150–1157 (1999)
Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. Int. Jt. Conf. Artif. Intell. 81, 674–679 (1981)
Mahadevan, V., Li, W., Bhalodia, V., Vasconcelos, N.: Anomaly detection in crowded scenes. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1975–1981 (2010)
Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: 26th Annual International Conference on Machine Learning, pp. 689–696. ACM (2009)
McCahill, M., Norris, C.: CCTV systems in London: their structures and practices. Tech. rep., Centre for Criminology and Criminal Justice, University of Hull (2003)
Mehran, R., Oyama, A., Shah, M.: Abnormal crowd behavior detection using social force model. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 935–942 (2009)
Mitiche, A.: Rigid body kinematics: some basic notions. In: Computational Analysis of Visual Motion. Advances in Computer Vision and Machine Intelligence, pp. 31–43. Springer, US (1994)
Molchanov, P., Yang, X., Gupta, S., Kim, K., Tyree, S., Kautz, J.: Online detection and classification of dynamic hand gestures with recurrent 3D convolutional neural network. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 4207–4215 (2016)
Moreira, T., Alcantara, M., Pedrini, H., Menotti, D.: Fast and accurate gesture recognition based on motion shapes. In: Iberoamerican Congress on Pattern Recognition, pp. 247–254. Springer (2015)
Nalwa, V.S.: A Guided Tour of Computer Vision. Addison-Wesley Longman Publishing Co. Inc., Boston (1993)
Nam, Y.: Crowd flux analysis and abnormal event detection in unstructured and structured scenes. Multimed. Tools Appl. 72(3), 3001–3029 (2014)
Ngo, C., Pong, T., Chin, R.: Detection of gradual transitions through temporal slice analysis. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1 (1999)
Ngo, C.W., Pong, T.C., Zhang, H.J.: Motion analysis and segmentation through spatio-temporal slices processing. IEEE Trans. Image Process. 12(3), 341–355 (2003)
Niebles, J.C., Fei-Fei, L.: A hierarchical model of shape and appearance for human action classification. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Niebles, J.C., Wang, H., Fei-Fei, L.: Unsupervised learning of human action categories using spatial-temporal words. Int. J. Comput. Vis. 79(3), 299–318 (2008)
Nishida, N., Nakayama, H.: Multimodal gesture recognition using multi-stream recurrent neural network. In: Pacific-Rim Symposium on Image and Video Technology, pp. 682–694. Springer (2015)
Niyogi, S., Adelson, E.: Analyzing and recognizing walking figures in XYT. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 469–474 (1994)
Otsu, N.: A threshold selection method from gray-level histograms. Automatica 11(285–296), 23–27 (1975)
Ozturk, O., Yamasaki, T., Aizawa, K.: Detecting dominant motion flows in unstructured/structured crowd scenes. In: 20th International Conference on Pattern Recognition, pp. 3533–3536 (2010)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Piciarelli, C., Foresti, G.: On-line trajectory clustering for anomalous events detection. Pattern Recognit. Lett. 27(15), 1835–1842 (2006)
Raja, K., Laptev, I., Pérez, P., Oisel, L.: Joint pose estimation and action recognition in image graphs. In: 18th IEEE International Conference on Image Processing, pp. 25–28. IEEE (2011)
Ran, Y.: Symmetry in human motion analysis: theory and experiment. Ph.D. thesis, University of Maryland (2006)
Rautaray, S.S., Agrawal, A.: Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43(1), 1–54 (2015)
da S. Pinto, A., Pedrini, H., Schwartz, W., Rocha, A.: Video-based face spoofing detection through visual rhythm analysis. In: 25th SIBGRAPI Conference on Graphics, Patterns and Images, pp. 221–228 (2012)
Saligrama, V.: Video anomaly detection based on local statistical aggregates. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2112–2119 (2012)
Schindler, K., van Gool, L.: Action snippets: how many frames does human action recognition require? In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)
Schölkopf, B., Williamson, R.C., Smola, A.J., Shawe-Taylor, J., Platt, J.C.: Support vector method for novelty detection. NIPS 12, 582–588 (1999)
Schüldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: 17th International Conference on Pattern Recognition, vol. 3, pp. 32–36. IEEE (2004)
Prison Service Order: Display screen equipment health and safety issues. H.M. Prison Service (2000)
Sobral, A., Vacavant, A.: A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos. Comput. Vis. Image Underst. 122, 4–21 (2014)
Sun, X., Chen, M., Hauptmann, A.: Action recognition via local descriptors and holistic features. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 58–65. IEEE (2009)
Suriani, N.S., Hussain, A., Zulkifley, M.A.: Sudden event recognition: a survey. Sensors 13(8), 9966–9998 (2013)
Tang, X., Zhang, S., Yao, H.: Sparse coding based motion attention for abnormal event detection. In: 20th IEEE International Conference on Image Processing, pp. 3602–3606 (2013)
Tax, D.M., Duin, R.P.: Support vector data description. Mach. Learn. 54(1), 45–66 (2004)
Thida, M., Eng, H.L., Remagnino, P.: Laplacian eigenmap with temporal constraints for local abnormality detection in crowded scenes. IEEE Trans. Cybern. 43(6), 2147–2156 (2013)
Tung, P.T., Ngoc, L.Q.: Elliptical density shape model for hand gesture recognition. In: Fifth Symposium on Information and Communication Technology, pp. 186–191. ACM (2014)
UMN—Detection of Unusual Crowd Dataset (2015) http://mha.cs.umn.edu/
Valio, F.B., Pedrini, H., Leite, N.J.: Fast rotation-invariant video caption detection based on visual rhythm. In: San Martin, C., Kim, S.W. (eds.) Progress in pattern recognition, image analysis, computer vision, and applications, Lecture notes in computer science, pp. 157–164. Springer, Berlin, Heidelberg (2011)
Vishwakarma, S., Agrawal, A.: A survey on activity recognition and behavior understanding in video surveillance. Vis. Comput. 29(10), 983–1009 (2013)
Vo, D.H., Huynh, H.H., Meunier, J.: Geometry-based dynamic hand gesture recognition. J. Sci. Technol. 1, 13–19 (2015)
Wallace, E., Diffley, C., Britain, G.: CCTV: Making it work: CCTV control room ergonomics. Publication (Great Britain. Home Office. Police Scientific Development Branch). Police Scientific Development Branch (1998)
van der Walt, S., Colbert, S.C., Varoquaux, G.: The numpy array: a structure for efficient numerical computation. Comput. Sci. Eng. 13(2), 22–30 (2011)
van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., Yu, T., the scikit-image contributors: scikit-image: image processing in Python. PeerJ 2, e453 (2014)
Wang, S., Huang, K., Tan, T.: A compact optical flow based motion representation for real-time action recognition in surveillance scenes. In: 16th IEEE International Conference on Image Processing, pp. 1121–1124 (2009)
Wang, T., Chen, J., Snoussi, H.: Online detection of abnormal events in video streams. J. Electr. Comput. Eng. 2013, 1–12 (2013)
Wong, S.F., Cipolla, R.: Extracting spatiotemporal interest points using global information. In: IEEE 11th International Conference on Computer Vision, pp. 1–8. IEEE (2007)
Wu, Y., Lim, J., Yang, M.H.: Online object tracking: a benchmark. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2411–2418 (2013)
Yang, W., Wang, Y., Mori, G.: Human action recognition from a single clip per action. In: IEEE 12th International Conference on Computer Vision Workshops, pp. 482–489 (2009)
Yu, M., Liu, L., Shao, L.: Structure-preserving binary representations for RGB-D action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 38(8), 1651–1664 (2016)
Zack, G., Rogers, W., Latt, S.: Automatic measurement of sister chromatid exchange frequency. J. Histochem. Cytochem. 25(7), 741–753 (1977)
Zhang, Y., Qin, L., Yao, H., Huang, Q.: Abnormal crowd behavior detection based on social attribute-aware force model. In: 19th IEEE International Conference on Image Processing, pp. 2689–2692 (2012)
Zhao, B., Fei-Fei, L., Xing, E.P.: Online detection of unusual events in videos via dynamic sparse coding. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3313–3320. IEEE Computer Society (2011)

Acknowledgments

The authors are grateful to FAPESP, CNPq and CAPES for the financial support.

Author information

Authors and Affiliations

Institute of Computing, University of Campinas, Campinas, SP, 13083-852, Brazil
Berthin S. Torres & Helio Pedrini

Corresponding author

Correspondence to Helio Pedrini.

About this article

Cite this article

Torres, B.S., Pedrini, H. Detection of complex video events through visual rhythm. Vis Comput 34, 145–165 (2018). https://doi.org/10.1007/s00371-016-1321-1

Issue Date: February 2018
