Anomaly detection with a moving camera using spatio-temporal codebooks

Multidimensional Systems and Signal Processing

Abstract

This paper proposes a method to detect anomalies in videos acquired by a camera mounted on a moving inspection robot. The method builds on spatio-temporal composition (STC), in which dense sampling breaks the video into small 3D volumes that are used to calculate the probability of the spatio-temporal arrangements. This class of methods has been used successfully on surveillance videos obtained by static cameras. However, when applied to videos recorded by cameras on moving platforms, STC produces a large number of false detections. In this work, we propose two improvements to the STC method that alleviate this problem. First, a two-stage dictionary learning process is performed to allow more reliable anomaly detection. Second, improved spatio-temporal features are employed; these are extracted after an enhanced temporal filtering step that performs a temporal regularization of the video sequence. The proposed approach gives very good results in the identification of anomalies without the need for background subtraction, motion estimation or tracking. The results are shown to be comparable or even superior to those of other state-of-the-art methods using bag-of-video words or other moving-camera surveillance methods. The system is accurate even with no prior knowledge of the type of event to be observed and is robust to cluttered environments, as illustrated by several practical examples. These results are obtained without compromising the performance of the algorithm in the static-camera case.
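
To make the dense-sampling idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a grayscale video stored as a NumPy array of shape (T, H, W), and it replaces the probabilistic STC scoring with a plain nearest-codeword distance as a stand-in. The function names `dense_volumes` and `nearest_codeword_distance`, the volume size, and the stride are all hypothetical choices made for illustration.

```python
import numpy as np

def dense_volumes(video, size=(4, 8, 8), stride=(2, 4, 4)):
    """Densely sample a (T, H, W) video into flattened 3D volumes.

    Returns (volumes, origins): volumes has shape (N, t*h*w) and origins
    has shape (N, 3), holding the (t, y, x) corner of each volume.
    The size and stride defaults are illustrative assumptions.
    """
    T, H, W = video.shape
    t, h, w = size
    st, sy, sx = stride
    volumes, origins = [], []
    for t0 in range(0, T - t + 1, st):
        for y0 in range(0, H - h + 1, sy):
            for x0 in range(0, W - w + 1, sx):
                block = video[t0:t0 + t, y0:y0 + h, x0:x0 + w]
                volumes.append(block.ravel())
                origins.append((t0, y0, x0))
    return np.asarray(volumes, dtype=np.float64), np.asarray(origins)

def nearest_codeword_distance(volumes, codebook):
    """Euclidean distance from each volume to its closest codebook entry.

    This is a simplified anomaly score, assumed here for illustration;
    it is not the probability computed by the STC method.
    """
    # Pairwise squared distances between volumes and codewords, shape (N, K).
    d2 = ((volumes[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return np.sqrt(d2.min(axis=1))
```

In this sketch, a codebook could be built from volumes sampled on anomaly-free footage, and volumes from new footage with a large distance to every codeword would be flagged as candidate anomalies. The temporal filtering and two-stage dictionary learning described in the abstract are not reproduced here.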

Corresponding author

Correspondence to Mateus T. Nakahata.

Cite this article

Nakahata, M.T., Thomaz, L.A., da Silva, A.F. et al. Anomaly detection with a moving camera using spatio-temporal codebooks. Multidim Syst Sign Process 29, 1025–1054 (2018). https://doi.org/10.1007/s11045-017-0486-8

  • DOI: https://doi.org/10.1007/s11045-017-0486-8
