Exploring the Use of Efficient Projection Kernels for Motion Saliency Estimation

Nicora, Elena; Noceti, Nicoletta

doi:10.1007/978-3-031-06433-3_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13233)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

In this paper we investigate the potential of a family of efficient filters, the Gray-Code Kernels, for estimating motion-guided visual saliency. Our implementation applies 3D kernels to overlapping blocks of frames and gathers meaningful spatio-temporal information at very low computational cost. We introduce an attention module based on pooling strategies, which are combined in an unsupervised way to derive a saliency map highlighting the presence of motion in the scene. In the experiments we show that our method effectively and efficiently identifies the portion of the image where motion is occurring, while tolerating a variety of scene conditions.
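
The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes 3D Walsh-Hadamard kernels (one member of the Gray-Code Kernel family), computes the projections directly rather than with the efficient recursive scheme that makes Gray-Code Kernels fast, and uses a simple average of max and mean pooling as a stand-in for the attention module. The function names `hadamard`, `temporal_wh_kernels` and `motion_saliency`, and the block size `n`, are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hadamard(n):
    """Sylvester-construction Hadamard matrix of size n x n (n a power of two)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def temporal_wh_kernels(n):
    """3D Walsh-Hadamard kernels (time, row, col) that vary along time.

    Each kernel is the outer product of three 1D Hadamard rows. Row 0 is
    constant, so kernels whose temporal factor is row 0 are discarded: they
    respond to appearance only, not to change over time.
    """
    h = hadamard(n)
    kernels = np.einsum("it,jy,kx->ijktyx", h, h, h)  # (n, n, n, n, n, n)
    return kernels[1:].reshape(-1, n, n, n)           # drop temporal row 0


def motion_saliency(frames, n=4):
    """Illustrative motion saliency from a (T, H, W) grayscale video.

    Returns an array of shape (T-n+1, H-n+1, W-n+1): one map per
    overlapping block of n consecutive frames.
    """
    frames = np.asarray(frames, dtype=float)
    # All overlapping n x n x n spatio-temporal blocks (stride 1 on each axis).
    blocks = sliding_window_view(frames, (n, n, n))
    kernels = temporal_wh_kernels(n)
    # Direct projection of every block onto every kernel (assumption: the
    # efficient Gray-Code recursion is replaced by brute-force products).
    responses = np.abs(np.einsum("abctyx,ktyx->kabc", blocks, kernels))
    # Assumed pooling: combine max and average pooling across kernels.
    max_pool = responses.max(axis=0)
    avg_pool = responses.mean(axis=0)
    return 0.5 * (max_pool + avg_pool)
```

For a video that does not change over time, every temporally varying kernel sums to zero along the time axis, so the map is zero everywhere; regions where intensity changes between frames produce non-zero responses. For an input of shape (8, 16, 16) and `n=4`, `motion_saliency` returns an array of shape (5, 13, 13).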

Notes

1. The implementation of our method in Python will soon be made publicly available.

Acknowledgements

This work has been carried out at the Machine Learning Genoa (MaLGa) center, Università di Genova (IT). It has been supported by AFOSR with the project “Cognitively-inspired architectures for human motion understanding”, grant no. FA8655-20-1-7035.

Author information

Authors and Affiliations

MaLGa-DIBRIS, Università degli Studi di Genova, Genoa, Italy
Elena Nicora & Nicoletta Noceti

Authors

Elena Nicora
Nicoletta Noceti

Corresponding author

Correspondence to Elena Nicora.

Editor information

Editors and Affiliations

Boston University, Boston, MA, USA
Stan Sclaroff
National Research Council, Lecce, Italy
Cosimo Distante
National Research Council, Lecce, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni M. Farinella
Technische Universität München, Garching, Germany
Federico Tombari

Cite this paper

Nicora, E., Noceti, N. (2022). Exploring the Use of Efficient Projection Kernels for Motion Saliency Estimation. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_14

DOI: https://doi.org/10.1007/978-3-031-06433-3_14
Published: 15 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06432-6
Online ISBN: 978-3-031-06433-3
eBook Packages: Computer Science, Computer Science (R0)