Abstract
The availability of new techniques and tools for video surveillance, together with the capability of storing huge amounts of visual data acquired by hundreds of cameras every day, calls for a convergence between pattern recognition, computer vision and multimedia paradigms. The need for this convergence is evident in new research projects that attempt to apply both ontology-based retrieval and video analysis techniques to the field of surveillance. This paper presents the ViSOR (Video Surveillance Online Repository) framework, designed as an open platform for collecting, annotating, retrieving, and sharing surveillance videos, as well as for evaluating the performance of automatic surveillance systems. Annotations are based on a reference ontology that integrates hundreds of concepts, some of them drawn from the LSCOM and MediaMill ontologies. A new annotation classification schema is also provided to identify the spatial, temporal and domain level of detail used. The ViSOR web interface supports video browsing, querying by annotated concepts or by keywords, compressed video previewing, and media downloading and uploading. Finally, ViSOR includes a performance evaluation desk that can be used to compare different annotations.
Acknowledgements
This work is supported by the project VIDI-Video (Interactive semantic video search with a large thesaurus of machine-learned audio-visual concepts), funded by E.C. FP6.
Cite this article
Vezzani, R., Cucchiara, R. Video Surveillance Online Repository (ViSOR): an integrated framework. Multimed Tools Appl 50, 359–380 (2010). https://doi.org/10.1007/s11042-009-0402-9
DOI: https://doi.org/10.1007/s11042-009-0402-9