Abstract
This article presents GAT, a Graphical Annotation Tool based on a region-based hierarchical representation of images. The proposed solution uses Partition Trees to navigate through the image segments, which are automatically defined at different spatial scales. Moreover, the system focuses on the navigation through ontologies for a semantic annotation of objects and of the parts that compose them. The tool has been designed under usability criteria to minimize user interaction by trying to predict the future selection of regions and semantic classes. The implementation uses MPEG-7/XML input and output data to allow interoperability with any type of Partition Tree. This tool is publicly available and its source code can be downloaded under a free software license.
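As a rough illustration of the kind of navigation described above, the following sketch models a Partition Tree as a plain tree of regions, where moving to the parent selects a coarser spatial scale and moving to the children selects a finer one. It is a minimal sketch under stated assumptions, not GAT's implementation: the names `Region`, `coarser` and `finer`, as well as the region labels, are hypothetical, and the MPEG-7/XML serialization is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality avoids recursing through parent links
class Region:
    """One node of a hypothetical partition tree (names are illustrative only)."""
    label: str
    children: List["Region"] = field(default_factory=list)
    parent: Optional["Region"] = field(default=None, repr=False)

    def add_child(self, child: "Region") -> "Region":
        """Attach a finer-scale region below this one and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def leaves(self) -> List["Region"]:
        """Return the finest-scale regions covered by this node, left to right."""
        if not self.children:
            return [self]
        result: List["Region"] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


def coarser(region: Region) -> Region:
    """Move one level up the tree; the root is returned unchanged."""
    return region.parent if region.parent is not None else region


def finer(region: Region) -> List[Region]:
    """Move one level down the tree; a leaf is returned as a one-element list."""
    return list(region.children) if region.children else [region]


# Assumed example: a whole image split into a person and a background,
# with the person further split into two parts.
root = Region("image")
person = root.add_child(Region("person"))
background = root.add_child(Region("background"))
face = person.add_child(Region("face"))
body = person.add_child(Region("body"))
```

In this sketch, a user who has selected `face` and wants the whole object would call `coarser(face)` to obtain `person`, which loosely mirrors annotating an object and the parts that compose it at different scales.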
Acknowledgements
This work was partially funded by the Catalan Broadcasting Corporation (CCMA) and Mediapro S.L. through the Spanish project CENIT-2007-1012 i3media, by the TEC2007-66858/TCM PROVEC project of the Spanish Government, and by a grant from the Commissioner for Universities and Research of the Innovation, Universities and Industry Department of the Catalan Government.
Copyright warnings
The “TV anchor” and “Formula 1” key-frames used in this paper belong to TVC, Televisió de Catalunya, and are copyright protected. These key-frames have been provided by TVC solely for research purposes within the framework of the i3media project.
The “soccer” key-frame used in this paper belongs to MEDIAPRO, S.L., and is copyright protected. This key-frame has been provided by MEDIAPRO, S.L. solely for research purposes within the framework of the i3media project.
Electronic supplementary material
Below is the link to the electronic supplementary material.
(MPG 12328 kb)
(MPG 8570 kb)
(MPG 7994 kb)
(MPG 7814 kb)
Cite this article
Giro-i-Nieto, X., Camps, N. & Marques, F. GAT: a Graphical Annotation Tool for semantic regions. Multimed Tools Appl 46, 155–174 (2010). https://doi.org/10.1007/s11042-009-0389-2
DOI: https://doi.org/10.1007/s11042-009-0389-2