ABSTRACT
The widespread adoption of Web 2.0 applications has produced vast amounts of user-generated multimedia content, motivating its use as training data. However, because the associated annotations are global (image-level) rather than region-level, and because the accompanying information is noisy and often ambiguous, such examples are not directly suitable as learning samples. Nevertheless, the sheer volume of data currently hosted in social networks affords us the luxury of discarding a substantial number of candidate learning examples, provided we can devise a gauging mechanism that filters out ambiguous or noisy samples. Our objective in this work is to define a measure of visual ambiguity, which arises from the visual similarity of semantically dissimilar concepts, in order to support the selection of positive training regions from user-tagged images. This is achieved by limiting the search space to the images most likely to contain the desired regions, while excluding visually ambiguous objects that could confuse the selection algorithm. Experimental results show that employing visual ambiguity yields better separation between the targeted true positive regions and the undesired negative ones.
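The filtering mechanism described above can be sketched as follows: score how ambiguous each tagged image is for the target concept, then prune the candidate pool before region selection. This is a minimal illustration rather than the paper's implementation; the visual_similarity and semantic_similarity functions, the image record layout, and the threshold value are assumptions, standing in for concept-level scores that could come, for example, from classifier confusion statistics and a WordNet-based relatedness measure.

# Minimal sketch (not the authors' implementation) of pruning user-tagged
# images by a hypothetical visual-ambiguity score before region selection.
# visual_similarity(a, b) and semantic_similarity(a, b) are assumed to be
# caller-supplied functions returning scores in [0, 1].

def ambiguity(target, other, visual_similarity, semantic_similarity):
    # Concepts that look alike but mean different things are the
    # confusing ones: high visual similarity, low semantic similarity.
    return visual_similarity(target, other) * (1.0 - semantic_similarity(target, other))

def low_ambiguity_images(images, target, visual_similarity,
                         semantic_similarity, threshold=0.3):
    # Keep images tagged with the target concept whose co-occurring tags
    # are unlikely to confuse the positive-region selection step.
    selected = []
    for img in images:  # each img is a dict: {"id": ..., "tags": [...]}
        if target not in img["tags"]:
            continue
        others = [t for t in img["tags"] if t != target]
        if not others:
            selected.append(img)  # no co-occurring tags, nothing to confuse
            continue
        worst = max(ambiguity(target, t, visual_similarity,
                              semantic_similarity) for t in others)
        if worst <= threshold:
            selected.append(img)
    return selected

Under these assumptions, an image tagged {"sea", "sky"} would be rejected for the target "sea" if "sky" scores as visually similar yet semantically distant, whereas an image tagged {"sea", "boat"} would be retained; the surviving images form the reduced search space from which positive regions are then selected.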