Abstract
Although various kinds of image features have been proposed, no single feature is good enough to replace all the others in multimedia analysis applications such as image annotation. In this paper, we propose a novel image representation, Semantic Gist (SemanGist), that combines the merits of multiple features automatically. Given a local image patch, SemanGist converts multiple low-level features of the patch into a compact vector of prediction scores over a few predefined semantic categories. To this end, a discriminative multi-label boosting algorithm is adopted. Because this SemanGist output is local, it allows semantic spatial context among adjacent patches to be incorporated. For applications like image annotation, taking label compatibility into account in this way may further reduce annotation errors. To enforce label compatibility, the same boosting algorithm is applied to the SemanGist representation together with the low-level features. Experiments on an image annotation task show that SemanGist not only yields a compact representation but also incorporates spatial context at low run-time computational cost.
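The second stage described above, in which a patch's low-level features are used together with the SemanGist scores of adjacent patches, can be illustrated with a minimal sketch. The abstract does not specify the neighbourhood, the border handling, or the feature layout, so the 4-neighbourhood, the zero padding at image borders, and the name `context_features` are all assumptions made for illustration. The boosting classifier itself is not implemented here.

```python
def context_features(low_level, scores):
    """Build one input vector per patch for a second-stage classifier.

    low_level: grid (rows x cols) of low-level feature lists.
    scores:    grid (rows x cols) of per-category prediction scores,
               standing in for first-stage SemanGist output.
    Assumption: context is the patch itself plus its 4 adjacent patches,
    and missing neighbours at the border contribute zeros.
    """
    h, w = len(scores), len(scores[0])
    k = len(scores[0][0])
    zero = [0.0] * k

    def at(i, j):
        # Scores of patch (i, j), or zeros if it lies outside the grid.
        return scores[i][j] if 0 <= i < h and 0 <= j < w else zero

    out = []
    for i in range(h):
        row = []
        for j in range(w):
            vec = list(low_level[i][j])
            # Order: self, above, below, left, right.
            for di, dj in ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)):
                vec.extend(at(i + di, j + dj))
            row.append(vec)
        out.append(row)
    return out


# Toy 2x2 patch grid: 1 low-level dimension, 2 semantic categories.
low_level = [[[0.1], [0.2]], [[0.3], [0.4]]]
scores = [[[0.9, 0.1], [0.8, 0.2]], [[0.3, 0.7], [0.4, 0.6]]]
features = context_features(low_level, scores)
print(len(features[0][0]))  # 1 low-level value + 5 * 2 scores = 11
```

Each resulting vector stays short because the context is expressed through a handful of category scores per neighbour rather than through the neighbours' full low-level features, which is the compactness argument the abstract makes.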
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, D., Liu, X., Wang, D., Li, J., Zhang, B. (2008). SemanGist: A Local Semantic Image Representation. In: Huang, YM.R., et al. Advances in Multimedia Information Processing - PCM 2008. PCM 2008. Lecture Notes in Computer Science, vol 5353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89796-5_64
DOI: https://doi.org/10.1007/978-3-540-89796-5_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89795-8
Online ISBN: 978-3-540-89796-5
eBook Packages: Computer Science, Computer Science (R0)