ABSTRACT
A huge amount of manual effort is required to annotate large image/video archives with text annotations. Several recent works have attempted to automate this task by employing supervised learning approaches to associate visual information extracted from segmented images with semantic concepts provided by the associated text. The main limitation of such approaches, however, is that a large labeled training corpus is still needed for effective learning, and semantically meaningful segmentation of images is in general unavailable. This paper explores the use of a bootstrapping approach to tackle this problem. The idea is to start from a small set of labeled training examples and successively annotate a larger set of unlabeled examples. This is done using the co-training approach, in which two "statistically independent" classifiers are used to co-train and co-annotate the unlabeled examples. An active learning approach is used to select the best examples to label at each stage of learning in order to maximize the learning objective. To accomplish this, we break the task of annotating images into the sub-tasks of: (a) segmenting images into meaningful units, (b) extracting appropriate features for the units, and (c) associating these features with text. Because of the uncertainty in sub-tasks (a) and (b), we adopt two independent segmentation methods (task a) and two independent sets of features (task b) to support co-training. We carried out experiments using a mid-sized image collection (comprising about 6,000 images from CorelCD, PhotoCD and the Web) and demonstrated that our bootstrapping approach significantly improves annotation performance, by about 10% in terms of the F1 measure, compared to the best results obtained from the traditional supervised learning approach. Moreover, the bootstrapping approach has the key advantage of requiring far fewer labeled examples for training.
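The co-training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the margin-based confidence score, the function names (`centroid_fit`, `centroid_predict`, `co_train`), the parameters `rounds` and `per_round`, and the toy data are all assumptions made for illustration. The active learning step that selects examples for manual labeling is omitted.

```python
def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def centroid_predict(model, x):
    """Return (label, confidence) for one example.

    Confidence is the gap between the squared distances to the two
    nearest centroids (a hypothetical stand-in for a real confidence score).
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, m)), c) for c, m in model.items()
    )
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin


def co_train(view_a, view_b, seed_labels, rounds=5, per_round=2):
    """Grow a labeled set from a small seed using two feature views.

    In each round, one classifier per view is trained on the current labeled
    pool, and each adds its `per_round` most confident predictions on
    unlabeled examples to the shared pool.
    """
    labels = dict(seed_labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view_a)) if i not in labels]
        if not unlabeled:
            break
        idx = sorted(labels)
        newly_labeled = {}
        for view in (view_a, view_b):
            model = centroid_fit([view[i] for i in idx], [labels[i] for i in idx])
            scored = sorted(
                ((centroid_predict(model, view[i]), i) for i in unlabeled),
                key=lambda item: -item[0][1],
            )
            for (label, _), i in scored[:per_round]:
                # If both views pick the same example, keep the first label.
                newly_labeled.setdefault(i, label)
        labels.update(newly_labeled)
    return labels


# Toy data: eight "regions", each described by two independent feature views.
view_a = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]]
view_b = [[1.0], [1.2], [0.9], [1.1], [9.0], [9.1], [8.8], [9.2]]
seed = {0: "sky", 4: "grass"}  # only two labeled examples to start from

result = co_train(view_a, view_b, seed)
```

Starting from two labeled examples, `result` holds a label for every region in the toy set. In a realistic setting the two views would come from different segmentations or feature sets, as the abstract describes, and confident but wrong predictions could propagate, which is one reason to pair co-training with selective manual labeling.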