Abstract
Conventional object recognition techniques rely heavily on manually annotated image datasets to achieve good performance. However, collecting high-quality datasets is laborious. Image search engines such as Google Images appear to offer large quantities of object images, but a large portion of the returned images are irrelevant. In this paper, we propose a semi-supervised framework for learning visual categories from Google Images. We exploit a co-training algorithm, the CoBoost algorithm, and integrate it with two kinds of features, the 1st and 2nd order features, which define a bag-of-words representation and the spatial relationships between local features, respectively. We create two boosting classifiers, based on the 1st and 2nd order features respectively, during training, in which one classifier provides labels for the other. The 2nd order features are generated dynamically rather than extracted exhaustively, which avoids high computational cost. An active learning technique is also introduced to further improve performance. Experimental results show that the object models learned from Google Images by our method are competitive with state-of-the-art unsupervised approaches and some supervised techniques on standard benchmark datasets.
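The idea of two classifiers labelling data for each other can be illustrated with a minimal sketch. This is not the CoBoost algorithm or the authors' implementation: it is plain co-training with small AdaBoost ensembles of decision stumps, and the two toy "views" merely stand in for the 1st and 2nd order features. All function names, parameters, and data below are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (error, feature, threshold, sign) of the best weighted stump."""
    best = None
    for j in range(len(X[0])):
        vals = sorted({x[j] for x in X})
        # Candidate thresholds: the minimum value plus midpoints between neighbours.
        thresholds = [vals[0]] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
        for t in thresholds:
            for s in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (s if x[j] >= t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, rounds=5):
    """Plain AdaBoost over stumps; returns a list of (alpha, feature, threshold, sign)."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, j, t, s = train_stump(X, y, w)
        err = min(max(err, 1e-9), 1 - 1e-9)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, s))
        w = [wi * math.exp(-alpha * yi * (s if x[j] >= t else -s))
             for x, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def score(model, x):
    """Signed confidence of the boosted ensemble for one example."""
    return sum(a * (s if x[j] >= t else -s) for a, j, t, s in model)

def fit(view, labels):
    """Train on the examples that currently carry a label (+1 or -1)."""
    idx = [i for i, l in enumerate(labels) if l is not None]
    return adaboost([view[i] for i in idx], [labels[i] for i in idx])

def transfer(model, view, target_labels, k):
    """Label, for the other classifier, the k unlabeled examples this model is surest about."""
    cand = [(abs(score(model, view[i])), i)
            for i, l in enumerate(target_labels) if l is None]
    for _, i in sorted(cand, reverse=True)[:k]:
        target_labels[i] = 1 if score(model, view[i]) >= 0 else -1

def co_train(view1, view2, labels, iterations=3, per_round=2):
    """Simplified co-training: each view's classifier feeds labels to the other."""
    labels1, labels2 = list(labels), list(labels)
    for _ in range(iterations):
        m1, m2 = fit(view1, labels1), fit(view2, labels2)
        transfer(m1, view1, labels2, per_round)  # classifier 1 teaches classifier 2
        transfer(m2, view2, labels1, per_round)  # classifier 2 teaches classifier 1
    return fit(view1, labels1), fit(view2, labels2), labels1, labels2

# Toy data: eight examples, two views, only two examples labelled at the start.
view1 = [[0.9], [0.8], [0.1], [0.2], [0.7], [0.85], [0.15], [0.3]]
view2 = [[5.0], [4.5], [1.0], [0.5], [4.0], [4.8], [0.8], [1.2]]
seed_labels = [1, None, -1, None, None, None, None, None]

m1, m2, labels1, labels2 = co_train(view1, view2, seed_labels)
print(labels1)
print(labels2)
```

The sketch keeps a separate label list per classifier so that every new label a classifier trains on comes from the other view. It omits the dynamic generation of 2nd order features and the active learning step mentioned in the abstract.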
Acknowledgements
This work is supported by the National Basic Research Priorities Programme (No. 2007CB311004), the National Science and Technology Support Plan (No. 2006BAC08B06), and the National Science Foundation of China (No. 60775035, No. 60903141, No. 60933004, No. 60970088).
Cite this article
Liu, X., Shi, ZP. & Shi, ZZ. A co-boost framework for learning object categories from Google Images with 1st and 2nd order features. Vis Comput 30, 5–17 (2014). https://doi.org/10.1007/s00371-012-0772-2
DOI: https://doi.org/10.1007/s00371-012-0772-2