Abstract
When large-scale online images come into view, it is very attractive to incorporate visual concept network for image summarization, organization and exploration. In this paper, we have developed an automatic algorithm for visual concept network generation by determining the diverse visual similarity contexts between the image concepts. To learn more reliable inter-concept visual similarity contexts, the images with diverse visual properties are crawled from multiple sources and multiple kernels are combined to characterize the diverse visual similarity contexts between the images and handle the issue of sparse image distribution more effectively in the high-dimensional multi-modal feature space. Kernel canonical correlation analysis (KCCA) is used to characterize the diverse inter-concept visual similarity contexts more accurately, so that our visual concept network can have better coherence with human perception. A similarity-preserving visual concept network visualization technique is developed to assist users on assessing the coherence between their perceptions and the inter-concept visual similarity contexts determined by our algorithm. Our experimental results on large-scale image collections have observed very good results.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. PAMI 22(12), 1349–1380 (2000)
Hauptmann, A., Yan, R., Lin, W.-H., Christel, M., Wactlar, H.: Can high-level concepts fill the semantic gap in video retrieval? A case study with broadcast news. IEEE Trans. on Multimedia 9(5), 958–966 (2007)
Benitez, A.B., Smith, J.R., Chang, S.-F.: MediaNet: A multimedia information network for knowledge representation. In: Proc. SPIE, vol. 4210 (2000)
Naphade, M., Smith, J.R., Tesic, J., Chang, S.-F., Hsu, W., Kennedy, L., Hauptmann, A., Curtis, J.: Large-scale concept ontology for multimedia. IEEE Multimedia (2006)
Lowe, D.: Distinctive image features from scale invariant keypoints. Intl. Journal of Computer Vision 60, 91–110 (2004)
Wu, L., Li, M., Li, Z., Ma, W.-Y., Yu, N.: Visual lanuage modeling for image classification. In: ACM MIR (2007)
Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (surf). Comput. Vis. Image Underst. 110(3), 346–359 (2008)
Grin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
Russell, B.C., Torralba, A., Murphy, K.P., Freeman, W.T.: Labelme: A database and web-based tool for image annotation. Int. J. Comput. Vision 77(1-3), 157–173 (2008)
Sonnenburg, S., Ratsch, G., Schafer, C., Scholkopf, B.: Large scale multiple kernel learning. Journal of Machine Learning Research 7, 1531–1565 (2006)
Zhang, J., Marszalek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: A comprehensive study. Intl. Journal of Computer Vision 73(2), 213–238 (2007)
Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing features: efficient boosting procedures for multiclass object detection. In: IEEE CVPR (2004)
Gao, Y., Peng, J., Luo, H., Keim, D., Fan, J.: An Interactive Approach for Filtering out Junk Images from Keyword-Based Google Search Results. IEEE Trans. on Circuits and Systems for Video Technology 19(10) (2009)
Huang, J., Kumar, S.R., Zabih, R.: An automatic hierarchical image classification scheme. In: ACM Multimedia, Bristol, UK (1998)
Vasconcelos, N.: Image indexing with mixture hierarchies. In: IEEE CVPR (2001)
Li, J., Wang, J.Z.: Automatic linguistic indexing of pictures by a statistical modeling approach. IEEE Trans. on PAMI 25(9), 1075–1088 (2003)
Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: IEEE CVPR (2005)
Barnard, K., Forsyth, D.: Learning the semantics of words and pictures. In: IEEE ICCV, pp. 408–415 (2001)
Naphade, M., Huang, T.S.: A probabilistic framework for semantic video indexing, filterig and retrieval. IEEE Trans. on Multimedia 3(1), 141–151 (2001)
Hardoon, D.R., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: An overview with application to learning methods, Technical Report, CSD-TR-03-02, University of London (2003)
Wu, L., Hua, X.-S., Yu, N., Ma, W.-Y., Li, S.: Flickr distance. ACM Multimedia (2008)
Cilibrasi, R., Vitanyi, P.: The Google similarity distance. IEEE Trans. Knowledge and Data Engineering 19 (2007)
Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Boston (1998)
Lamping, J., Rao, R.: The hyperbolic browser: A focus+content technique for visualizing large hierarchies. Journal of Visual Languages and Computing 7, 33–55 (1996)
Cox, T., Cox, M.: Multidimensional Scaling. Chapman and Hall, Boca Raton (2001)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, C., Luo, H., Fan, J. (2010). Generating Visual Concept Network from Large-Scale Weakly-Tagged Images. In: Boll, S., Tian, Q., Zhang, L., Zhang, Z., Chen, YP.P. (eds) Advances in Multimedia Modeling. MMM 2010. Lecture Notes in Computer Science, vol 5916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11301-7_27
Download citation
DOI: https://doi.org/10.1007/978-3-642-11301-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11300-0
Online ISBN: 978-3-642-11301-7
eBook Packages: Computer ScienceComputer Science (R0)