Abstract
Keyword-based image search engines such as Google Images are now very popular for accessing large numbers of images on the Internet. Because only the text information that is directly or indirectly linked to the images is used for image indexing and retrieval, most existing image search engines, including Google Images, may return large numbers of junk images that are irrelevant to the given queries. To filter out the junk images from Google Images, we have developed a kernel-based image clustering technique that partitions the images returned by Google Images into multiple visually similar clusters. In addition, users can provide feedback to update the underlying kernels, so that the kernels characterize the diversity of visual similarities between the images more accurately. To help users assess the quality of the image kernels and the relevance of the returned images, a novel framework is developed to visualize large numbers of returned images more intuitively according to their visual similarity. Experiments on diverse queries on Google Images have shown that our proposed algorithm can filter out the junk images effectively. An online demo is also released for public evaluation at: http://www.cs.uncc.edu/~jfan/google − demo/.
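The abstract does not say which kernel or clustering algorithm is used, so the following is only a minimal sketch of the general idea of kernel-based clustering. It assumes each returned image has already been reduced to a numeric feature vector, and it uses a Gaussian (RBF) kernel with kernel k-means as a stand-in. The names `rbf_kernel`, `kernel_kmeans`, and the parameter `gamma` are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_kmeans(features, k, gamma=1.0, max_iter=50):
    """Partition feature vectors into k clusters with kernel k-means.

    Returns one cluster label per input vector. This is an illustrative
    stand-in, not the clustering method of the paper.
    """
    n = len(features)
    # Precompute the full kernel (similarity) matrix.
    K = [[rbf_kernel(features[i], features[j], gamma) for j in range(n)]
         for i in range(n)]
    # Deterministic round-robin initialisation keeps the sketch reproducible.
    labels = [i % k for i in range(n)]
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster mean in feature space:
        # (1/|C|^2) * sum of K[j][l] over all j, l in C.
        within = [
            sum(K[j][l] for j in members for l in members) / len(members) ** 2
            if members else 0.0
            for members in clusters
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in members) / len(members)
                # Squared feature-space distance from point i to the mean of c.
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_d:
                    best_c, best_d = c, dist
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy "image features": two visually distinct groups.
toy_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_labels = kernel_kmeans(toy_features, k=2)
```

In a setting like the one described above, user feedback could be modelled as a change to the kernel (for example a different `gamma` or a re-weighted feature vector) followed by re-clustering; how the paper actually updates its kernels is not stated in the abstract.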
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gao, Y., Fan, J., Luo, H., Satoh, S. (2008). A Novel Approach for Filtering Junk Images from Google Search Results. In: Satoh, S., Nack, F., Etoh, M. (eds) Advances in Multimedia Modeling. MMM 2008. Lecture Notes in Computer Science, vol 4903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77409-9_1
DOI: https://doi.org/10.1007/978-3-540-77409-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77407-5
Online ISBN: 978-3-540-77409-9
eBook Packages: Computer Science, Computer Science (R0)