Abstract
This paper presents an efficient approach to grouping and summarizing large-scale image datasets gathered from the internet. Our method first employs the bag-of-visual-words model, which has been used successfully in image retrieval, to measure the similarity between images and divide a large image collection into separate coarse groups. Next, within each group, we match features between each pair of images using an area ratio constraint, which is an affine invariant. The number of matched features is taken as the new similarity between images, and the initial grouping is refined accordingly. Finally, one canonical image is chosen from each group as its summary. The proposed approach is tested on two datasets consisting of thousands of images collected from a photo-sharing website. The experimental results demonstrate the efficiency and effectiveness of our method.
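The area ratio constraint rests on a standard geometric fact: an affine map scales every area by the same factor, so the ratio of two triangle areas is unchanged. The sketch below only illustrates that property and is not the paper's implementation. The function names, the sample points, the affine map, and the 10% tolerance are all assumptions made for this example.

```python
# Illustrative sketch (hypothetical names): the ratio of two triangle areas
# is preserved by an affine map, so it can be used to test match consistency.

def triangle_area(p, q, r):
    """Unsigned area of the triangle with 2D vertices p, q, r."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0

def apply_affine(A, t, p):
    """Map point p by x -> A x + t, where A is a 2x2 matrix as nested tuples."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])

def area_ratio(a, b, c, d):
    """Ratio area(a, b, c) / area(a, b, d); assumes a, b, d are not collinear."""
    return triangle_area(a, b, c) / triangle_area(a, b, d)

def ratios_agree(pts1, pts2, tol=0.1):
    """True if four putative matches give area ratios within a relative tolerance.

    The tolerance value is an assumption of this sketch, not taken from the paper.
    """
    r1 = area_ratio(*pts1)
    r2 = area_ratio(*pts2)
    return abs(r1 - r2) <= tol * max(r1, r2)

# Four keypoints in one image and their positions under an arbitrary affine map.
pts1 = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (2.0, -5.0)]
A = ((1.5, 0.4), (-0.3, 0.8))
t = (10.0, -2.0)
pts2 = [apply_affine(A, t, p) for p in pts1]

# Same set, but with the fourth match replaced by an unrelated location.
pts2_bad = pts2[:3] + [(100.0, 100.0)]

print(ratios_agree(pts1, pts2))      # consistent matches
print(ratios_agree(pts1, pts2_bad))  # inconsistent fourth match
```

In this toy setup the correctly mapped points keep the same area ratio while the set with a wrong fourth correspondence does not. That difference is what allows a ratio check of this kind to reject mismatched features.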
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, H., Wang, Q. (2009). Grouping and Summarizing Scene Images from Web Collections. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2009. Lecture Notes in Computer Science, vol 5876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10520-3_29
DOI: https://doi.org/10.1007/978-3-642-10520-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10519-7
Online ISBN: 978-3-642-10520-3
eBook Packages: Computer Science (R0)