Visual instance mining from the graph perspective

Li, Wei; Li, Jianmin; Wang, Changhu; Zhang, Lei; Zhang, Bo

doi:10.1007/s00530-016-0533-6

Regular Paper
Published: 04 February 2017

Multimedia Systems, Volume 24, pages 147–162 (2018)

Wei Li¹, Jianmin Li¹, Changhu Wang², Lei Zhang³ & Bo Zhang¹

Abstract

In this paper, we address the problem of visual instance mining: automatically discovering frequently appearing visual instances from a large collection of images. We propose a scalable mining method that leverages a graph structure with images as vertices. Unlike most existing approaches, which focus on either instance-level similarities or image-level context properties, our method captures both kinds of information. In the proposed framework, instance-level information is integrated during the construction of a sparse instance graph based on the similarity between augmented local features, while image-level context is explored with a greedy breadth-first search algorithm that discovers clusters of visual instances from the graph. This framework can tackle the challenges posed by small visual instances, diverse intra-class variations, and noise in large-scale image databases. To further improve robustness, we integrate two techniques into the basic framework. First, to better cope with the increasing noise of large databases, weak geometric consistency is adopted to efficiently incorporate the geometric information of local matches into the construction of the instance graph. Second, we propose a layout embedding algorithm, which leverages an algorithm originally designed for graph visualization to fully explore the structure of the image database. The proposed method was evaluated on four annotated data sets with different characteristics, and the experimental results showed its superiority over state-of-the-art algorithms on all data sets. We also applied our framework to a one-million-image Flickr data set and demonstrated its scalability.
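
The abstract does not spell out the greedy breadth-first search, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a sparse graph given as weighted edges between image identifiers; the function name `greedy_bfs_clusters` and the parameters `min_weight` and `min_size` are hypothetical. Weak edges are dropped to keep the graph sparse, seeds are visited in order of their strongest edge, and each breadth-first traversal expands along stronger edges first. With a plain threshold like this, the result coincides with the connected components of the thresholded graph; any finer greedy criterion from the paper is not reproduced here.

```python
from collections import deque


def greedy_bfs_clusters(edges, min_weight=0.5, min_size=2):
    """Group vertices of a sparse similarity graph with breadth-first search.

    edges: dict mapping (u, v) pairs to a similarity weight (assumed input
    format). min_weight and min_size are illustrative parameters.
    """
    # Keep only edges that pass the similarity threshold (sparse graph).
    adjacency = {}
    for (u, v), weight in edges.items():
        if weight >= min_weight:
            adjacency.setdefault(u, []).append((weight, v))
            adjacency.setdefault(v, []).append((weight, u))

    # Greedy seed order: vertices with the strongest incident edge first.
    seeds = sorted(adjacency, key=lambda n: (-max(w for w, _ in adjacency[n]), n))

    visited = set()
    clusters = []
    for seed in seeds:
        if seed in visited:
            continue
        visited.add(seed)
        queue = deque([seed])
        cluster = [seed]
        while queue:
            node = queue.popleft()
            # Expand the strongest neighbours first.
            for _, neighbour in sorted(adjacency[node], reverse=True):
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
                    cluster.append(neighbour)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters


# Toy example with made-up similarity scores between five images.
toy_edges = {
    ("a", "b"): 0.9,
    ("b", "c"): 0.8,
    ("c", "d"): 0.1,  # too weak, so it separates the two groups
    ("d", "e"): 0.7,
}
print(greedy_bfs_clusters(toy_edges))
```

In this toy graph the weak edge between `c` and `d` falls below the threshold, so the images split into two groups. Because each vertex and each retained edge is handled once per traversal, the cost stays close to linear in the size of the sparse graph, which is the usual reason for preferring a traversal of this kind at large scale.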

Author information

Authors and Affiliations

State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Wei Li, Jianmin Li & Bo Zhang
Microsoft Research, No. 5 Danling Street, Haidian District, Beijing, 100080, China
Changhu Wang
Microsoft Research, One Microsoft Way, Redmond, WA, 98052, USA
Lei Zhang

Corresponding author

Correspondence to Changhu Wang.

Additional information

Communicated by P. Pala.

The work of W. Li, J. Li, and B. Zhang was supported by the National Basic Research Program (973 Program) of China (No. 2013CB329403), and the National Natural Science Foundation of China (Nos. 61332007, 91420201 and 61620106010).

About this article

Cite this article

Li, W., Li, J., Wang, C. et al. Visual instance mining from the graph perspective. Multimedia Systems 24, 147–162 (2018). https://doi.org/10.1007/s00530-016-0533-6

Received: 06 June 2016
Accepted: 23 December 2016
Published: 04 February 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s00530-016-0533-6
