Visual instance mining from the graph perspective

  • Regular Paper
Multimedia Systems

Abstract

In this paper, we address the problem of visual instance mining: automatically discovering frequently appearing visual instances in a large collection of images. We propose a scalable mining method that leverages a graph structure with images as vertices. Unlike most existing approaches, which focus on either instance-level similarities or image-level context properties, our method captures both kinds of information. In the proposed framework, instance-level information is integrated during the construction of a sparse instance graph based on the similarity between augmented local features, while image-level context is explored with a greedy breadth-first search algorithm that discovers clusters of visual instances in the graph. This framework can tackle the challenges posed by small visual instances, diverse intra-class variations, and noise in large-scale image databases. To further improve robustness, we integrate two techniques into the basic framework. First, to better cope with the increasing noise of large databases, weak geometric consistency is adopted to efficiently incorporate the geometric information of local matches into the construction of the instance graph. Second, we propose the layout embedding algorithm, which leverages an algorithm originally designed for graph visualization to fully explore the structure of the image database. The proposed method was evaluated on four annotated data sets with different characteristics, and the experimental results showed its superiority over state-of-the-art algorithms on all data sets. We also applied our framework to a one-million-image Flickr data set and demonstrated its scalability.
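
The abstract does not spell out how the breadth-first search expands a cluster, so the following is only a minimal sketch of the general idea: images are vertices, sparse weighted edges stand in for instance-level similarity, and clusters are grown breadth-first from greedily chosen seeds. The function name `greedy_bfs_clusters`, the parameters `min_weight` and `min_size`, the seed ordering, and the toy edge list are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def greedy_bfs_clusters(edges, min_weight=0.5, min_size=2):
    """Group vertices of a sparse weighted graph by breadth-first expansion.

    edges: iterable of (u, v, weight) tuples; u and v are image identifiers.
    min_weight and min_size are hypothetical thresholds, not from the paper.
    """
    # Build an adjacency map, keeping only edges that pass the threshold.
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, {})
        adjacency.setdefault(v, {})
        if w >= min_weight:
            adjacency[u][v] = w
            adjacency[v][u] = w

    # Greedy seed order (an assumption): strongest-connected vertices first,
    # with the identifier as a tie-breaker so the result is deterministic.
    seeds = sorted(adjacency, key=lambda n: (-sum(adjacency[n].values()), n))

    visited = set()
    clusters = []
    for seed in seeds:
        if seed in visited:
            continue
        visited.add(seed)
        queue = deque([seed])
        cluster = []
        # Standard breadth-first expansion over the retained edges.
        while queue:
            node = queue.popleft()
            cluster.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
        # Discard clusters too small to count as a frequent instance.
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters


# Toy instance graph: six images with pairwise similarity scores.
toy_edges = [
    ("a", "b", 0.9),
    ("b", "c", 0.8),
    ("c", "d", 0.2),
    ("d", "e", 0.7),
    ("f", "a", 0.1),
]
print(greedy_bfs_clusters(toy_edges))
```

With a single fixed threshold, this sketch reduces to finding the connected components of the thresholded graph. The method summarized above additionally relies on weak geometric consistency and layout embedding, neither of which is modelled here.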

Author information

Corresponding author

Correspondence to Changhu Wang.

Additional information

Communicated by P. Pala.

The work of W. Li, J. Li, and B. Zhang was supported by the National Basic Research Program (973 Program) of China (No. 2013CB329403), and the National Natural Science Foundation of China (Nos. 61332007, 91420201 and 61620106010).

About this article

Cite this article

Li, W., Li, J., Wang, C. et al. Visual instance mining from the graph perspective. Multimedia Systems 24, 147–162 (2018). https://doi.org/10.1007/s00530-016-0533-6

  • DOI: https://doi.org/10.1007/s00530-016-0533-6
