
Jointly Discriminating and Frequent Visual Representation Mining

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12624)

Abstract

Discovering the visual representation of an image category is a challenging problem, because the representation must not only be discriminating but also appear frequently in these images. Previous studies have proposed many solutions, but they all optimize discrimination and frequency separately, which makes the solutions sub-optimal. To address this issue, we propose a method that discovers jointly discriminating and frequent visual representations, named JDFR. To ensure discrimination, JDFR employs a classification task with cross-entropy loss. To ensure frequency, JDFR uses triplet loss to optimize within-class and between-class distances, and then mines frequent visual representations in feature space. Moreover, we propose an attention module to locate the representative region in the image. Extensive experiments on four benchmark datasets (i.e., CIFAR10, CIFAR100-20, VOC2012-10, and Travel) show that the discovered visual representations have better discrimination and frequency than those mined by five state-of-the-art methods, with average improvements of 7.51% in accuracy and 1.88% in frequency.
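The joint objective sketched in the abstract combines a cross-entropy term (for discrimination) and a triplet term (for frequency, via within-class and between-class distances). A minimal stdlib-only sketch of such a combined loss is below; the weighting `lam` and margin value are illustrative assumptions, not details from the paper, and this is not the authors' implementation.

```python
import math

def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for one example:
    # subtract the max logit before exponentiating.
    m = max(logits)
    log_sum = math.log(sum(math.exp(z - m) for z in logits))
    return -(logits[label] - m - log_sum)

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Hinge on feature distances: pull same-class features together,
    # push different-class features apart by at least `margin`.
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)

def joint_loss(logits, label, anchor, positive, negative,
               lam=0.5, margin=1.0):
    # One objective optimizing discrimination (cross-entropy) and
    # frequency (triplet) together, rather than separately.
    return cross_entropy(logits, label) + lam * triplet_loss(
        anchor, positive, negative, margin)
```

The key design point reflected here is that both terms back-propagate through the same feature extractor, so no separate optimization stage is needed before mining frequent representations in feature space.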


Notes

  1. https://www.tripadvisor.com.



Acknowledgments

This work is supported by the Science and Technology Plan of Xi’an (20191122015KYPT011JC013), the Fundamental Research Funds of the Central Universities of China (No. JX18001) and the Science Basis Research Program in Shaanxi Province of China (No. 2020JQ-321, 2019JQ-663).

Author information

Corresponding author

Correspondence to Xuefeng Liang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 13649 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, Q., Zhou, Y., Zhu, Z., Liang, X., Gu, Y. (2021). Jointly Discriminating and Frequent Visual Representation Mining. In: Ishikawa, H., Liu, C.L., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. Lecture Notes in Computer Science, vol 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_22


  • DOI: https://doi.org/10.1007/978-3-030-69535-4_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69534-7

  • Online ISBN: 978-3-030-69535-4

  • eBook Packages: Computer Science (R0)
