Abstract
Discovering the visual representation of an image category is challenging because the representation must not only be discriminative but also appear frequently across the images of that category. Previous studies have proposed many solutions, but they all optimize discrimination and frequency separately, which makes the solutions sub-optimal. To address this issue, we propose a method that discovers jointly discriminating and frequent visual representations, named JDFR. To ensure discrimination, JDFR employs a classification task with a cross-entropy loss. To achieve frequency, JDFR uses a triplet loss to optimize within-class and between-class distances and then mines frequent visual representations in the feature space. Moreover, we propose an attention module to locate the representative region in each image. Extensive experiments on four benchmark datasets (i.e., CIFAR10, CIFAR100-20, VOC2012-10, and Travel) show that the discovered visual representations are more discriminative and more frequent than those mined by five state-of-the-art methods, with average improvements of 7.51% in accuracy and 1.88% in frequency.
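The abstract describes a training objective that combines a cross-entropy loss for discrimination with a triplet loss that tightens within-class and widens between-class distances, plus an attention module that highlights the representative region of an image. The following minimal PyTorch sketch illustrates how such a joint objective can be wired together under assumed choices (a ResNet-18 backbone, a 1x1-convolution attention map, margin 0.2, and equal loss weighting); the class `JDFRSketch` and the function `joint_loss` are illustrative names, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class JDFRSketch(nn.Module):
    """Backbone + spatial attention + embedding head (illustrative sketch)."""

    def __init__(self, num_classes: int, feat_dim: int = 128):
        super().__init__()
        backbone = models.resnet18(weights=None)            # assumed backbone, not the paper's choice
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.attention = nn.Conv2d(512, 1, kernel_size=1)   # 1x1 conv producing a spatial attention map
        self.embed = nn.Linear(512, feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        fmap = self.features(x)                              # B x 512 x H x W feature map
        attn = torch.sigmoid(self.attention(fmap))           # B x 1 x H x W, highlights representative regions
        pooled = (fmap * attn).mean(dim=(2, 3))              # attention-weighted pooling -> B x 512
        emb = F.normalize(self.embed(pooled), dim=1)         # unit-norm feature used for frequency mining
        return self.classifier(emb), emb, attn


def joint_loss(logits, labels, anchor, positive, negative, alpha=1.0, margin=0.2):
    """Cross-entropy (discrimination) plus triplet loss (within/between-class distances)."""
    ce = F.cross_entropy(logits, labels)
    triplet = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return ce + alpha * triplet


# Toy usage: a batch of three images forming one (anchor, positive, negative) triplet.
model = JDFRSketch(num_classes=10)
images = torch.randn(3, 3, 224, 224)
labels = torch.tensor([0, 0, 1])
logits, emb, attn = model(images)
loss = joint_loss(logits, labels, emb[0:1], emb[1:2], emb[2:3])
loss.backward()
```

In this sketch the cross-entropy term keeps the embedding class-separable, while the triplet term pulls same-class samples together so that frequent representations form dense clusters in feature space; the relative weight `alpha` is an assumption and would need tuning in practice.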
Acknowledgments
This work is supported by the Science and Technology Plan of Xi'an (20191122015KYPT011JC013), the Fundamental Research Funds of the Central Universities of China (No. JX18001), and the Science Basis Research Program in Shaanxi Province of China (Nos. 2020JQ-321 and 2019JQ-663).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Q., Zhou, Y., Zhu, Z., Liang, X., Gu, Y. (2021). Jointly Discriminating and Frequent Visual Representation Mining. In: Ishikawa, H., Liu, C.-L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. Lecture Notes in Computer Science, vol. 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_22
Print ISBN: 978-3-030-69534-7
Online ISBN: 978-3-030-69535-4