Boosting Zero-Shot Image Classification via Pairwise Relationship Learning

Li, Hanhui; Wu, Hefeng; Lin, Shujin; Lin, Liang; Luo, Xiaonan; Izquierdo, Ebroul

doi:10.1007/978-3-319-54181-5_6

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10111)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Zero-shot image classification (ZSIC) is an emerging challenge in computer vision, artificial intelligence, and machine learning. In this paper, we propose exploiting the pairwise relationships between test instances to improve the performance of conventional ZSIC methods such as direct attribute prediction (DAP). To infer these pairwise relationships, we introduce two methods: one based on binary classification and one based on metric learning. From the inferred relationships, we construct a similarity graph over the test instances and then apply an adaptive graph anchors voting method to refine the results of DAP iteratively. In each iteration, we partition the similarity graph with normalized spectral clustering and determine the class label of each cluster by a vote among the graph anchors. Extensive experiments validate the effectiveness of our method: with properly learned pairwise relationships, we boost the mean class accuracy of DAP on two standard ZSIC benchmarks, Animals with Attributes and aPascal-aYahoo, from 57.46% to 84.43% and from 26.59% to 70.09%, respectively. Experimental results on the SUN Attribute dataset also suggest that our method can yield considerable performance improvements on the large-scale ZSIC problem.
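
The cluster-level voting step and the mean class accuracy metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the clusters are already given (the paper obtains them by normalized spectral clustering of the similarity graph), and it assumes, purely for illustration, that the anchors of a cluster are its members with the highest DAP confidence. The names vote_cluster_labels, mean_class_accuracy, and anchors_per_cluster are hypothetical, and the toy data is invented.

```python
from collections import Counter, defaultdict


def vote_cluster_labels(cluster_ids, dap_labels, dap_scores, anchors_per_cluster=2):
    """Relabel every instance with the majority label of its cluster's anchors.

    Assumption (not taken from the paper): the anchors of a cluster are its
    `anchors_per_cluster` members with the highest DAP confidence score.
    """
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    refined = list(dap_labels)
    for idxs in members.values():
        # Pick the most confident members of the cluster as its anchors.
        anchors = sorted(idxs, key=lambda i: dap_scores[i], reverse=True)
        anchors = anchors[:anchors_per_cluster]
        # Majority vote; on a tie the label met first (most confident) wins.
        winner = Counter(dap_labels[i] for i in anchors).most_common(1)[0][0]
        for i in idxs:
            refined[i] = winner
    return refined


def mean_class_accuracy(true_labels, predicted_labels):
    """Average of the per-class accuracies, so each class counts equally."""
    correct, total = Counter(), Counter()
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example: six test instances, two unseen classes, two clusters.
true_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
cluster_ids = [0, 0, 0, 1, 1, 1]
dap_labels = ["cat", "cat", "dog", "dog", "cat", "dog"]
dap_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.85]

refined = vote_cluster_labels(cluster_ids, dap_labels, dap_scores)
print(mean_class_accuracy(true_labels, dap_labels))
print(mean_class_accuracy(true_labels, refined))
```

In this toy setting the two low-confidence DAP errors are overruled by the anchors of their clusters. The method described in the abstract repeats clustering and voting over several iterations, which this single-pass sketch does not attempt.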

Acknowledgement

This research is supported by the National Natural Science Foundation of China (61320106008, 61232011, 61402120, 61572531, 61622214), the Educational Commission of Guangdong Province (2013CXZDB001), and the Natural Science Foundation of Guangdong Province (2014A030310348). The corresponding author is Hefeng Wu.

Author information

Authors and Affiliations

National Engineering Research Center of Digital Life, Sun Yat-sen University, Guangzhou, 510006, China
Hanhui Li, Hefeng Wu, Shujin Lin & Xiaonan Luo
School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510006, China
Hanhui Li & Liang Lin
School of Informatics, Guangdong University of Foreign Studies, Guangzhou, 510006, China
Hefeng Wu
Beijing Key Laboratory of Multimedia and Intelligent Software Technology, College of Metropolitan Transportation, Beijing University of Technology, Beijing, 100124, China
Xiaonan Luo
Queen Mary, University of London, London, UK
Ebroul Izquierdo

Corresponding author

Correspondence to Hefeng Wu.

Editor information

Editors and Affiliations

National Tsing Hua University, Hsinchu, Taiwan
Shang-Hong Lai
Graz University of Technology, Graz, Austria
Vincent Lepetit
Drexel University, Philadelphia, Pennsylvania, USA
Ko Nishino
The University of Tokyo, Tokyo, Japan
Yoichi Sato

About this paper

Cite this paper

Li, H., Wu, H., Lin, S., Lin, L., Luo, X., Izquierdo, E. (2017). Boosting Zero-Shot Image Classification via Pairwise Relationship Learning. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10111. Springer, Cham. https://doi.org/10.1007/978-3-319-54181-5_6

DOI: https://doi.org/10.1007/978-3-319-54181-5_6
Published: 10 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54180-8
Online ISBN: 978-3-319-54181-5
eBook Packages: Computer Science, Computer Science (R0)
