Tri-level Combination for Image Representation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9916))

Abstract

The context surrounding an object can provide discriminative information beyond the object itself. However, this information has not been fully exploited. In this paper, we propose Tri-level Combination for Image Representation (TriCoIR), which addresses the problem at three levels: object intrinsic, strongly-related context, and weakly-related context. The object-intrinsic level excludes external disturbances and focuses on the object itself. The strongly-related context is cropped from the input image with a looser bound so as to include the immediate surroundings. The weakly-related context is recovered from the image regions outside the object to capture the global context. First, the strongly- and weakly-related contexts are constructed from the input image. Second, we apply cascade transformations to obtain more intrinsic object information, relying on the consistency between the generated global context and the input image in the regions outside the object. Finally, a joint representation is formed from the features at these three levels. Experiments on two benchmark datasets demonstrate the effectiveness of TriCoIR.
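The three-level construction described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes an object bounding box is available, uses a simple box expansion (the `expand` parameter is an assumed stand-in for the paper's looser bound), and zero-masks the object region as a crude stand-in for the recovered global context. The `extract` callable is a hypothetical feature extractor (e.g. a pretrained CNN).

```python
import numpy as np

def tricoir_features(image, bbox, extract, expand=0.3):
    """Sketch of the tri-level feature construction (assumptions noted above).

    image   : H x W x 3 array
    bbox    : (x0, y0, x1, y1) object bounding box
    extract : callable mapping an image patch to a 1-D feature vector
              (hypothetical; e.g. pretrained CNN activations)
    expand  : fraction by which the box is enlarged for the
              strongly-related context (an assumed parameter)
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = bbox

    # Level 1: object intrinsic -- the tight object crop.
    obj = image[y0:y1, x0:x1]

    # Level 2: strongly-related context -- a looser crop around the object.
    dx, dy = int((x1 - x0) * expand), int((y1 - y0) * expand)
    sx0, sy0 = max(0, x0 - dx), max(0, y0 - dy)
    sx1, sy1 = min(w, x1 + dx), min(h, y1 + dy)
    strong = image[sy0:sy1, sx0:sx1]

    # Level 3: weakly-related context -- the image with the object
    # masked out, standing in for the recovered global background.
    weak = image.copy()
    weak[y0:y1, x0:x1] = 0

    # Joint representation: concatenate the three level features.
    return np.concatenate([extract(obj), extract(strong), extract(weak)])
```

For instance, with a toy mean-color "extractor" (`lambda p: p.mean(axis=(0, 1))`), each level yields a 3-D vector and the joint representation is their 9-D concatenation.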



Acknowledgement

This work is supported by National Basic Research Program of China (973 Program): 2012CB316400 and 2015CB351802, National Natural Science Foundation of China: 61303154 and 61332016, the Open Project of Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences.

Author information

Correspondence to Chunjie Zhang or Qingming Huang.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Li, R., Zhang, C., Huang, Q. (2016). Tri-level Combination for Image Representation. In: Chen, E., Gong, Y., Tie, Y. (eds) Advances in Multimedia Information Processing - PCM 2016. PCM 2016. Lecture Notes in Computer Science(), vol 9916. Springer, Cham. https://doi.org/10.1007/978-3-319-48890-5_25

  • DOI: https://doi.org/10.1007/978-3-319-48890-5_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48889-9

  • Online ISBN: 978-3-319-48890-5