
Boosting scene understanding by hierarchical pachinko allocation

Multimedia Tools and Applications

Abstract

Scene understanding is an active research direction. Much work in this area focuses on naming objects in complex natural scenes, and the visual semantic integration model (VSIM) is a representative approach. VSIM consists of two levels: a semantic level and a visual level. At the semantic level, it uses a four-level pachinko allocation model (PAM) to capture the semantics behind images. However, this four-level PAM is inflexible and does not account for common subtopics that represent background semantics. To address these problems, we replace PAM with hierarchical PAM (hPAM). Because hPAM is flexible, we investigate two variations of hPAM to boost VSIM. We derive a Gibbs sampler to learn the proposed models. Empirical results show that our models perform better than state-of-the-art algorithms.
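As a rough illustration of the four-level PAM mentioned in the abstract, the sketch below follows the generic text-domain generative process of pachinko allocation (a root, super-topics, sub-topics, and words). It is not the VSIM or hPAM model of this article, whose details are not given here. All function names, sizes, and hyperparameter values are hypothetical and chosen only for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(n_words, n_super, n_sub, topic_word, alpha_root, alpha_super, rng):
    """Generate one document under a generic four-level PAM (assumed symmetric priors)."""
    # Per-document distribution of the root over super-topics.
    theta_root = sample_dirichlet([alpha_root] * n_super, rng)
    # Per-document distribution of each super-topic over sub-topics.
    theta_super = [sample_dirichlet([alpha_super] * n_sub, rng) for _ in range(n_super)]
    doc = []
    for _ in range(n_words):
        s = sample_index(theta_root, rng)      # choose a super-topic
        t = sample_index(theta_super[s], rng)  # choose a sub-topic under it
        w = sample_index(topic_word[t], rng)   # emit a word from the sub-topic
        doc.append((s, t, w))
    return doc

# Hypothetical toy sizes: 2 super-topics, 3 sub-topics, 6 vocabulary words.
rng = random.Random(0)
vocab_size, n_super, n_sub = 6, 2, 3
# Sub-topic word distributions are shared by all documents.
topic_word = [sample_dirichlet([0.5] * vocab_size, rng) for _ in range(n_sub)]
doc = generate_document(10, n_super, n_sub, topic_word, 1.0, 1.0, rng)
print(doc)
```

Each generated token records the super-topic, sub-topic, and word index along its path. In this generic form only the sub-topics emit words, so a model of this shape has no node that emits words shared across all super-topics.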


Notes

  1. downloaded at http://cs.brown.edu/pff/

  2. downloaded at http://vision.stanford.edu/projects/totalscene/

  3. downloaded at http://people.csail.mit.edu/myungjin/HContext.html


Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61170092, 61133011, and 61103091.

Author information


Corresponding author

Correspondence to Hongtu Li.


Cite this article

Ouyang, J., Li, X. & Li, H. Boosting scene understanding by hierarchical pachinko allocation. Multimed Tools Appl 75, 12581–12595 (2016). https://doi.org/10.1007/s11042-014-2414-3


  • DOI: https://doi.org/10.1007/s11042-014-2414-3
