Multi-view Recognition Using Weighted View Selection

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9006)

Abstract

In this paper, we present an algorithm for multi-view recognition in a distributed camera setting that learns which viewpoints are most discriminative for particular instances of ambiguity. Our method is built on top of 2D recognition algorithms and casts view selection as the problem of optimizing kernel weights in multiple kernel learning. The main contribution is a locality-sensitive meta-training step to learn a disambiguation function to select the relative weighting of available viewpoints needed to classify a 2D input example. Our method outperforms related approaches on benchmark multi-view action recognition data sets.
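The abstract frames view selection as choosing kernel weights in multiple kernel learning. As a rough illustration of that idea only, the sketch below forms a convex combination of per-view kernel matrices, K = sum_v w_v K_v. It is not the authors' implementation: the RBF kernel, the function names, the random features, and the fixed weight vector are all assumptions made for the example, whereas in the paper the weights come from a learned, locality-sensitive disambiguation function.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_view_kernels(kernels, weights):
    """Convex combination of per-view kernels: K = sum_v w_v * K_v."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("view weights must be non-negative and not all zero")
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return sum(w * K for w, K in zip(weights, kernels))


# Hypothetical data: 4 camera views, 5 examples, 3-dimensional features per view.
rng = np.random.default_rng(0)
views = [rng.normal(size=(5, 3)) for _ in range(4)]
kernels = [rbf_kernel(V, V) for V in views]

# Assumed weights for illustration; view 2 is treated as most discriminative.
K = combine_view_kernels(kernels, [0.1, 0.6, 0.2, 0.1])
```

Because each per-view kernel is positive semidefinite and the weights are non-negative, the combined matrix is also a valid kernel and could be passed to any kernel classifier that accepts a precomputed Gram matrix.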

Notes

  1. For illustration, Fig. 2 depicts a 2-dimensional feature space, but each step of our procedure can be kernelized, so a direct feature vector representation of the data is not necessary.

  2. We were unable to directly compare these results with those previously reported. Unlike with the IXMAS set, there is little agreement in the literature on an experimental protocol for 3DO.

Author information

Corresponding author

Correspondence to Scott Spurlock.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Spurlock, S., Wu, H., Souvenir, R. (2015). Multi-view Recognition Using Weighted View Selection. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_35

  • DOI: https://doi.org/10.1007/978-3-319-16817-3_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16816-6

  • Online ISBN: 978-3-319-16817-3

  • eBook Packages: Computer Science, Computer Science (R0)
