Viewpoint Selection for Human Actions

International Journal of Computer Vision

Abstract

In many scenarios a dynamic scene is filmed by multiple video cameras located at different viewing positions. Visualizing such multi-view data on a single display raises an immediate question: which cameras capture better views of the scene? Typically (e.g. in TV broadcasts) a human producer manually selects the best view. In this paper we wish to automate this process by evaluating the quality of the view captured by each camera. We regard human actions as three-dimensional shapes induced by their silhouettes in the space-time volume. The quality of a view is then evaluated based on features of the space-time shape that correspond to limb visibility. Building on these features, two view quality approaches are proposed: one is generic, while the other can be trained to fit any preferred action recognition method. Our experiments show that the proposed view selection provides intuitive results that match common conventions. We further show that it improves action recognition results.
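To make the space-time shape idea concrete, the sketch below stacks per-frame binary silhouettes into a (T, H, W) volume and scores a view by the mean silhouette boundary length, a crude stand-in for the limb-visibility features described in the paper. The function names, the fixed threshold, and the boundary-length proxy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def space_time_volume(frames, threshold=0.5):
    """Stack per-frame foreground maps into a (T, H, W) binary space-time shape.

    `frames` is a sequence of grayscale foreground maps in [0, 1]; the fixed
    threshold is an illustrative assumption, not the paper's segmentation.
    """
    return np.stack([(f > threshold) for f in frames], axis=0)

def boundary_length(silhouette):
    """Count boundary pixels of a binary silhouette (4-neighbour test)."""
    sil = silhouette.astype(bool)
    padded = np.pad(sil, 1)
    # A foreground pixel is interior if all four of its neighbours are foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return int((sil & ~interior).sum())

def view_score(volume):
    """Crude per-view quality proxy: mean silhouette boundary length over time.

    Views in which the limbs are spread apart (e.g. a side view of walking)
    tend to yield longer, more articulated contours than views in which the
    limbs are occluded by the torso.
    """
    return float(np.mean([boundary_length(frame) for frame in volume]))

def select_view(per_camera_frames):
    """Return the camera id with the highest score, plus all per-camera scores."""
    scores = {cam: view_score(space_time_volume(frames))
              for cam, frames in per_camera_frames.items()}
    return max(scores, key=scores.get), scores
```

In use, `select_view` would receive a dict mapping camera identifiers to their foreground-mask sequences; the paper's trainable variant would replace this generic score with features tuned to a preferred action recognition method.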


Author information

Corresponding author

Correspondence to Dmitry Rudoy.

Electronic Supplementary Material

Supplementary material (AVI 4.97 MB).


About this article

Cite this article

Rudoy, D., Zelnik-Manor, L. Viewpoint Selection for Human Actions. Int J Comput Vis 97, 243–254 (2012). https://doi.org/10.1007/s11263-011-0484-5

