Compact representation for large-scale unconstrained video analysis

Wang, Sen; Pan, Pingbo; Long, Guodong; Chen, Weitong; Li, Xue; Sheng, Quan Z.

doi:10.1007/s11280-015-0354-0

Published: 10 June 2015

World Wide Web, Volume 19, pages 231–246 (2016)

Sen Wang¹,
Pingbo Pan²,
Guodong Long³,
Weitong Chen¹,
Xue Li¹ &
Quan Z. Sheng⁴

Abstract

Recently introduced features (e.g. Fisher vector, VLAD) have achieved state-of-the-art performance in large-scale video analysis systems that aim to understand video content, for tasks such as concept recognition and event detection. However, these features are high-dimensional, which markedly increases computation costs and correspondingly degrades the performance of subsequent learning tasks. The situation becomes even worse when dealing with large-scale video data for which the number of class labels is limited. To address this problem, we propose a novel algorithm to compactly represent huge amounts of unconstrained video data. Specifically, redundant feature dimensions are removed by our proposed feature selection algorithm. Because unlabeled videos are easy to obtain on the web, we apply this feature selection algorithm in a semi-supervised framework to cope with the shortage of class information. Unlike most existing semi-supervised feature selection algorithms, our proposed algorithm does not rely on manifold approximation, i.e. the graph Laplacian, which is expensive to compute for large amounts of data. Thus, it is possible to apply the proposed algorithm to a real large-scale video analysis system. In addition, because the non-smooth objective function is difficult to solve, we develop an efficient iterative approach to seek the global optimum. Extensive experiments are conducted on several real-world video datasets, including KTH, CCV, and HMDB. The experimental results demonstrate the effectiveness of the proposed algorithm.

Author information

Authors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
Sen Wang, Weitong Chen & Xue Li
College of Computer Science, Zhejiang University, Zhejiang, China
Pingbo Pan
Centre for Quantum Computation and Intelligent Systems, University of Technology, Sydney, Sydney, Australia
Guodong Long
School of Computer Science, The University of Adelaide, Adelaide, Australia
Quan Z. Sheng

Corresponding author

Correspondence to Sen Wang.

About this article

Cite this article

Wang, S., Pan, P., Long, G. et al. Compact representation for large-scale unconstrained video analysis. World Wide Web 19, 231–246 (2016). https://doi.org/10.1007/s11280-015-0354-0

Received: 17 December 2014
Revised: 30 April 2015
Accepted: 14 May 2015
Published: 10 June 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s11280-015-0354-0
