Abstract
Pattern recognition models are used in a variety of applications, ranging from video concept annotation to event detection. In this paper, we propose a new framework, the max-margin adaptive (MMA) model, for complex video pattern recognition; it can exploit a large number of unlabeled videos to assist model training. From a statistical perspective, the MMA model accounts for the consistency of the data distribution between labeled training videos and unlabeled auxiliary videos by learning an optimal mapping function, which also widens the margin between positive and negative labeled videos to improve the robustness of the model. Experiments are conducted on two public datasets: CCV for video object/event detection and HMDB for action recognition. The results show that the proposed MMA model is effective on complex video pattern recognition tasks and outperforms state-of-the-art algorithms.
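
The two ingredients named in the abstract, distribution consistency between labeled and unlabeled videos and a max-margin criterion on the labeled ones, can be illustrated with a small numerical sketch. The abstract does not state which discrepancy measure, loss, or mapping the MMA model uses, so everything below is an assumption made for illustration: a linear mapping `P`, a biased squared maximum mean discrepancy (MMD) estimate with an RBF kernel, a hinge loss, and the hypothetical names `mmd2`, `hinge_loss`, and `adaptive_objective`. It is not the authors' formulation.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, gamma).mean()
        + rbf_kernel(y, y, gamma).mean()
        - 2.0 * rbf_kernel(x, y, gamma).mean()
    )


def hinge_loss(w, b, x, labels):
    """Mean hinge loss of a linear classifier; labels must be in {-1, +1}."""
    margins = labels * (x @ w + b)
    return np.maximum(0.0, 1.0 - margins).mean()


def adaptive_objective(P, w, b, x_lab, y_lab, x_unl, lam=1.0, gamma=1.0):
    """Illustrative objective (an assumption, not the paper's equation):
    margin regularizer + hinge loss on mapped labeled data
    + lam * distribution mismatch between mapped labeled and unlabeled data.
    """
    z_lab = x_lab @ P  # labeled videos after the (assumed linear) mapping
    z_unl = x_unl @ P  # unlabeled auxiliary videos after the same mapping
    return (
        0.5 * float(w @ w)
        + hinge_loss(w, b, z_lab, y_lab)
        + lam * mmd2(z_lab, z_unl, gamma)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_lab = rng.normal(size=(20, 5))           # stand-in labeled features
    y_lab = np.where(x_lab[:, 0] > 0, 1, -1)   # stand-in labels
    x_unl = rng.normal(loc=0.5, size=(50, 5))  # shifted unlabeled features
    P = np.eye(5)
    w = np.zeros(5)
    print(adaptive_objective(P, w, 0.0, x_lab, y_lab, x_unl))
```

In this sketch, minimizing the objective over `P`, `w`, and `b` would trade off a wide margin on the labeled videos against a mapping under which labeled and unlabeled videos look statistically similar; the weight `lam` controls that trade-off.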

Cite this article
Yu, L., Shao, J., Xu, XS. et al. Max-margin adaptive model for complex video pattern recognition. Multimed Tools Appl 74, 505–521 (2015). https://doi.org/10.1007/s11042-014-2010-6
DOI: https://doi.org/10.1007/s11042-014-2010-6