Abstract
Tensor representation helps alleviate the small sample size problem in discriminative subspace selection. As this paper points out, this is mainly because the structure information of objects in computer vision is a reasonable constraint that reduces the number of unknown parameters needed to represent a learning model. We therefore apply this information to vector-based learning and generalize vector-based learning to tensor-based learning in the supervised tensor learning (STL) framework, which accepts tensors as input. To solve STL, we develop an alternating projection optimization procedure. The STL framework combines convex optimization with operations from multilinear algebra. The tensor representation helps reduce the overfitting problem in vector-based learning. Based on STL and its alternating projection optimization procedure, we generalize support vector machines, the minimax probability machine, Fisher discriminant analysis, and distance metric learning to support tensor machines, the tensor minimax probability machine, tensor Fisher discriminant analysis, and multiple distance metrics learning, respectively. We also study the iterative procedure for feature extraction within STL. To examine the effectiveness of STL, we implement the tensor minimax probability machine for image classification. Compared with the minimax probability machine, the tensor version reduces the overfitting problem.
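The alternating projection idea described in the abstract can be illustrated with a minimal sketch for second-order tensors (matrices). Everything in it is an assumption made for illustration, not the authors' implementation: the classifier is taken to be rank-one, f(X) = uᵀXv + b, and the convex subproblem solved at each step is ridge-regularized least squares, which stands in for the SVM or minimax probability machine subproblems named in the abstract. The function names (`ridge_with_bias`, `objective`, `fit_rank_one`) and the synthetic data are hypothetical.

```python
import numpy as np


def ridge_with_bias(Z, y, reg):
    """Solve min_{w,b} ||y - Z w - b||^2 + reg * ||w||^2 in closed form."""
    n, d = Z.shape
    A = np.hstack([Z, np.ones((n, 1))])  # last column carries the bias b
    R = reg * np.eye(d + 1)
    R[-1, -1] = 0.0                      # the bias is not penalized
    sol = np.linalg.solve(A.T @ A + R, A.T @ y)
    return sol[:d], sol[d]


def objective(X, y, u, v, b, reg):
    """Squared loss of f(X) = u^T X v + b plus the penalty reg * ||u v^T||_F^2."""
    pred = np.einsum("i,nij,j->n", u, X, v) + b
    return np.sum((y - pred) ** 2) + reg * (u @ u) * (v @ v)


def fit_rank_one(X, y, n_iter=20, reg=1e-2, seed=0):
    """Alternate between the two modes; each step is a convex problem."""
    rng = np.random.default_rng(seed)
    _, n1, n2 = X.shape
    v = rng.standard_normal(n2)
    v /= np.linalg.norm(v)
    history = []
    for _ in range(n_iter):
        # Fix v: project every sample onto v, then solve for (u, b).
        Z = X @ v                            # shape (N, n1)
        u, b = ridge_with_bias(Z, y, reg * (v @ v))
        # Fix u: project every sample onto u, then solve for (v, b).
        Z = np.einsum("i,nij->nj", u, X)     # shape (N, n2)
        v, b = ridge_with_bias(Z, y, reg * (u @ u))
        history.append(objective(X, y, u, v, b, reg))
    return u, v, b, history


# Synthetic two-class data whose labels come from a rank-one rule (assumption).
rng = np.random.default_rng(1)
n, n1, n2 = 300, 8, 6
X = rng.standard_normal((n, n1, n2))
u_true = rng.standard_normal(n1)
v_true = rng.standard_normal(n2)
y = np.sign(np.einsum("i,nij,j->n", u_true, X, v_true))

u, v, b, history = fit_rank_one(X, y)
pred = np.sign(np.einsum("i,nij,j->n", u, X, v) + b)
print("rank-one parameters:", n1 + n2, "vs vectorized:", n1 * n2)
print("training accuracy:", np.mean(pred == y))
```

Because each half-step minimizes the same objective exactly over one block of variables, the recorded objective values cannot increase from one iteration to the next. The parameter count printed at the end shows the reduction the abstract refers to: the rank-one model has n1 + n2 weights, whereas a classifier on the vectorized input has n1 × n2.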
Additional information
This paper focuses on convex optimization-based learning algorithms for binary classification, because the solution to a convex optimization-based learning algorithm is unique.
Dacheng Tao received the B.Eng. degree from the University of Science and Technology of China (USTC), the MPhil degree from the Chinese University of Hong Kong (CUHK), and the PhD from the University of London (Birkbeck). He will join the Department of Computing at the Hong Kong Polytechnic University as an assistant professor. His research interests include biometrics, discriminant analysis, support vector machines, convex optimization for machine learning, multilinear algebra, multimedia information retrieval, data mining, and video surveillance. He has published extensively in TPAMI, TKDE, TIP, TMM, TCSVT, CVPR, ICDM, ICASSP, ICIP, ICME, ACM Multimedia, ACM KDD, etc. He has received several Meritorious Awards from the Int’l Interdisciplinary Contest in Modeling, the highest-level mathematical modeling contest in the world, organized by COMAP. He is a guest editor for special issues of the Int’l Journal of Image and Graphics (World Scientific) and Neurocomputing (Elsevier).
Xuelong Li works at the University of London. He has published in journals (IEEE T-PAMI, T-CSVT, T-IP, T-KDE, TMM, etc.) and conferences (IEEE CVPR, ICASSP, ICDM, etc.). He is an Associate Editor of IEEE T-SMC, Part C, Neurocomputing, IJIG (World Scientific), and Pattern Recognition (Elsevier). He is also an Editorial Board Member of IJITDM (World Scientific) and ELCVIA (CVC Press). He is a Guest Editor for special issues of IJCM (Taylor and Francis), IJIG (World Scientific), and Neurocomputing (Elsevier). He co-chaired the 5th Annual UK Workshop on Computational Intelligence and the 6th IEEE Int’l Conf. on Machine Learning and Cybernetics. He was also a publicity chair of the 7th IEEE Int’l Conf. on Data Mining and the 4th Int’l Conf. on Image and Graphics. He has been on the program committees of more than 50 conferences and workshops.
Xindong Wu is a Professor and the Chair of the Department of Computer Science at the University of Vermont. He holds a Ph.D. in Artificial Intelligence from the University of Edinburgh, Britain. His research interests include data mining, knowledge-based systems, and Web information exploration. He has published extensively in these areas in various journals and conferences, including IEEE TKDE, TPAMI, ACM TOIS, IJCAI, AAAI, ICML, KDD, ICDM, and WWW, as well as 12 books and conference proceedings. Dr. Wu is the Editor-in-Chief of the IEEE Transactions on Knowledge and Data Engineering (by the IEEE Computer Society), the Founder and current Steering Committee Chair of the IEEE International Conference on Data Mining (ICDM), an Honorary Editor-in-Chief of Knowledge and Information Systems (by Springer), and a Series Editor of the Springer Book Series on Advanced Information and Knowledge Processing (AIKP). He is the 2004 ACM SIGKDD Service Award winner.
Weiming Hu received the Ph.D. degree from the Department of Computer Science and Engineering, Zhejiang University. From April 1998 to March 2000, he was a Postdoctoral Research Fellow with the Institute of Computer Science and Technology, Founder Research and Design Center, Peking University. Since April 1998, he has been with the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, where he is now a Professor and a Ph.D. Student Supervisor. His research interests are in visual surveillance, neural networks, filtering of objectionable Internet information, multimedia retrieval, and understanding of Internet behaviors. He has published more than 80 papers in national and international journals and at international conferences.
Stephen J. Maybank received a BA in Mathematics from King’s College, Cambridge in 1976 and a PhD in Computer Science from Birkbeck College, University of London in 1988. He was a research scientist at GEC from 1980 to 1995, first at MCCS, Frimley and then, from 1989, at the GEC Marconi Hirst Research Centre in London. In 1995 he became a lecturer in the Department of Computer Science at the University of Reading, and in 2004 he became a professor in the School of Computer Science and Information Systems at Birkbeck College, University of London. His research interests include camera calibration, visual surveillance, tracking, filtering, applications of projective geometry to computer vision, and applications of probability, statistics, and information theory to computer vision. He is the author of more than 90 scientific publications and one book. He is a Fellow of the Institute of Mathematics and its Applications, a Fellow of the Royal Statistical Society, and a Senior Member of the IEEE. For further information see http://www.dcs.bbk.ac.uk/~sjmaybank.
About this article
Cite this article
Tao, D., Li, X., Wu, X. et al. Supervised tensor learning. Knowl Inf Syst 13, 1–42 (2007). https://doi.org/10.1007/s10115-006-0050-6
DOI: https://doi.org/10.1007/s10115-006-0050-6