Abstract
Image classification is a key problem in the computer vision community. Most conventional visual recognition systems train an image classifier offline in batch mode, with all training data provided in advance. In many practical applications, however, only a small number of training samples are available at initialization, and many more arrive sequentially during online operation. Because the characteristics of the image data can change dramatically over time, it is important for the classifier to adapt to new data incrementally. In this chapter, we present an online metric learning model that addresses online image classification/scene recognition through adaptive similarity measurement. Given a number of labeled samples followed by a sequential stream of unseen testing samples, the similarity metric is learned to maximize the distance margin between samples of different classes. By incorporating a low-rank constraint, our online metric learning model not only achieves competitive performance compared with state-of-the-art methods but is also guaranteed to converge. A bilinear graph is applied to model pairwise similarity: an unseen sample is labeled by graph-based label propagation, and the model self-updates using new samples whose labels are predicted with high confidence. With its online learning ability, our methodology handles large-scale streaming video data through incremental self-update. We also demonstrate that the low-rank property widely exists in natural data. In the experiments, we apply our model to online scene categorization; results on various benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness and efficiency of our algorithm.
\(\copyright \) [2013] IEEE. Reprinted, with permission, from Yang Cong, Ji Liu, Junsong Yuan, Jiebo Luo “Self-supervised online metric learning with low rank constraint for scene categorization”, IEEE Transactions on Image Processing, Vol. 22, No. 8, August 2013, pp. 3179–3191.
Acknowledgments
This work was supported in part by Natural Science Foundation of China (61105013, 61375014).
Appendix
1.1 Proof of Theorem 1
Proof
Since \(W\) is a PSD matrix, it can be decomposed as \(W=UU^T\) where \(U\in \mathbb {R}^{d\times d}\). Consider the equation \(X^TV = X^TU\) with respect to \(V\). Define \(B\in \mathbb {R}^{d\times (d-r)}\) whose linearly independent columns \(B_{.i}\) span the null space of \(X^T\). One can write the solution as \(V=U+BZ\) where \(Z\in \mathbb {R}^{(d-r)\times d}\). Split \(U\) and \(B\) into two parts \(U={U1 \atopwithdelims ()U2} \) and \(B={B1\atopwithdelims ()B2}\) where \(U1\in \mathbb {R}^{(d-r)\times d}\), \(U2\in \mathbb {R}^{r\times d}\), \(B1\in \mathbb {R}^{(d-r)\times (d-r)}\), and \(B2\in \mathbb {R}^{r\times (d-r)}\). Define \(Z=-B1^{-1}U1\). One verifies that \(V={0 \atopwithdelims ()U2-B2B1^{-1}U1} \) and its rank is at most \(r\). Since \(X^TU=X^TV\), letting \(Q=VV^T\) we obtain \(X^TWX=X^TQX\) with the rank of \(Q\) at most \(r\). \(\blacksquare \)
1.2 Proof of Theorem 2
Proof
Decompose \(C\) into its symmetric and skew-symmetric parts, i.e., \(C = C_y + C_k\) where \(C_y = {1 \over 2}(C+C^T)\) and \( C_k = {1\over 2}(C-C^T)\). Note that \(\langle C_y, C_k\rangle = 0\). Consider \(W\succeq 0\) (\(W\) must be symmetric) in the following
Thus, we obtain \(prox_{\gamma P,\Omega }(C)=prox_{\gamma P,\Omega }(C_y)\).
The first equality uses the dual form of the trace norm of a PSD matrix, where \(\mathbb {S}\) denotes the space of symmetric matrices. The second equality is due to the von Neumann trace theorem. The last equality uses the fact that the projection of a symmetric matrix \(X\) onto the SDP cone is \(X^+\), which also implies that \(W=(C_y-Z)^+\).
It follows that
From the last formulation, we obtain the optimal \(Z^*=\mathcal {T}_\gamma (C_y)\) and the optimal \(W^*=(C_y-Z^*)^+=(C_y-\mathcal {T}_\gamma (C_y))^+=\mathcal {D}_\gamma (C_y)^+=D_\gamma (C_y)\). This completes the proof. \(\blacksquare \)
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this chapter
Cong, Y., Liu, J., Yuan, J., Luo, J. (2014). Low-Rank Online Metric Learning. In: Fu, Y. (eds) Low-Rank and Sparse Modeling for Visual Analysis. Springer, Cham. https://doi.org/10.1007/978-3-319-12000-3_10
DOI: https://doi.org/10.1007/978-3-319-12000-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11999-1
Online ISBN: 978-3-319-12000-3