Classification using distances from samples to linear manifolds

  • Short Paper
  • Published in Pattern Analysis and Applications

Abstract

A classifier is proposed in which the distances from samples to linear manifolds (DSL) are used to perform classification. For each class, a linear manifold is built whose dimension is high enough to pass through all the training samples of that class. The distance from a query sample to a linear manifold is converted into the distance from a point to a linear subspace, and a simple, stable formula for this distance is derived using the geometry of the Gram matrix together with a regularization technique. The query sample is assigned to the class whose linear manifold is nearest. Experiments on one synthetic data set, thirteen binary-class data sets, and six multi-class data sets show that the classification performance of DSL is competitive. On most of the data sets, DSL outperforms the compared classifiers based on k nearest samples or subspaces, and on some data sets it is even superior to support vector machines. A further experiment demonstrates that the test efficiency of DSL is also competitive with kNN and related state-of-the-art classifiers on many data sets.
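To make the decision rule concrete, the following is a minimal sketch of a DSL-style classifier with a linear kernel (so the feature map is the identity). It is an illustration under our own naming conventions, not the authors' implementation; `mu` plays the role of the regularization parameter.

```python
import numpy as np

def dsl_distance(query, Z, mu=1e-3):
    """Regularized squared distance from `query` to the linear manifold
    passing through the columns of Z (one class's training samples).
    The manifold is z_1 + span{z_2 - z_1, ..., z_m - z_1}; the ridge
    term mu*I keeps the Gram matrix invertible.  Linear-kernel sketch
    only; names are illustrative, not taken from the paper's code."""
    z1 = Z[:, :1]                              # anchor point on the manifold
    C = Z[:, 1:] - z1                          # spanning directions
    a = query.reshape(-1, 1) - z1              # query relative to the anchor
    G = C.T @ C + mu * np.eye(C.shape[1])      # regularized Gram matrix
    proj = C @ np.linalg.solve(G, C.T @ a)     # regularized projection of a onto span(C)
    return (a.T @ (a - proj)).item()           # a'a - a'C(C'C + mu I)^{-1}C'a

def dsl_classify(query, class_samples, mu=1e-3):
    """Assign `query` to the class whose linear manifold is nearest."""
    return int(np.argmin([dsl_distance(query, Z, mu) for Z in class_samples]))

# toy usage: two classes, three samples each, in R^4
rng = np.random.default_rng(0)
classes = [rng.normal(loc=c, size=(4, 3)) for c in (0.0, 3.0)]
print(dsl_classify(rng.normal(loc=3.0, size=4), classes))    # expected: 1
```

In the kernel setting of the paper, the inner products above are evaluated through k(·, ·) on the mapped samples ψ(·), which leads to the Gram-matrix expression derived in Appendix 1.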


Notes

  1. Available at http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/neural/bench/cmu/bench.tgz.

Abbreviations

\(I_{m}\): The identity matrix of dimension m; I denotes an identity matrix of appropriate dimension

n: The dimension of the input space

\(\mathcal{X}_{i,j}\): The matrix \([x_{i},\ldots,x_{j}]\) with \(x_{i},\ldots,x_{j} \in \mathbb{R}^{n}\)

k(x, y): A kernel function

|A|: The determinant of a square matrix A, or the absolute value of a scalar A

h: The total number of classes

\(m_{i}\): The number of training samples in the ith class

\(l_{i}\): An \((m_{i}-1)\)-dimensional vector with all entries equal to 1

\(z_{i,j}\): The jth training sample of the ith class, with 1 ≤ i ≤ h and 1 ≤ j ≤ \(m_{i}\)

\(\mathcal{Z}_{i}\): The matrix \([z_{i,1},\ldots,z_{i,m_{i}}]\)

\(\mathcal{Z}_{i,j,k}\): The matrix \([z_{i,j},\ldots,z_{i,k}]\) with j ≤ k

\(s_{q}\): A query sample

References

  1. Fix E, Hodges J (1951) Discriminatory analysis, nonparametric discrimination: consistency properties. Tech. Rep. 4, USAF School of Aviation Medicine, Randolph Field

  2. Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13:21–27

  3. Li B, Chen YW, Chen YQ (2008) The nearest neighbor algorithm of local probability centers. IEEE Trans Syst Man Cybern Part B 38:141–154

  4. Wang L, Suter D (2007) Learning and matching of dynamic shape manifolds for human action recognition. IEEE Trans Image Process 16:1646–1661

  5. Ge SS, Yang Y, Lee TH (2008) Hand gesture recognition and tracking based on distributed locally linear embedding. Image Vis Comput 26:1607–1620

  6. García-Pedrajas N (2009) Constructing ensembles of classifiers by means of weighted instance selection. IEEE Trans Neural Netw 20:258–277

  7. Sánchez JS, Mollineda RA, Sotoca JM (2007) An analysis of how training data complexity affects the nearest neighbor classifiers. Pattern Anal Appl 10(3):189–201

  8. Wang J, Neskovic P, Cooper LN (2006) Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence. Pattern Recognit 39:417–423

  9. Fayed HA, Atiya AF (2009) A novel template reduction approach for the k-nearest neighbor method. IEEE Trans Neural Netw 20:890–896

  10. Athitsos V, Alon J, Sclaroff S, Kollios G (2008) BoostMap: an embedding method for efficient nearest neighbor retrieval. IEEE Trans Pattern Anal Mach Intell 30:89–104

  11. Domeniconi C, Peng J, Gunopulos D (2002) Locally adaptive metric nearest-neighbor classification. IEEE Trans Pattern Anal Mach Intell 24:1281–1285

  12. Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell 18:607–616

  13. Weinberger KQ, Blitzer J, Saul LK (2006) Distance metric learning for large margin nearest neighbor classification. In: NIPS. MIT Press, Cambridge

  14. Zuo W, Zhang D, Wang K (2008) On kernel difference-weighted k-nearest neighbor classification. Pattern Anal Appl 11(3–4):247–257

  15. Alkoot FM, Kittler J (2002) Moderating k-NN classifiers. Pattern Anal Appl 5(3):326–332

  16. García V, Mollineda RA, Sánchez JS (2008) On the k-NN performance in a challenging scenario of imbalance and overlapping. Pattern Anal Appl 11:269–280

  17. Zhang P, Peng J, Domeniconi C (2005) Kernel pooled local subspaces for classification. IEEE Trans Syst Man Cybern Part B 35:489–502

  18. Balachander T, Kothari R (1999) Kernel based subspace pattern classification. In: Proc Int Joint Conf Neural Netw 5:3119–3122

  19. Nalbantov GI, Groenen PJF, Bioch JC (2007) Nearest convex hull classification. Tech. Rep. EI 2006-50, Econometric Institute

  20. Kumar MP, Torr P, Zisserman A (2007) An invariant large margin nearest neighbour classifier. In: IEEE 11th International Conference on Computer Vision (ICCV 2007), vol 2, pp 1–8

  21. Vincent P, Bengio Y (2001) K-local hyperplane and convex distance nearest neighbor algorithms. In: NIPS

  22. Cevikalp H, Larlus D, Neamtu M, Triggs B, Jurie F (2010) Manifold based local classifiers: linear and nonlinear approaches. J Signal Process Syst 61(1):61–73

  23. Cevikalp H, Triggs B, Polikar R (2008) Nearest hyperdisk methods for high-dimensional classification. In: Cohen WW, McCallum A, Roweis ST (eds) ICML. ACM International Conference Proceeding Series, vol 307. ACM, Helsinki, pp 120–127

  24. Samet H (2008) K-nearest neighbor finding using MaxNearestDist. IEEE Trans Pattern Anal Mach Intell 30:243–252

  25. Cristescu R (1977) Topological vector spaces. Editura Academiei, Bucharest

  26. Lee J, Zhang C (2006) Classification of gene-expression data: the manifold-based metric learning way. Pattern Recognit 39:2450–2463

  27. Vapnik VN (1998) Statistical learning theory. Wiley-Interscience, New York

  28. Barth N (1999) The Gramian and k-volume in n-space: some classical results in linear algebra. J Young Investig 2 (Online; accessed 19 July 2011)

  29. Simard PY, LeCun YA, Denker JS, Victorri B (1998) Transformation invariance in pattern recognition: tangent distance and tangent propagation. In: Orr GB, Müller K-R (eds) Neural networks: tricks of the trade. Lecture Notes in Computer Science, vol 1524. Springer, Berlin, pp 239–274

  30. Mangasarian OL, Wild EW (2006) Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Trans Pattern Anal Mach Intell 28:69–74

  31. Cawley G, Talbot N (2007) Miscellaneous MATLAB software (Online; accessed 19 July 2011)

  32. Asuncion A, Newman D (2007) UCI machine learning repository (Online; accessed 19 July 2011)

  33. Gantmacher F (1958) Matrizenrechnung I. VEB Deutscher Verlag der Wissenschaften, Berlin

  34. Meyer CD (2001) Matrix analysis and applied linear algebra. SIAM, Philadelphia

  35. Camastra F, Vinciarelli A (2002) Estimating the intrinsic dimension of data with a fractal-based method. IEEE Trans Pattern Anal Mach Intell 24:1404–1407

  36. Aster R, Borchers B, Thurber C (2005) Tikhonov regularization. Int Geophys 90:89–118

  37. Vapnik VN (1999) The nature of statistical learning theory. Information Science and Statistics. Springer, Berlin

  38. Nene SA, Nayar SK, Murase H (1996) Columbia University image library (Online; accessed 20 July 2011)

  39. Keerthi SS, Lin C-J (2003) Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Comput 15:1667–1689

  40. Vieira DAG, Takahashi RHC, Palade V, Vasconcelos JA, Caminhas WM (2008) The Q-norm complexity measure and the minimum gradient method: a novel approach to the machine learning structural risk minimization problem. IEEE Trans Neural Netw 19:1415–1430

  41. Xu Z, Dai M, Meng D (2009) Fast and efficient strategies for model selection of Gaussian support vector machine. IEEE Trans Syst Man Cybern Part B 39:1292–1307

  42. Mu T, Nandi AK (2009) Multiclass classification based on extended support vector data description. IEEE Trans Syst Man Cybern Part B 39:1206–1216

  43. Schölkopf B, Smola AJ (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge

  44. Liu Y, You Z, Cao L (2006) A novel and quick SVM-based multi-class classifier. Pattern Recognit 39:2258–2264

  45. Graf ABA, Smola AJ, Borer S (2003) Classification in a normalized feature space using support vector machines. IEEE Trans Neural Netw 14:597–605


Acknowledgments

We thank the Editors and Reviewers for the time and effort spent in processing our paper. This work is supported by NSFC under Grants 61173182 and 61179071, and by SRFDP under Grant 20090181110052.

Author information

Corresponding author

Correspondence to Yiguang Liu.

Appendix 1: Proof of (16)

Proof:

For a matrix \(\mathcal{C}\) of suitable dimensions, the Sherman–Morrison–Woodbury formula [34] gives the following relation:

$$ {\mathcal{C}}(\mu I+{\mathcal{C}}^{\rm T}{\mathcal{C}})^{-1}{\mathcal{C}}^{\rm T} =I-(I+\mu^{-1}{\mathcal{C}}{\mathcal{C}}^{\rm T})^{-1} $$
(17)

Combining (8) with (17), it follows that

$$ \begin{aligned} d_{i}^{s_{q}}&=[\psi(s_{q})-\psi(z_{i,1})]^{\rm T}[\psi(s_{q})-\psi(z_{i,1})]- [\psi(s_{q})-\psi(z_{i,1})]^{\rm T}[\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]\\ &\quad\times\left[[\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]^{\rm T}[\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]+\mu I\right]^{-1}\\ &\quad\times [\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]^{\rm T}[\psi(s_{q})-\psi(z_{i,1})]\\ &=[\psi(s_{q})-\psi(z_{i,1})]^{\rm T}[\psi(s_{q})-\psi(z_{i,1})]- [\psi(s_{q})-\psi(z_{i,1})]^{\rm T}\\ &\quad \times\left[I-\mu\left[[\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}] [\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]^{\rm T}+\mu I\right]^{-1}\right][\psi(s_{q})-\psi(z_{i,1})]\\ &=[\psi(s_{q})-\psi(z_{i,1})]^{\rm T} \mu\left[[\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}] [\psi({\mathcal{Z}}_{i,2,m_{i}})-\psi(z_{i,1})l^{\rm T}_{i}]^{\rm T}+\mu I\right]^{-1}[\psi(s_{q})-\psi(z_{i,1})]\\ \end{aligned} $$

which is (16). \(\square\)
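As a quick numerical sanity check (ours, not part of the paper), the snippet below verifies identity (17) for a random matrix and confirms that the final expression above agrees with the regularized distance it was derived from, taking ψ to be the identity map (linear kernel); all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, mu = 6, 4, 0.1

# Identity (17): C (mu I + C'C)^{-1} C' = I - (I + mu^{-1} C C')^{-1}
C = rng.normal(size=(n, m))
lhs = C @ np.linalg.solve(mu * np.eye(m) + C.T @ C, C.T)
rhs = np.eye(n) - np.linalg.inv(np.eye(n) + C @ C.T / mu)
assert np.allclose(lhs, rhs)

# Equivalence of the regularized distance and the final expression, with psi = identity
Z = rng.normal(size=(n, m))                 # training samples z_{i,1}, ..., z_{i,m_i} of one class
s = rng.normal(size=(n, 1))                 # query sample s_q
a = s - Z[:, :1]                            # psi(s_q) - psi(z_{i,1})
B = Z[:, 1:] - Z[:, :1]                     # psi(Z_{i,2,m_i}) - psi(z_{i,1}) l_i'
d_start = a.T @ a - a.T @ B @ np.linalg.solve(B.T @ B + mu * np.eye(m - 1), B.T @ a)
d_final = mu * a.T @ np.linalg.solve(B @ B.T + mu * np.eye(n), a)
assert np.allclose(d_start, d_final)
print("(17) and the derivation of (16) check out numerically")
```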

About this article

Cite this article

Liu, Y., Cao, X. & Liu, J.G. Classification using distances from samples to linear manifolds. Pattern Anal Applic 16, 417–430 (2013). https://doi.org/10.1007/s10044-011-0242-x
