A graph optimization method for dimensionality reduction with pairwise constraints

Zhang, Limei; Qiao, Lishan

doi:10.1007/s13042-014-0321-6

Original Article
Published: 09 January 2015

Volume 8, pages 275–281, (2017)
International Journal of Machine Learning and Cybernetics

Limei Zhang¹ &
Lishan Qiao¹

Abstract

Graphs are at the heart of many dimensionality reduction (DR) methods. Despite their importance, how to construct a high-quality graph remains an open problem. Recently, a new DR algorithm called graph-optimized locality preserving projections (GoLPP) was proposed to perform graph construction and DR simultaneously in a unified objective function, resulting in an automatically optimized graph rather than the pre-specified one used in typical LPP. However, GoLPP is unsupervised and cannot naturally incorporate supervised information, owing to a strong sum-to-one constraint on the graph weights in its model. To address this problem, in this paper we give an improved GoLPP model by relaxing the constraint, and then develop a semi-supervised GoLPP (S-GoLPP) algorithm by incorporating pairwise constraint information into the model. Interestingly, we obtain a semi-supervised closed-form graph-updating formula with a natural possibilistic interpretation. The feasibility and effectiveness of the proposed method are verified on several publicly available UCI and face data sets.

Notes

Here, “possibilistic” is used in contrast to “probabilistic” to indicate that the row sums of the weight matrix are not necessarily 1.
In fact, the solution obtained in this way is not exact; this relates to the trace ratio and ratio trace problems, which go beyond our main focus. See [22] for more details.

References

[1] Yan SC, Xu D, Zhang BY, Zhang HJ, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51
[2] He XF, Niyogi P (2004) Locality preserving projections. Neural Inf Process Syst (NIPS) 16:153–160
[3] Wang H, Zheng W (2014) Robust sparsity-preserved learning with application to image visualization. Knowl Inf Syst 39(2):287–304
[4] Dehmer M, Emmert-Streib F (2007) Comparing large graphs efficiently by margins of feature vectors. Appl Math Comput 188(2):1699–1710
[5] Wan M, Lai Z, Jin Z (2011) Feature extraction using two-dimensional local graph embedding based on maximum margin criterion. Appl Math Comput 217(23):9659–9668
[6] Kim YG, Song YJ, Chang UD, Kim DW, Yun TS, Ahn JH (2008) Face recognition using a fusion method based on bidirectional 2DPCA. Appl Math Comput 205(2):601–607
[7] Musa AB (2014) A comparison of ℓ1-regularizion, PCA, KPCA and ICA for dimensionality reduction in logistic regression. Int J Mach Learn Cybern 5(6):861–873
[8] Hasan BAS, Gan JQ, Tsui CSL (2014) A filter-dominating hybrid sequential forward floating search method for feature subset selection in high-dimensional space. Int J Mach Learn Cybern 5(3):413–423
[9] Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
[10] Fang Y, Wang R, Dai B (2012) Graph-oriented learning via automatic group sparsity for data analysis. In: IEEE 12th international conference on data mining (ICDM), pp 251–259
[11] Zhu X (2008) Semi-supervised learning literature survey. Technical report, University of Wisconsin, Madison
[12] Liu W, Chang S-F (2009) Robust multi-class transductive learning with graphs. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 381–388
[13] Maier M, Luxburg U (2008) Influence of graph construction on graph-based clustering measures. Neural Inf Process Syst (NIPS)
[14] Dornaika F, Assoum A (2013) Enhanced and parameterless locality preserving projections for face recognition. Neurocomputing 99:448–457
[15] Zhao HT, Wong WK (2012) Supervised optimal locality preserving projection. Pattern Recognit 45:186–197
[16] Yang B, Chen S (2010) Sample-dependent graph construction with application to dimensionality reduction. Neurocomputing 74(1–3):301–314
[17] Zhang L, Qiao L, Chen S (2010) Graph-optimized locality preserving projections. Pattern Recognit 43(6):1993–2002
[18] Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York
[19] Cai D, He XF, Han JW (2007) Semi-supervised discriminant analysis. In: IEEE 11th international conference on computer vision (ICCV), pp 1–7
[20] Mizutani K, Miyamoto S (2005) Possibilistic approach to kernel-based fuzzy c-means clustering with entropy regularization. In: Torra V, Narukawa Y, Miyamoto S (eds) Modeling decisions for artificial intelligence. Springer, Berlin, Heidelberg
[21] Pal NR, Pal K, Keller JM, Bezdek JC (2005) A possibilistic fuzzy c-means clustering algorithm. IEEE Trans Fuzzy Syst 13(4):517–530
[22] Wang H, Yan SC, Xu D, Tang XO, Huang T (2007) Trace ratio vs. ratio trace for dimensionality reduction. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1–8
[23] Lee KC, Ho J, Kriegman DJ (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 27(5):684–698
[24] Qiao L, Zhang L, Chen S (2013) Dimensionality reduction with adaptive graph. Front Comput Sci 7(5):745–753

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China and the Natural Science Foundation of Shandong under Grant Nos. 61300154, 11326182, 61402215 and ZR2012FQ005.

Author information

Authors and Affiliations

Department of Mathematics Science, Liaocheng University, Liaocheng, 252000, China
Limei Zhang & Lishan Qiao

Corresponding author

Correspondence to Lishan Qiao.

Appendix: How to solve the weight matrix $S = (S_{ij})_{n \times n}$ in problem (5)

As seen from Step 2 in Sect. 3.3, given $W$, computing $S = (S_{ij})_{n \times n}$ in model (4) is equivalent to solving the following optimization problem:

$$\begin{aligned} & \min_{S_{ij}} \sum_{i,j=1}^{n} \|W^{T} x_{i} - W^{T} x_{j}\|^{2} S_{ij} + \eta \sum_{i,j=1}^{n} S_{ij} \ln (S_{ij}/\alpha) \\ & \text{s.t.} \quad \sum_{j=1}^{n} S_{ij} > 0, \quad i = 1, \ldots, n \\ & S_{ij} = 1, \quad \text{if } (x_{i}, x_{j}) \in M, \quad i,j = 1, \ldots, n \\ & S_{ij} = 0, \quad \text{if } (x_{i}, x_{j}) \in C, \quad i,j = 1, \ldots, n \\ & S_{ij} \ge 0, \quad \text{otherwise} \end{aligned}$$

(5)

The constraints of (5) directly fix the weights of the constrained pairs:

$$\begin{aligned} & S_{ij} = 1, \quad \text{if } (x_{i}, x_{j}) \in M, \quad i,j = 1, \ldots, n; \\ & S_{ij} = 0, \quad \text{if } (x_{i}, x_{j}) \in C, \quad i,j = 1, \ldots, n. \end{aligned}$$

Hence, we only need to solve for the weights $S_{ij}$ of the sample pairs that carry no constraint. That is, we consider the following problem:

$$\begin{aligned} & \min_{S_{ij}} \sum_{i,j=1}^{n} \|W^{T} x_{i} - W^{T} x_{j}\|^{2} S_{ij} + \eta \sum_{i,j=1}^{n} S_{ij} \ln (S_{ij}/\alpha) \\ & \text{s.t.} \quad \sum_{j=1}^{n} S_{ij} > 0, \quad i = 1, \ldots, n \\ & S_{ij} \ge 0, \quad i,j = 1, \ldots, n \end{aligned}$$

(6)

where $(x_{i}, x_{j}) \notin M$ and $(x_{i}, x_{j}) \notin C$. To optimize $S_{ij}$, we form the Lagrangian function as follows:

$$L(S_{ij}, \lambda_{i}, \mu_{ij}) = \sum_{i,j=1}^{n} \|W^{T} x_{i} - W^{T} x_{j}\|^{2} S_{ij} + \eta \sum_{i,j=1}^{n} S_{ij} \ln (S_{ij}/\alpha) - \sum_{i=1}^{n} \lambda_{i} \sum_{j=1}^{n} S_{ij} - \sum_{i,j=1}^{n} \mu_{ij} S_{ij}.$$

By the KKT complementary slackness conditions,

$$\lambda_{i} \sum_{j=1}^{n} S_{ij} = 0, \quad \mu_{ij} S_{ij} = 0,$$

the multiplier terms vanish at the optimum, so the Lagrangian function simplifies to

$$L(S_{ij}) = \sum_{i,j=1}^{n} \|W^{T} x_{i} - W^{T} x_{j}\|^{2} S_{ij} + \eta \sum_{i,j=1}^{n} S_{ij} \ln (S_{ij}/\alpha).$$

Setting the partial derivative to zero, $\frac{\partial L}{\partial S_{ij}} = \|W^{T} x_{i} - W^{T} x_{j}\|^{2} + \eta (\ln (S_{ij}/\alpha) + 1) = 0$, we have

$$S_{ij} = \alpha \exp(-1) \cdot \exp\left(-\|W^{T} x_{i} - W^{T} x_{j}\|^{2}/\eta\right)$$

(7)

where $(x_{i}, x_{j}) \notin M$, $(x_{i}, x_{j}) \notin C$, and $\alpha$ is a positive parameter. Intuitively, the weight $S_{ij}$ should approach 1 as the distance between two samples tends to 0. Following this intuition, we set $\alpha = e$ and obtain

$$S_{ij} = \exp\left(-\|W^{T} x_{i} - W^{T} x_{j}\|^{2}/\eta\right).$$
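As a quick numerical sanity check of this step, the sketch below evaluates the per-pair term of problem (6) for a single unconstrained pair. It confirms that the closed-form weight zeroes the derivative, that nearby weights do not give a smaller value (the second derivative $\eta / S_{ij}$ is positive for $S_{ij} > 0$), and that $\alpha = e$ yields a weight of 1 at zero distance. The function names and the values of `d` and `eta` are illustrative assumptions, not taken from the paper.

```python
import math

def pair_objective(s, d, eta, alpha):
    """Per-pair term of problem (6): d * s + eta * s * ln(s / alpha), for s > 0."""
    return d * s + eta * s * math.log(s / alpha)

def closed_form_weight(d, eta, alpha=math.e):
    """Stationary point from Eq. (7): alpha * exp(-1) * exp(-d / eta)."""
    return alpha * math.exp(-1.0) * math.exp(-d / eta)

# Illustrative values (assumptions): squared projected distance d and parameter eta.
d, eta = 0.8, 0.5
s_star = closed_form_weight(d, eta)

# The derivative d + eta * (ln(s / alpha) + 1) should vanish at s_star.
grad = d + eta * (math.log(s_star / math.e) + 1.0)

# Weights slightly to either side of s_star should not lower the objective.
eps = 1e-3
is_local_min = all(
    pair_objective(s_star, d, eta, math.e)
    <= pair_objective(s_star + t, d, eta, math.e)
    for t in (-eps, eps)
)

# With alpha = e, two coincident samples (d = 0) receive weight 1.
weight_at_zero_distance = closed_form_weight(0.0, eta)
```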

Finally, the solution of problem (5) is summarized as follows:

$$\hat{S}_{ij} = \begin{cases} 1, & \text{if } (x_{i}, x_{j}) \in M \\ 0, & \text{if } (x_{i}, x_{j}) \in C \\ \exp\left(-\dfrac{\|W^{T} x_{i} - W^{T} x_{j}\|^{2}}{\eta}\right), & \text{otherwise.} \end{cases}$$
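The piecewise update can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `update_graph`, the row-per-sample layout of `X`, the representation of $M$ and $C$ as lists of index pairs, and the symmetric treatment of each constrained pair are all assumptions made here for the example.

```python
import numpy as np

def update_graph(X, W, must_link, cannot_link, eta):
    """Closed-form graph update for a fixed projection W.

    X: (n, d) array, one sample per row (this layout is an assumption).
    W: (d, m) projection matrix.
    must_link / cannot_link: iterables of index pairs (i, j) for M and C.
    eta: positive regularization parameter.
    """
    Y = X @ W                                   # row i holds W^T x_i
    sq_norms = np.sum(Y * Y, axis=1)
    # Squared Euclidean distances between all projected pairs.
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    D = np.maximum(D, 0.0)                      # clip tiny negatives from round-off
    S = np.exp(-D / eta)                        # weights of unconstrained pairs
    for i, j in must_link:                      # pairs in M get weight 1
        S[i, j] = S[j, i] = 1.0                 # symmetric fill is an assumption
    for i, j in cannot_link:                    # pairs in C get weight 0
        S[i, j] = S[j, i] = 0.0
    return S

# Toy data (illustrative only): 4 samples in 2-D projected onto 1-D.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
W = np.array([[1.0], [0.0]])
S = update_graph(X, W, must_link=[(0, 3)], cannot_link=[(0, 1)], eta=2.0)
row_sums = S.sum(axis=1)
```

In this sketch the diagonal entries stay at 1 because each sample has zero distance to itself, and the row sums are positive but not normalized to 1, which matches the possibilistic reading of the weights.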

About this article

Cite this article

Zhang, L., Qiao, L. A graph optimization method for dimensionality reduction with pairwise constraints. Int. J. Mach. Learn. & Cyber. 8, 275–281 (2017). https://doi.org/10.1007/s13042-014-0321-6

Received: 05 December 2013
Accepted: 09 December 2014
Published: 09 January 2015
Issue Date: February 2017
DOI: https://doi.org/10.1007/s13042-014-0321-6
