Abstract
Big data has many divergent types of sources, from physical (sensor/IoT) to social and cyber (web) types, rendering it messy, imprecise, and incomplete. Due to its quantitative (volume and velocity) and qualitative (variety) challenges, big data to the users resembles “the elephant to the blind men”. It is imperative to enact a major paradigm shift in data mining and learning tools so that information from diversified sources can be integrated to unravel the information hidden in massive and messy big data and, metaphorically speaking, let the blind men “see” the elephant. This talk addresses yet another vital “V”-paradigm: “Visualization”. Visualization tools are meant to supplement (not replace) domain expertise (e.g. that of a cardiologist) and provide a big picture to help users formulate critical questions and subsequently postulate heuristic and insightful answers. For big data, the curse of high feature dimensionality raises grave concerns about computational complexity and over-training. In this talk, we shall explore various projection methods for dimension reduction - a prelude to visualization of vectorial and non-vectorial data. A popular visualization tool for unsupervised learning is Principal Component Analysis (PCA). PCA aims at the best recoverability of the original data in the Euclidean Vector Space (EVS). However, PCA is not effective for supervised and collaborative learning environments. Discriminant Component Analysis (DCA), basically a supervised PCA, can be derived via the notion of a Canonical Vector Space (CVS). The signal-subspace components of DCA are associated with the discriminant distance/power (related to classification effectiveness), while the noise-subspace components of DCA are tightly coupled with recoverability and/or privacy protection.
DCA enjoys two major merits. First, because the rank of the signal subspace is limited by the number of classes, DCA can effectively support classification using a relatively small dimensionality (i.e. high compression). Second, in DCA, the eigenvalues of the noise space are ordered according to their corresponding reconstruction errors and can thus be used to control recoverability or anti-recoverability by applying, respectively, a negative or positive ridge. Via DCA, individual data can be highly compressed before being uploaded to the cloud, thus better enabling privacy protection. In many practical scenarios, additional privacy protection can be incorporated by allowing individual participants to selectively hide some personal features. The classification of masked data calls for a Kernel Approach to Incomplete Data Analysis (KAIDA). More specifically, we extend PCA/DCA to their kernel variants. The success of kernel machines hinges upon the kernel function adopted to characterize the similarity of pairs of partially specified vectors. Simulations on the HAR dataset confirm that DCA far outperforms PCA, in both their conventional and kernelized variants. For the latter, the visualization/classification results suggest favorable performance by the proposed partial correlation kernels over the imputed RBF kernel. In addition, the visualization results point to a potentially promising approach via multiple kernels, such as combining an imputed Gaussian RBF kernel and a non-imputed partial correlation kernel.
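The DCA idea sketched in the abstract - supervised discriminant directions obtained from a generalized eigenproblem of between-class and (ridge-regularized) within-class scatter - can be illustrated numerically. The following is a minimal sketch on synthetic two-class data; the data, the ridge value, and all variable names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
# Two synthetic Gaussian classes in 5-D (illustrative data).
X0 = rng.normal(loc=0.0, size=(100, 5))
X1 = rng.normal(loc=2.0, size=(100, 5))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

mu = X.mean(axis=0)
S_W = np.zeros((5, 5))   # within-class scatter
S_B = np.zeros((5, 5))   # between-class scatter
for c in (0, 1):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    S_W += (Xc - mc).T @ (Xc - mc)
    d = (mc - mu).reshape(-1, 1)
    S_B += len(Xc) * (d @ d.T)

rho = 1e-3  # small ridge keeping the within-class scatter well-conditioned
# Generalized eigenproblem: S_B w = lambda (S_W + rho I) w.
evals, W = eigh(S_B, S_W + rho * np.eye(5))
# Signal subspace: rank(S_B) <= L - 1 = 1 for two classes,
# so a single discriminant component suffices (high compression).
w = W[:, -1]        # leading discriminant direction
proj = X @ w        # 1-D projection for visualization
```

The rank bound is what allows DCA to compress aggressively: with L classes, at most L - 1 signal-subspace components carry discriminant power, regardless of the original feature dimension.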
Notes
Numerically, it is advisable to solve the generalized eigenvalue problem of the matrix pair \(\{\bar {\textbf {S}},\textbf {S}_{\textit {W}}\}\), especially when \(\textbf {S}_{\textit {W}}\) is ill-conditioned. According to the theory of symmetric eigenvalue decomposition, cf. [16], this constraint can always be met; in fact, it suffices to re-normalize the columns so that \(\textbf {w}_{i}^{T}\textbf {S}_{\textit {W}}\textbf {w}_{i}=1\), for i = 1,⋯ ,M.
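The normalization constraint in this note can be checked directly: standard generalized symmetric-definite eigensolvers (e.g. SciPy's `eigh` with a second matrix argument) already return eigenvectors satisfying \(\textbf {w}_{i}^{T}\textbf {S}_{\textit {W}}\textbf {w}_{i}=1\). A small sketch with randomly generated stand-in matrices (names are illustrative):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
S_bar = A @ A.T                        # stand-in for S-bar (symmetric PSD)
B = rng.normal(size=(4, 4))
S_W = B @ B.T + 0.1 * np.eye(4)        # stand-in within-class scatter (SPD)

# Solve the pencil (S_bar, S_W); scipy returns eigenvectors already
# normalized so that W^T S_W W = I, i.e. w_i^T S_W w_i = 1 for all i.
evals, W = eigh(S_bar, S_W)
norms = np.einsum('ij,jk,ki->i', W.T, S_W, W)   # each w_i^T S_W w_i
```

Solving the pencil directly, rather than inverting \(\textbf {S}_{\textit {W}}\) first, is what keeps the computation stable when \(\textbf {S}_{\textit {W}}\) is ill-conditioned.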
For DCA, all the eigenvalues are generically distinct; therefore, all the columns of V are canonically orthogonal to each other [16].
It is well known that PCA in CVS enjoys the validity of the LSP, i.e. \( \tilde {\textbf {U}}= [ \textbf {S}_{\textit {W}}+\rho \textbf {I}]^{-\frac {1}{2}}\bar {\boldsymbol {\Phi }} \textbf {B}\), for some matrix B. The solution in EVS can then be expressed as \(\textbf {U}= [ \textbf {S}_{\textit {W}}+\rho \textbf {I}]^{-\frac {T}{2}} \tilde {\textbf {U}} = [ \textbf {S}_{\textit {W}}+\rho \textbf {I}]^{-1}\bar {\boldsymbol {\Phi }} \textbf {B} = \rho ^{-1} \left [ \textbf {I}- \textbf {S}_{\textit {W}} [\textbf {S}_{\textit {W}}+\rho \textbf {I}]^{-1}\right ]\bar {\boldsymbol {\Phi }} \textbf {B}= \rho ^{-1}\bar {\boldsymbol {\Phi }} \textbf {B} - \textbf {S}_{\textit {W}} \textbf {C}\), where C is defined accordingly. The LSP is established because \(\text {SPAN}(\textbf {S}_{\textit {W}}) \subseteq \text {SPAN}(\bar {\boldsymbol {\Phi }}) \).
In fact, K W is always directly derivable in the empirical space, without explicitly computing S W .
The pair-dependent “partial norms” of a different pair, say {x,z}, will be governed by a different intersecting index-set: I x z , resulting in a different “partial norm”: \(\| \textbf {x} \|_{\textbf {I}_{\textbf {x}\textbf {z}}} \equiv \sqrt { \sum \limits _{i \in \textbf {I}_{\textbf {x}\textbf {z}}} (x^{(i)})^{2}}.\)
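The pair-dependent partial norm above leads naturally to a partial-cosine similarity for incomplete vectors, cf. [9]: both the inner product and the norms are restricted to the intersecting index-set I x z . A minimal sketch, assuming missing entries are encoded as NaN (the function name and encoding are illustrative, not from the paper):

```python
import numpy as np

def partial_cosine(x, z):
    """Cosine similarity restricted to the index set where both
    x and z are observed (missing entries marked as NaN)."""
    I = ~(np.isnan(x) | np.isnan(z))    # intersecting index-set I_xz
    if not I.any():
        return 0.0                      # no commonly observed entries
    nx = np.sqrt(np.sum(x[I] ** 2))     # partial norm ||x||_{I_xz}
    nz = np.sqrt(np.sum(z[I] ** 2))     # partial norm ||z||_{I_xz}
    if nx == 0.0 or nz == 0.0:
        return 0.0
    return float(np.dot(x[I], z[I]) / (nx * nz))

x = np.array([1.0, np.nan, 2.0, 0.0])
z = np.array([2.0, 1.0, np.nan, 0.0])
# Only indices {0, 3} are observed in both vectors, so the partial
# norms of x and z are computed over that pair-dependent index set.
sim = partial_cosine(x, z)
```

Note that the same vector x contributes a different partial norm to each pair it appears in, which is exactly the pair-dependence the note describes.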
References
Aizerman M, Braverman EA, Rozonoer L (1964) Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control 25:821–837
Auerbach D (2014) The big data paradox. Slate
De La Torre F, Kanade T (2006) Discriminative cluster analysis. In: International conference on machine learning. ACM Press, p 241
Duda RO, Hart PE (1973) Pattern classification and scene analysis. Wiley, New York. (See also “Pattern classification”, 2nd edn., Wiley, 2001.)
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugenics 7:179–188
Golub G, Van Loan CF (1996) Matrix computations, 3rd edn. Johns Hopkins University Press, Baltimore
Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12:55–67
Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J Educ Psychol 24(6):417–441
Kung SY, Wu PY (2014) A partial-cosine kernel approach to incomplete data analysis. In: ABDA’14
Kung SY (2014) Kernel methods and machine learning. Cambridge University Press
Laney D (2001) 3D data management: controlling data volume, velocity, and variety. Application Delivery Strategies, META Group Inc.
Liu B, Jiang Y, Sha F, Govindan R (2012) Cloud-enabled privacy-preserving collaborative learning for mobile sensing. In: ACM SenSys’12, Toronto. 978-1-4503-1169-4
Mayer-Schonberger V, Cukier K (2013) Big data: a revolution that will transform how we live, work, and think. Eamon Dolan/Houghton Mifflin Harcourt
Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Trans London Phil Soc A209:415–446
Okada T, Tomita S (1985) An optimal orthonormal system for discriminant analysis. Pattern Recogn 18(2):139–144
Parlett BN (1980) The symmetric eigenvalue problem. In: Prentice-hall series in computational mathematics, vol 07 632. Prentice-Hall, Inc, Englewood Cliffs
Rao CR (1948) The utilization of multiple measurements in problems of biological classification. J R Stat Soc Ser B 10(2):159–203
Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, Hoboken
Scholkopf B, Smola AJ (2002) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge
Scholkopf B, Smola AJ, Muller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10:1299–1319
Tychonoff AN (1943) On the stability of inverse problems. Dokl Akad Nauk SSSR 39(5):195–198
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Wang J, Yong X, Zhang D, You J (2010) An efficient method for computing orthogonal discriminant vectors. Elsevier
Yu Y, McKelvey T, Kung SY (2013) A classification scheme for high-dimensional-small-sample-size data using SODA and ridge-SVM with medical applications. In: Proceedings international conference on acoustics, speech, and signal processing
Zhao WY (1999) Robust image-based 3D face recognition. Ph.D. Thesis, University of Maryland
Acknowledgments
This work was supported in part by the DARPA Brandeis Program starting 2015. The author gratefully thanks his colleagues: Yinan Yu, Pei-Yuan Wu, Daniel Bo-Wei Chen, Ji Wen and Morris Chang, for their invaluable discussions and encouragement.
Cite this article
Kung, SY. Discriminant component analysis for privacy protection and visualization of big data. Multimed Tools Appl 76, 3999–4034 (2017). https://doi.org/10.1007/s11042-015-2959-9
Keywords
- Big data
- Visualization
- Data matrix
- Vectorial and non-vectorial data analysis
- Unsupervised learning
- Supervised learning
- Collaborative learning
- Dimension reduction
- Projection matrix
- Subspace analysis
- Net entropy
- Component analysis
- Discriminant power (DP)
- PCA (principal component analysis)
- MDA (multiple discriminant analysis)
- DCA (discriminant component analysis)
- EVS (Euclidean vector space)
- CVS (canonical vector space)
- Signal subspace
- Discriminant distance (DD)
- Noise subspace
- Recoverability
- Anti-recoverability
- Privacy protection
- Learning subspace property (LSP)
- Kernel machine
- KDCA
- Kernel Approach to Incomplete Data Analysis (KAIDA)