Classification Probabilistic PCA with Application in Domain Adaptation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6634)

Abstract

Conventional dimensionality reduction algorithms such as principal component analysis (PCA) and non-negative matrix factorization (NMF) are unsupervised. Supervised probabilistic PCA (SPPCA) can exploit label information, but that information is treated as regression targets rather than as discrete nominal labels. We propose classification probabilistic PCA (CPPCA), an extension of probabilistic PCA in which, unlike SPPCA, the class label information enters the model as a class probability through a sigmoid function. Because the resulting posterior distribution of the latent variables is non-Gaussian, we use a Laplace approximation within Expectation Maximization (EM) to obtain the solution. The formulation is applied to a domain adaptation classification problem in which the labeled training data and the unlabeled test data come from different but related domains. Experimental results show that the proposed model achieves higher accuracy than conventional probabilistic PCA, SPPCA and its semi-supervised version, and that it performs comparably to popular dedicated domain adaptation algorithms, structural correspondence learning (SCL) and its variants.
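
The abstract only sketches the model, so the following is a minimal illustrative sketch of the non-Gaussian E-step it describes, not the authors' implementation. It assumes the standard PPCA generative model x = Wz + mu + eps with z ~ N(0, I) and isotropic noise sigma2, augmented with a label likelihood p(y = 1 | z) = sigmoid(v^T z + b); all names here (laplace_e_step, W, v, b) are hypothetical. Newton's method locates the mode of p(z | x, y), and the inverse Hessian of the negative log posterior at that mode serves as the Laplace covariance.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def laplace_e_step(x, y, W, mu, sigma2, v, b, n_iter=20, tol=1e-8):
    """Laplace approximation q(z) = N(m, S) to p(z | x, y) for one labeled point.

    x : (d,) observation, y : 0/1 label, W : (d, q) loadings, v : (q,) label weights.
    """
    q = W.shape[1]
    # Gaussian terms of -log p(z | x, y): 0.5 * z^T A z - z^T r (+ const).
    A = np.eye(q) + W.T @ W / sigma2
    r = W.T @ (x - mu) / sigma2
    m = np.linalg.solve(A, r)  # start Newton's method at the label-free PPCA mode
    for _ in range(n_iter):
        p = sigmoid(v @ m + b)
        grad = A @ m - r - (y - p) * v           # gradient of -log posterior
        H = A + p * (1.0 - p) * np.outer(v, v)   # Hessian, positive definite
        step = np.linalg.solve(H, grad)
        m = m - step
        if np.linalg.norm(step) < tol:
            break
    p = sigmoid(v @ m + b)
    S = np.linalg.inv(A + p * (1.0 - p) * np.outer(v, v))  # Laplace covariance
    return m, S

# Tiny usage example with random parameters (d = 5 features, q = 2 latent dims).
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 2))
x = rng.normal(size=5)
m, S = laplace_e_step(x, y=1, W=W, mu=np.zeros(5), sigma2=0.5,
                      v=np.array([1.0, -1.0]), b=0.0)
```

In a full EM loop, the M-step would then re-estimate W, mu, sigma2, v and b from the per-point moments E[z] = m and E[zz^T] = S + mm^T; for unlabeled target-domain points the sigmoid factor is simply dropped, so their posterior reduces to the exact Gaussian PPCA posterior, which is what lets both domains share one latent space.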


References

  1. Blitzer, J., Dredze, M., Pereira, F.: Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In: Annual Meeting of the Association for Computational Linguistics, pp. 440–447 (2007)

  2. Blitzer, J., McDonald, R., Pereira, F.: Domain adaptation with structural correspondence learning. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, pp. 120–128. Association for Computational Linguistics (July 2006)

  3. Cai, D., He, X., Han, J.: Semi-supervised discriminant analysis. In: IEEE 11th International Conference on Computer Vision, ICCV 2007, pp. 1–7 (2007)

  4. Cristianini, N., Shawe-Taylor, J.: An introduction to support vector machines: and other kernel-based learning methods, 1st edn. Cambridge University Press, Cambridge (March 2000)

  5. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. Journal of the American Society for Information Science 41, 391–407 (1990)

  6. Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.J.: A kernel method for the two-sample problem. CoRR, abs/0805.2368 (2008)

  7. Hofmann, T.: Probabilistic latent semantic analysis. In: Proc. of Uncertainty in Artificial Intelligence, UAI 1999, Stockholm (1999)

  8. Huang, J., Smola, A.J., Gretton, A., Borgwardt, K.M., Schölkopf, B.: Correcting sample selection bias by unlabeled data. In: Advances in Neural Information Processing Systems, vol. 19, pp. 601–608 (2007)

  9. Jolliffe, I.T.: Principal Component Analysis. Springer, Heidelberg (2002)

  10. Kim, H., Howland, P., Park, H.: Dimension reduction in text classification with support vector machines. J. Mach. Learn. Res. 6, 37–53 (2005)

  11. Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: NIPS, pp. 556–562 (2000)

  12. Pan, S.J., Kwok, J.T., Yang, Q.: Transfer learning via dimensionality reduction. In: Proc. of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)

  13. Welling, M.: Fisher linear discriminant analysis, http://www.ics.uci.edu/~welling/classnotes/papers_class/Fisher-LDA.pdf [Online; accessed August 15, 2010]

  14. Yu, S., Yu, K., Tresp, V., Kriegel, H.-P., Wu, M.: Supervised probabilistic principal component analysis. In: Ungar, L. (ed.) 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 464–473. ACM Press, New York (August 2006)

  15. Zadrozny, B.: Learning and evaluating classifiers under sample selection bias. In: International Conference on Machine Learning (ICML 2004), pp. 903–910 (2004)




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheng, V., Li, C.H. (2011). Classification Probabilistic PCA with Application in Domain Adaptation. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science (LNAI), vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_7

  • DOI: https://doi.org/10.1007/978-3-642-20841-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20840-9

  • Online ISBN: 978-3-642-20841-6

  • eBook Packages: Computer Science, Computer Science (R0)
