Learning Under Data Shift for Domain Adaptation: A Model-Based Co-clustering Transfer Learning Solution

  • Conference paper
Knowledge Management and Acquisition for Intelligent Systems (PKAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9806)

Abstract

Data shift in machine learning violates the common assumption that training and test samples are drawn from the same distribution. Most algorithms that address data shift first estimate the training and test distributions and then reweight samples according to those estimates. Because a precise distribution is difficult to estimate, these conventional methods cannot achieve good classification performance. In this paper, we introduce two types of data-shift problems and propose a model-based co-clustering transfer learning solution that handles both data-shift scenarios consistently. Experimental results demonstrate that the proposed method achieves better generalization and running efficiency than traditional methods under data shift or covariate shift settings.
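
The conventional estimate-then-reweight approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's co-clustering method. It assumes one-dimensional features and models each distribution with a single Gaussian, a deliberately crude choice that hints at why the density-estimation step is fragile. The function names `fit_gaussian`, `importance_weights`, and `weighted_error` are hypothetical and chosen only for this illustration.

```python
import math


def fit_gaussian(xs):
    """Return (mean, std) of a 1-D sample; std is floored to avoid division by zero."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, max(math.sqrt(var), 1e-6)


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(train_x, test_x):
    """Estimate w(x) = p_test(x) / p_train(x) for every training point.

    Assumption: both densities are approximated by fitted Gaussians.
    The weights are rescaled so that their mean is 1.
    """
    mu_tr, sd_tr = fit_gaussian(train_x)
    mu_te, sd_te = fit_gaussian(test_x)
    raw = [
        gaussian_pdf(x, mu_te, sd_te) / gaussian_pdf(x, mu_tr, sd_tr)
        for x in train_x
    ]
    total = sum(raw)
    return [w * len(raw) / total for w in raw]


def weighted_error(weights, y_true, y_pred):
    """Importance-weighted misclassification rate on the training sample."""
    wrong = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    return wrong / sum(weights)


if __name__ == "__main__":
    # Toy covariate shift: the test inputs are the training inputs moved right by 1.
    train_x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    test_x = [-1.0, 0.0, 1.0, 2.0, 3.0]
    for x, w in zip(train_x, importance_weights(train_x, test_x)):
        print(f"x={x:+.1f}  weight={w:.3f}")
```

In this toy setup, training points that lie closer to the test inputs receive larger weights, so errors on them count more in `weighted_error`. Any error in the two density estimates propagates directly into the weights, which is the difficulty the abstract points to.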

Corresponding author

Correspondence to Xiaoying Gao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kumar, S., Gao, X., Welch, I. (2016). Learning Under Data Shift for Domain Adaptation: A Model-Based Co-clustering Transfer Learning Solution. In: Ohwada, H., Yoshida, K. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2016. Lecture Notes in Computer Science, vol 9806. Springer, Cham. https://doi.org/10.1007/978-3-319-42706-5_4

  • DOI: https://doi.org/10.1007/978-3-319-42706-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42705-8

  • Online ISBN: 978-3-319-42706-5

  • eBook Packages: Computer Science, Computer Science (R0)
