Abstract
The large majority of existing domain adaptation methods assume that labeled source data and unlabeled target data are freely available. They exploit the discrepancy between the two distributions and build representations common to both the source and target domains. In reality, such a simplifying assumption rarely holds, since source data are routinely subject to legal and contractual constraints between data owners and data customers. Even when access to source domain data is limited, decision-making procedures might be available in the form of, e.g., classification rules trained on the source and made ready for direct deployment and later reuse. In other cases, the owner of the source data is allowed to share a few representative examples, such as class means. The aim of this chapter is therefore to address the domain adaptation problem in such constrained real-world applications, i.e., where the reuse of source domain data is limited to classification rules or a few representative examples. As a solution, we extend recent techniques based on feature corruption and its marginalization, considering both supervised and unsupervised domain adaptation settings. The proposed models are tested and compared on private and publicly available source datasets, showing significant performance gains despite the absence of the whole source data and the shortage of labeled target data.
Notes
1. Minimization with the exponential loss requires a gradient descent technique; the logistic and hinge losses can be approximated by upper bounds.
2. The framework can be easily generalized to Gaussian, Laplace and Poisson noise [319].
3. It corresponds to the DeCAF6 features used in the previous chapters and in several papers.
4. We will make the dataset, with several sets of features, available soon.
5. Note, however, that the parameters obtained by cross-validation on the source are in general suboptimal when applied to the target.
6. It performs well in general, with little impact when we vary it between 10e-3 and 10e-1.
7. Note that the latter is obtained without using the labeled target set and hence will also be used as a baseline in the US scenario.
8. If we assume a large labeled target set, there is no longer any need to use the source data.
Acknowledgements
This work has been supported by Xerox Research Center Europe.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Csurka, G., Chidlovskii, B., Clinchant, S. (2017). What to Do When the Access to the Source Data Is Constrained?. In: Csurka, G. (eds) Domain Adaptation in Computer Vision Applications. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-58347-1_7
DOI: https://doi.org/10.1007/978-3-319-58347-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58346-4
Online ISBN: 978-3-319-58347-1
eBook Packages: Computer Science, Computer Science (R0)