DOI: 10.1145/3394486.3403269

Stable Learning via Differentiated Variable Decorrelation

Published: 20 August 2020

Abstract

As applications of artificial intelligence gradually seep into risk-sensitive areas such as justice, healthcare, and autonomous driving, an upsurge of research interest in model stability and robustness has arisen in the field of machine learning. Rather than purely fitting the observed training data, stable learning tries to learn a model with uniformly good performance under non-stationary and agnostic test data. The key challenge of stable learning in practice is that we have no a priori knowledge of the true model or the test data distribution. Under such conditions, we cannot expect a faithful estimation of model parameters or of their stability over wildly changing environments. Previous methods resort to a reweighting scheme that removes the correlations between all variables through a set of new sample weights. However, we argue that such aggressive decorrelation of all variables may over-reduce the effective sample size, leading to variance inflation and possible underperformance. In this paper, we incorporate unlabeled data from multiple environments into the variable decorrelation framework and propose a Differentiated Variable Decorrelation (DVD) algorithm based on the clustering of variables. Specifically, variables are clustered according to the stability of their correlations, and the variable decorrelation module learns a set of sample weights that removes the correlations only between variables of different clusters. Empirical studies on both synthetic and real-world datasets clearly demonstrate the efficacy of our DVD algorithm in improving model parameter estimation and prediction stability over changing distributions.
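
The abstract describes a two-stage pipeline: cluster variables by how stable their mutual correlations are across environments, then reweight samples so that only cross-cluster correlations are removed. The sketch below is a minimal illustration of that idea, not the authors' implementation: the use of scikit-learn's KMeans for the clustering step, the softmax parameterization of the sample weights, the plain gradient-descent loop, and all function names are assumptions made for the example.

    import numpy as np
    from sklearn.cluster import KMeans


    def correlation_instability(env_data):
        """For each variable, profile how much its correlation with every
        other variable varies across environments (returns a p x p array)."""
        corrs = np.stack([np.corrcoef(X, rowvar=False) for X in env_data])  # (E, p, p)
        return corrs.std(axis=0)


    def cluster_variables(env_data, n_clusters=2, seed=0):
        """Group variables whose correlations with the rest are similarly (un)stable."""
        profiles = correlation_instability(env_data)
        return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(profiles)


    def learn_sample_weights(X, labels, n_iter=500, lr=0.05):
        """Gradient descent on the sum of squared *weighted* covariances between
        variables in different clusters; within-cluster correlations are left
        untouched, which is the 'differentiated' part of the idea."""
        n, p = X.shape
        cross = (labels[:, None] != labels[None, :]).astype(float)  # 1 for cross-cluster pairs
        Xc = X - X.mean(axis=0)
        v = np.zeros(n)  # weights parameterized as w = n * softmax(v): positive, summing to n
        for _ in range(n_iter):
            s = np.exp(v - v.max())
            w = n * s / s.sum()
            cov = (Xc * w[:, None]).T @ Xc / n              # weighted covariance matrix
            grad_cov = 2.0 * cov * cross                    # penalize only cross-cluster terms
            grad_w = np.einsum('ij,ki,kj->k', grad_cov, Xc, Xc) / n
            grad_v = w * (grad_w - (w * grad_w).sum() / n)  # chain rule through the softmax
            v -= lr * grad_v
        s = np.exp(v - v.max())
        return n * s / s.sum()

With the learned weights, a standard weighted estimator (for example, weighted least squares on the labeled training data) completes the pipeline; the point argued in the abstract is that decorrelating only across clusters avoids the collapse in effective sample size that full decorrelation can cause.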



Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 August 2020

Author Tags

  1. non-stationary environment
  2. sample reweighting
  3. stable learning
  4. variable decorrelation

Qualifiers

  • Research-article

Conference

KDD '20

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 52
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 01 Mar 2025

Cited By

  • (2024) Emergence and Causality in Complex Systems: A Survey of Causal Emergence and Related Quantitative Studies. Entropy, 26(2): 108. DOI: 10.3390/e26020108. Online publication date: 24-Jan-2024.
  • (2024) C²DR: Robust Cross-Domain Recommendation based on Causal Disentanglement. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 341-349. DOI: 10.1145/3616855.3635809. Online publication date: 4-Mar-2024.
  • (2024) Debiased Graph Neural Networks With Agnostic Label Selection Bias. IEEE Transactions on Neural Networks and Learning Systems, 1-12. DOI: 10.1109/TNNLS.2022.3141260. Online publication date: 2024.
  • (2024) Revisiting Attack-Caused Structural Distribution Shift in Graph Anomaly Detection. IEEE Transactions on Knowledge and Data Engineering, 36(9): 4849-4861. DOI: 10.1109/TKDE.2024.3380709. Online publication date: Sep-2024.
  • (2024) Out-of-Distribution Generalization With Causal Feature Separation. IEEE Transactions on Knowledge and Data Engineering, 36(4): 1758-1772. DOI: 10.1109/TKDE.2023.3312255. Online publication date: Apr-2024.
  • (2024) Toward Class-Agnostic Tracking Using Feature Decorrelation in Point Clouds. IEEE Transactions on Image Processing, 33: 682-695. DOI: 10.1109/TIP.2023.3348635. Online publication date: 2024.
  • (2024) Stable Learning via Triplex Learning. IEEE Transactions on Artificial Intelligence, 5(10): 5267-5276. DOI: 10.1109/TAI.2024.3404411. Online publication date: Oct-2024.
  • (2024) A Speaker Recognition Method Based on Stable Learning. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 10221-10225. DOI: 10.1109/ICASSP48485.2024.10446329. Online publication date: 14-Apr-2024.
  • (2024) CRTRE: Causal Rule Generation with Target Trial Emulation Framework. 2024 IEEE International Conference on Big Data (BigData), 144-158. DOI: 10.1109/BigData62323.2024.10825951. Online publication date: 15-Dec-2024.
  • (2024) Semi-supervised incremental domain generalization learning based on causal invariance. International Journal of Machine Learning and Cybernetics, 15(10): 4815-4828. DOI: 10.1007/s13042-024-02199-z. Online publication date: 24-May-2024.
