Causal Feature Selection in the Presence of Sample Selection Bias

Published: 11 August 2023

Abstract

Almost all existing causal feature selection methods were proposed without considering sample selection bias. In practice, however, the data-gathering process cannot be fully controlled, so sample selection bias often occurs. It induces spurious correlations between features and the class variable, which seriously degrade the performance of existing methods. In this article, we study causal feature selection under sample selection bias and propose a novel Progressive Causal Feature Selection (PCFS) algorithm with three phases. First, PCFS learns sample weights that balance the treated-group and control-group distributions corresponding to each feature, in order to remove spurious correlations. Second, based on the sample weights, PCFS uses a weighted cross-entropy model to estimate the causal effect of each feature and removes some irrelevant features from the confounder set. Third, PCFS progressively repeats the first two phases to remove more irrelevant features and finally obtains a causal feature set. Experiments on synthetic and real-world datasets validate the effectiveness of PCFS in comparison with several state-of-the-art classical and causal feature selection methods.
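
The three phases can be illustrated with a small sketch. It is not the PCFS implementation, only a toy under stated assumptions: features and the class are binary; stratified inverse-propensity weights are swapped in for the learned sample weights of the first phase; the second phase fits a one-feature logistic model by gradient descent on the weighted cross-entropy; and the third phase drops features whose estimated effect falls below a hypothetical cutoff `threshold`. All function names and the toy table are illustrative and do not come from the article.

```python
import math
from collections import defaultdict

def balancing_weights(X, j, confounders):
    """Phase 1 stand-in: weights under which treated (X[i][j] == 1) and control
    (X[i][j] == 0) rows share the same distribution over the confounders."""
    counts = defaultdict(lambda: [0, 0])      # stratum -> [n_control, n_treated]
    keys = [tuple(row[k] for k in confounders) for row in X]
    for key, row in zip(keys, X):
        counts[key][row[j]] += 1
    weights = []
    for key, row in zip(keys, X):
        n_control, n_treated = counts[key]
        if n_control == 0 or n_treated == 0:
            weights.append(0.0)               # no overlap: stratum cannot be balanced
        else:
            weights.append((n_control + n_treated) / counts[key][row[j]])
    return weights

def weighted_effect(t, y, w, steps=2000, lr=1.0):
    """Phase 2 stand-in: fit p = sigmoid(b + a * t) by gradient descent on the
    weighted cross-entropy and return the slope a as the effect estimate."""
    total = sum(w)
    if total == 0:
        return 0.0
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for ti, yi, wi in zip(t, y, w):
            err = wi * (1.0 / (1.0 + math.exp(-(b + a * ti))) - yi)
            grad_b += err
            grad_a += err * ti
        a -= lr * grad_a / total
        b -= lr * grad_b / total
    return a

def progressive_selection(X, y, threshold=0.1):
    """Phase 3 stand-in: repeat phases 1 and 2 on the surviving features,
    dropping those with a weak estimated effect, until nothing changes."""
    active = list(range(len(X[0])))
    while True:
        effects = {}
        for j in active:
            confounders = [k for k in active if k != j]
            w = balancing_weights(X, j, confounders)
            effects[j] = weighted_effect([row[j] for row in X], y, w)
        kept = [j for j in active if abs(effects[j]) >= threshold]
        if kept == active or not kept:
            return kept, effects              # effects are from the final round
        active = kept

# Toy table (illustrative): y depends only on feature 0, while feature 1 is
# correlated with feature 0, and hence spuriously with y, in the sample.
cells = {(1, 1, 1): 40, (1, 1, 0): 10, (1, 0, 1): 8, (1, 0, 0): 2,
         (0, 1, 1): 2, (0, 1, 0): 8, (0, 0, 1): 10, (0, 0, 0): 40}
rows = [cell for cell, n in cells.items() for _ in range(n)]
X = [[x0, x1] for x0, x1, _ in rows]
y = [label for _, _, label in rows]
selected, effects = progressive_selection(X, y)
```

On this toy table, feature 1 looks predictive without weights (the class rate is 0.7 when it is 1 and 0.3 when it is 0), but after reweighting within strata of feature 0 its estimated effect vanishes, so the sketch keeps only feature 0. Exact stratification is feasible only for a handful of discrete features, which is one reason a learned weighting scheme is needed in realistic settings.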

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 5
    October 2023
    472 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/3615589
    Editor: Huan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 August 2023
    Online AM: 17 June 2023
    Accepted: 05 June 2023
    Revised: 05 May 2023
    Received: 18 November 2022
    Published in TIST Volume 14, Issue 5

    Author Tags

    1. Causal feature selection
    2. sample selection bias
    3. causal effect

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China
    • National Natural Science Foundation of China
    • University Synergy Innovation Program of Anhui Province
