Press ECCS to Doubt (Your Causal Graph)

Authors:

Markos Markakis,

Michael Cafarella

GUIDE-AI '24: Proceedings of the Conference on Governance, Understanding and Integration of Data for Effective and Responsible AI

Pages 6 - 15

https://doi.org/10.1145/3665601.3669842

Published: 09 June 2024

Abstract

Techniques from the theory of causality have seen extensive use in the natural and social sciences, since they allow scientists to explicitly model assumptions and draw quantitative causal conclusions. More recently, causality has also gathered interest in many computer science sub-fields, including machine learning and systems. A causal model is usually represented as a causal graph, often automatically discovered from available data. When running a full constraint-based causal discovery algorithm and correctly orienting all edges is computationally intractable, automatically generated causal graphs are prone to error and call for expensive manual graph verification. Understanding which parts of a causal graph have the largest impact on downstream results is essential for expediting this graph verification process.

In this work, we present ECCS, a framework for Exposing Critical Causal Structures within a causal graph, with respect to a given Average Treatment Effect (ATE) calculation. We formalize the Interactive Causal Graph Verification problem, in which user judgments about edges in the causal graph are solicited sequentially, with the goal of minimizing the absolute error in the ATE of interest (without advance access to its ground-truth value). We present three algorithms to solve this problem. Based on a preliminary evaluation, our best-performing algorithm, AdjSetEdit, can solicit a sequence of 10 user judgments that outperforms a randomized sequence of the same length by more than 60%, with time complexity linear in the number of data points and polynomial in the number of variables.
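
To illustrate why a single edge in a causal graph can matter so much for an ATE calculation, the following is a minimal sketch, not the ECCS or AdjSetEdit algorithm. It assumes a linear data-generating process with one confounder, and the names `simulate`, `estimate_ate`, `Z`, `T` and `Y` are hypothetical, chosen only for this example. The ATE is estimated by ordinary least squares, adjusting for whatever adjustment set the assumed graph implies.

```python
import numpy as np


def simulate(n=20_000, seed=0):
    """Draw synthetic data from an assumed linear model: Z -> T, Z -> Y, T -> Y.

    The true effect of T on Y is 2.0 by construction (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # confounder
    t = 1.5 * z + rng.normal(size=n)             # treatment depends on Z
    y = 2.0 * t + 3.0 * z + rng.normal(size=n)   # outcome depends on T and Z
    return {"Z": z, "T": t, "Y": y}


def estimate_ate(data, treatment, outcome, adjustment_set):
    """Estimate the ATE as the OLS coefficient on the treatment,

    controlling for the variables in adjustment_set (linear-model assumption).
    """
    columns = [treatment, *sorted(adjustment_set)]
    n = len(data[outcome])
    design = np.column_stack([np.ones(n)] + [data[c] for c in columns])
    coef, *_ = np.linalg.lstsq(design, data[outcome], rcond=None)
    return coef[1]  # index 0 is the intercept, index 1 is the treatment


data = simulate()

# Graph containing the edge Z -> T: Z is a confounder, so the adjustment set is {Z}.
ate_correct_graph = estimate_ate(data, "T", "Y", {"Z"})

# Graph missing the edge Z -> T: no backdoor path is visible, so nothing is adjusted for.
ate_wrong_graph = estimate_ate(data, "T", "Y", set())

print(f"ATE with Z -> T in the graph:    {ate_correct_graph:.3f}")
print(f"ATE without Z -> T in the graph: {ate_wrong_graph:.3f}")
```

In this toy setting, the estimate under the graph that omits the confounding edge is biased away from the true effect, while the estimate under the correct graph is close to it. An edge whose presence or absence changes the adjustment set in this way is the kind of structure for which a user judgment is most valuable.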

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Causal reasoning and diagnostics

Published In

GUIDE-AI '24: Proceedings of the Conference on Governance, Understanding and Integration of Data for Effective and Responsible AI

June 2024

67 pages

ISBN:9798400706943

DOI:10.1145/3665601

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 June 2024

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Defense Advanced Research Projects Agency (DARPA)

Conference

SIGMOD/PODS '24

Sponsor:

SIGMOD

SIGMOD/PODS '24: International Conference on Management of Data

June 9 - 15, 2024

Santiago, Chile
