skip to main content
10.1145/3221269.3221292acmotherconferencesArticle/Chapter ViewAbstractPublication PagesssdbmConference Proceedingsconference-collections
research-article

Order-independent constraint-based causal structure learning for gaussian distribution models using GPUs

Published:09 July 2018Publication History

ABSTRACT

Learning the causal structures in high-dimensional datasets allows deriving advanced insights from observational data, thus creating the potential for new applications. One crucial limitation of state-of-the-art methods for learning causal relationships, such as the PC algorithm, is their long execution time. While, in the worst case, the execution time is exponential to the dimension of a given dataset, it is polynomial if the underlying causal structures are sparse. To address the long execution time, parallelized extensions of the algorithm have been developed addressing the Central Processing Unit (CPU) as the primary execution device. While modern multicore CPUs expose a decent level of parallelism, coprocessors, such as Graphics Processing Units (GPUs), are specifically designed to process thousands of data points in parallel, providing superior parallel processing capabilities compared to CPUs.

In our work, we leverage the parallel processing power of GPUs to address the drawback of the long execution time of the PC algorithm and develop an efficient GPU-accelerated implementation for Gaussian distribution models. Based on an experimental evaluation of various high-dimensional real-world gene expression datasets, we show that our GPU-accelerated implementation outperforms existing CPU-based versions, by factors up to 700.

References

  1. Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-scale Machine Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI'16). USENIX Association, Berkeley, CA, USA, 265--283. http://dl.acm.org/citation.cfm?id=3026877.3026899 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. H. Ahrens. 1975. Dempster, A. P.: Elements of Continuous Multivariate Analysis. Addison-Wesley Publ. Co., Reading, Mass. 1969. XII, 388 S. Biometrische Zeitschrift 17, 7 (1975), 468--468.Google ScholarGoogle Scholar
  3. Steen A. Andersson, David Madigan, and Michael D. Perlman. 1997. A Characterization of Markov Equivalence Classes for Acyclic Digraphs. The Annals of Statistics 25, 2 (1997), 505--541. http://www.jstor.org/stable/2242556Google ScholarGoogle ScholarCross RefCross Ref
  4. Joshua Buckner, Justin Wilson, Mark Seligman, Brian Athey, Stanley Watson, and Fan Meng. 2010. The gputools package enables GPU computing in R. Bioinformatics 26, 1 Jan. 2010), 134--135. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A. Cano, M. Gómez-Olmedo, and S. Moral. 2008. A Score Based Ranking of the Edges for the PC Algorithm. In Proceedings of the Fourth European Workshop on Probabilistic Graphical Models, Manfred Jaeger and Thomas D. Nielsen (Eds.). 41--48.Google ScholarGoogle Scholar
  6. Bryan Catanzaro, Narayanan Sundaram, and Kurt Keutzer. 2008. Fast Support Vector Machine Training and Classification on Graphics Processors. In Proceedings of the 25th International Conference on Machine Learning (ICML '08). ACM, New York, NY, USA, 104--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. 2014. cuDNN: Efficient Primitives for Deep Learning. CoRR abs/1410.0759 (2014). arXiv:1410.0759 http://arxiv.org/abs/1410.0759Google ScholarGoogle Scholar
  8. David Maxwell Chickering. 2002. Learning Equivalence Classes of Bayesian-network Structures. J. Mach. Learn. Res. 2 (March 2002), 445--498. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Diego Colombo and Marloes H. Maathuis. 2014. Order-independent Constraint-based Causal Structure Learning. J. Mach. Learn. Res. 15, 1 (Jan. 2014), 3741--3782. http://dl.acm.org/citation.cfm?id=2627435.2750365 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Diego Colombo, Marloes H. Maathuis, Markus Kalisch, and Thomas S. Richardson. 2011. Learning High-dimensional DAGs with Latent and Selection Variables. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI'11). AUAI Press, Arlington, Virginia, United States, 850--850. http://dl.acm.org/citation.cfm?id=3020548.3020648 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Leonardo Dagum and Ramesh Menon. 1998. OpenMP: An Industry-Standard API for Shared-Memory Programming. IEEE Comput. Sci. Eng. 5, 1 (Jan. 1998), 46--55. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. P. Dawid. 1979. Conditional Independence in Statistical Theory. Journal of the Royal Statistical Society. Series B (Methodological) 41, 1 (1979), 1--31. http://www.jstor.org/stable/2984718Google ScholarGoogle ScholarCross RefCross Ref
  13. Dirk Eddelbuettel and Conrad Sanderson. 2014. RcppArmadillo: Accelerating R with High-performance C++ Linear Algebra. Comput. Stat. Data Anal. 71 (March 2014), 1054--1063.Google ScholarGoogle Scholar
  14. Marloes H Maathuis, Diego Colombo, Markus Kalisch, and Peter Bühlmann. 2010. Predicting causal effects in large-scale systems from observational data. 7 (04 2010), 247--8.Google ScholarGoogle Scholar
  15. J. Abellán and M. Gómez-Olmedo and S. Moral. 2006. Some Variations on the PC Algorithm. In Proceedings of the Third European Workshop on Probabilistic Graphical Models (PGM' 06). 1--8.Google ScholarGoogle Scholar
  16. Markus Kalisch and Peter Bühlmann. 2007. Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm. J. Mach. Learn. Res. 8 (May 2007), 613--636. http://dl.acm.org/citation.cfm?id=1248659.1248681 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Markus Kalisch, Martin Mächler, Diego Colombo, Marloes H. Maathuis, and Peter Bühlmann. 2012. Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software, Articles 47, 11 (2012), 1--26.Google ScholarGoogle ScholarCross RefCross Ref
  18. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (NIPS'12). Curran Associates Inc., USA, 1097--1105. http://dl.acm.org/citation.cfm?id=2999134.2999257 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Steffen L Lauritzen. 1996. Graphical models. Vol. 17. Clarendon Press.Google ScholarGoogle Scholar
  20. Thuc Le, Tao Hoang, Jiuyong Li, Lin Liu, Huawen Liu, and Shu Hu. 2015. A fast PC algorithm for high dimensional causal discovery with multi-core PCs. IEEE/ACM Transactions on Computational Biology and Bioinformatics (02 2015).Google ScholarGoogle Scholar
  21. Thuc Duy Le, Lin Liu, Junpeng Zhang, Bing Liu, and Jiuyong Li. 2015. From miRNA regulation to miRNA-TF co-regulation: computational approaches and challenges. Briefings in Bioinformatics 16, 3 (2015), 475--496.Google ScholarGoogle ScholarCross RefCross Ref
  22. Joseph Lee Rodgers and W Alan Nicewander. 1988. Thirteen Ways to Look at the Correlation Coefficient. The American Statistician 42, 1 (1988), 59--66.Google ScholarGoogle ScholarCross RefCross Ref
  23. Erich L Lehmann and Joseph P Romano. 2006. Testing statistical hypotheses. Springer Science & Business Media.Google ScholarGoogle Scholar
  24. Erik Lindholm, John Nickolls, Stuart Oberman, and John Montrym. 2008. NVIDIA Tesla: A Unified Graphics and Computing Architecture. IEEE Micro 28, 2 (March 2008), 39--55. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Ren C Luo, M-H Lin, and Ralph S Scherp. 1988. Dynamic multi-sensor data fusion system for intelligent robots. IEEE Journal on Robotics and Automation 4, 4 (Aug 1988), 386--396.Google ScholarGoogle ScholarCross RefCross Ref
  26. Marloes H Maathuis, Markus Kalisch, and Peter Bühlmann. 2009. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics 37, 6A (12 2009), 3133--3164.Google ScholarGoogle ScholarCross RefCross Ref
  27. Anders L. Madsen, Frank Jensen, Antonio Salmerón, Martin Karlsen, Helge Langseth, and Thomas D. Nielsen. 2014. A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs. In Probabilistic Graphical Models, Linda C. van der Gaag and Ad J. Feelders (Eds.). Springer International Publishing, Cham, 302--317.Google ScholarGoogle Scholar
  28. Anders L. Madsen, Frank Jensen, Antonio Salmerón, Helge Langseth, and Thomas D. Nielsen. 2015. Parallelisation of the PC Algorithm. In Proceedings of the 16th Conference of the Spanish Association for Artificial Intelligence on Advances in Artificial Intelligence - Volume 9422. Springer-Verlag New York, Inc., New York, NY, USA, 14--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Anders L. Madsen, Frank Jensen, Antonio Salmerón, Helge Langseth, and Thomas D. Nielsen. 2017. A Parallel Algorithm for Bayesian Network Structure Learning from Large Data Sets. Know.-Based Syst. 117, C (Feb. 2017), 46--55. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Katerina Marazopoulou, Rumi Ghosh, Prasanth Lade, and David Jensen. 2016. Causal Discovery for Manufacturing Domains. CoRR abs/1605.04056 (2016). arXiv:1605.04056 http://arxiv.org/abs/1605.04056Google ScholarGoogle Scholar
  31. Daniel Marbach, James C. Costello, Robert Küffner, Nicole M. Vega, Robert J. Prill, Diogo M. Camacho, Kyle R. Allison, Manolis Kellis, James J. Collins, Andrej Aderhold, Gustavo Stolovitzky, and et al. 2012. Wisdom of crowds for robust gene network inference. Nature Methods 9, 8 (8 2012), 796--804.Google ScholarGoogle Scholar
  32. Christopher Meek. 1995. Strong Completeness and Faithfulness in Bayesian Networks. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI'95). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 411--418. http://dl.acm.org/citation.cfm?id=2074158.2074205 Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Nicolai Meinshausen and Peter Bühlmann. 2010. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 4 (2010), 417--473.Google ScholarGoogle ScholarCross RefCross Ref
  34. John Nickolls, Ian Buck, Michael Garland, and Kevin Skadron. 2008. Scalable Parallel Programming with CUDA., Article 16 (2008), 14 pages.Google ScholarGoogle Scholar
  35. NVIDIA Corporation. 2017. CUDA Math API. Retrieved March 10, 2018 from http://docs.nvidia.com/cuda/pdf/CUDA_Math_API.pdfGoogle ScholarGoogle Scholar
  36. NVIDIA Corporation. 2017. NVIDIA TESLA V100 GPU ACCELERATOR. Retrieved February 12, 2018 from http://www.nvidia.com/content/PDF/Volta-Datasheet.pdfGoogle ScholarGoogle Scholar
  37. NVIDIA Corporation. 2018. Profiler User's Guide. Retrieved March 10, 2018 from http://docs.nvidia.com/cuda/pdf/CUDA_Profiler_Users_Guide.pdfGoogle ScholarGoogle Scholar
  38. Judea Pearl. 2009. Causality: Models, Reasoning and Inference (2nd ed.). Cambridge University Press, New York, NY, USA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Judea Pearl and Thomas Verma. 1991. A Theory of Inferred Causation. In Proceedings of the Second International Conference on Principles of Knowledge Representation and Reasoning (KR'91). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 441--452. http://dl.acm.org/citation.cfm?id=3087158.3087202 Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Roger Penrose. 1955. A generalized inverse for matrices. In Mathematical Proceedings of the Cambridge Philosophical Society, Vol. 51. Cambridge University Press, Cambridge University Press, 406--413.Google ScholarGoogle ScholarCross RefCross Ref
  41. R Core Team. 2017. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/Google ScholarGoogle Scholar
  42. Andrea Rau, Florence Jaffrézic, and Grégory Nuel. 2013. Joint estimation of causal effects from observational and intervention gene expression data. BMC systems biology 7, 1 (Oct 2013), 111.Google ScholarGoogle Scholar
  43. Thomas Richardson. 1996. A Discovery Algorithm for Directed Cyclic Graphs. In Proceedings of the Twelfth International Conference on Uncertainty in Artificial Intelligence (UAI'96). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 454--461. http://dl.acm.org/citation.cfm?id=2074284.2074338 Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Conrad Sanderson and Ryan Curtin. 2016. Armadillo: a template-based C library for linear algebra. Journal of Open Source Software 1, 2 (Oct 2016), 26. https://doi.org/Google ScholarGoogle ScholarCross RefCross Ref
  45. Christian Schwarz, Christopher Schmidt, Michael Hopstock, Werner Sinzig, and Hasso Plattner. 2016. Efficient Calculation and Simulation of Product Cost Leveraging In-Memory Technology and Coprocessors. In The Sixth International Conference on Business Intelligence and Technology (BUSTECH 2016).Google ScholarGoogle Scholar
  46. Marco Scutari. 2010. Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, Articles 35, 3 (2010), 1--22.Google ScholarGoogle ScholarCross RefCross Ref
  47. Peter Spirtes. 2010. Introduction to Causal Inference. J. Mach. Learn. Res. 11 (Aug. 2010), 1643--1662. http://dl.acm.org/citation.cfm?id=1756006.1859905 Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Peter Spirtes, Clark N Glymour, and Richard Scheines. 2000. Causation, Prediction, and Search. MIT press.Google ScholarGoogle Scholar
  49. Peter Spirtes and Kun Zhang. 2016. Causal discovery and inference: concepts and recent methodological advances. Applied Informatics 3, 1 (18 Feb 2016), 3.Google ScholarGoogle Scholar
  50. Michail Tsagris, Giorgos Borboudakis, Vincenzo Lagani, and Ioannis Tsamardinos. 2018. Constraint-based causal discovery with mixed data. International Journal of Data Science and Analytics (02 Feb 2018).Google ScholarGoogle Scholar
  51. Joe Whittaker. 2009. Graphical Models in Applied Multivariate Statistics. Wiley Publishing. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Order-independent constraint-based causal structure learning for gaussian distribution models using GPUs

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Other conferences
          SSDBM '18: Proceedings of the 30th International Conference on Scientific and Statistical Database Management
          July 2018
          314 pages
          ISBN:9781450365055
          DOI:10.1145/3221269

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 9 July 2018

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          SSDBM '18 Paper Acceptance Rate30of75submissions,40%Overall Acceptance Rate56of146submissions,38%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader