Simulation-Based Inference for Exoplanet Atmospheric Retrieval: Insights from Winning the Ariel Data Challenge 2023 Using Normalizing Flows

  • Conference paper
Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023)

Abstract

Advancements in space telescopes have opened new avenues for gathering vast amounts of data on exoplanet atmosphere spectra. However, accurately extracting chemical and physical properties from these spectra poses significant challenges due to the non-linear nature of the underlying physics.

This paper presents novel machine learning models developed by the AstroAI team (hosted by the Center for Astrophysics | Harvard & Smithsonian; https://astroai.cfa.harvard.edu/) for the Ariel Data Challenge 2023 (https://www.ariel-datachallenge.space/), where one of the models took first place among 293 competitors. Leveraging Normalizing Flows, our models predict the posterior probability distribution of atmospheric parameters under different atmospheric assumptions.

Moreover, we introduce an alternative model that exhibits higher performance potential than the winning model, despite scoring lower in the challenge. These findings highlight the need to reevaluate the evaluation metric and prompt further exploration of more efficient and accurate approaches for exoplanet atmosphere spectra analysis.

Finally, we present recommendations to enhance the challenge and models, providing valuable insights for future applications on real observational data. These advancements pave the way for more effective and timely analysis of exoplanet atmospheric properties, advancing our understanding of these distant worlds.
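The abstract describes using Normalizing Flows to predict a posterior distribution over atmospheric parameters given a spectrum. As a rough illustration of that idea only, the sketch below builds a one-dimensional conditional affine flow in plain Python: a standard normal base variable is shifted and scaled by quantities that depend on the observation, and the density follows from the change-of-variables formula. The `conditioner` function and its coefficients are hypothetical stand-ins for a trained neural network, and a single affine transform is far simpler than the models entered in the challenge; none of this is taken from the authors' code.

```python
import math
import random


def conditioner(x):
    """Hypothetical conditioner: maps an observation x to a shift and log-scale.

    In practice this would be a neural network trained on simulated
    (parameter, spectrum) pairs; the coefficients here are made up.
    """
    mu = 2.0 * x + 1.0
    log_sigma = -0.5 + 0.1 * x
    return mu, log_sigma


def log_prob(theta, x):
    """Log posterior density log q(theta | x) under the affine flow."""
    mu, log_sigma = conditioner(x)
    # Inverse transform: map the parameter back to the base variable z.
    z = (theta - mu) * math.exp(-log_sigma)
    # Log density of the standard normal base distribution at z.
    log_base = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Change of variables: add log|dz/dtheta| = -log_sigma.
    return log_base - log_sigma


def sample(x, n, rng):
    """Draw n posterior samples by pushing base samples through the flow."""
    mu, log_sigma = conditioner(x)
    sigma = math.exp(log_sigma)
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x_obs = 0.3  # a made-up scalar "observation"
    draws = sample(x_obs, 5, rng)
    for theta in draws:
        print(f"theta={theta:.3f}  log q(theta|x)={log_prob(theta, x_obs):.3f}")
```

Training such a model typically amounts to maximizing `log_prob` over simulated pairs, after which sampling a posterior for a new observation needs only a forward pass. That is the property that makes flow-based retrieval fast compared with running a sampler per spectrum.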

Notes

  1. https://webb.nasa.gov/

  2. https://arielmission.space/

  3. https://github.com/AstroAI-CfA/Ariel_Data_Challenge_2023_solution

Acknowledgments

This team was put together, led and supervised by AstroAI at the Center for Astrophysics | Harvard & Smithsonian. Mayeul Aubin and Cecilia Garraffo were partially supported by the Director’s Office at the Center for Astrophysics | Harvard & Smithsonian. AstroAI thanks the Harvard Data Science Initiative for their support. Jeremy J. Drake was supported by NASA contract NAS8-03060 to the Chandra X-ray Center during the course of this research. Iouli Gordon, Robert Hargreaves, and Vladimir Makhnev were supported by NASA PDART grant 80NSSC20K1059 throughout this work.

We thank the NSF AI Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) for providing computational resources through the Faculty of Arts and Science Research Cluster (FASRC) of Harvard.

Author information

Corresponding author

Correspondence to Mayeul Aubin.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Aubin, M. et al. (2025). Simulation-Based Inference for Exoplanet Atmospheric Retrieval: Insights from Winning the Ariel Data Challenge 2023 Using Normalizing Flows. In: Meo, R., Silvestri, F. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2023. Communications in Computer and Information Science, vol 2137. Springer, Cham. https://doi.org/10.1007/978-3-031-74643-7_10

  • DOI: https://doi.org/10.1007/978-3-031-74643-7_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-74642-0

  • Online ISBN: 978-3-031-74643-7

  • eBook Packages: Artificial Intelligence (R0)
