The Effects of Under and Over Sampling in Exoplanet Transit Identification with Low Signal-to-Noise Ratio Data

  • Conference paper
Intelligent Systems (BRACIS 2022)

Abstract

This paper presents the results of experiments with undersampling and oversampling applied to machine learning classifiers used to identify exoplanet transits in low signal-to-noise ratio (SNR) data. We start with an overview of the most popular method for exoplanet detection, followed by an analysis of the Kepler Object of Interest (KOI) data set, an overview of the state-of-the-art machine learning models applied to this problem, and a discussion of how difficult it is to correctly identify exoplanets in low SNR data. We then briefly discuss our signal-to-noise ratio reduction procedure, used to generate the low SNR data for our experiments. Finally, we use our low SNR data set to train and evaluate several models in scenarios with no sampling strategy, with oversampling, and with undersampling, using repeated holdout validation. Results show that current classifiers can identify transits in low SNR data sets, with accuracy varying between 69% and 81%, and that sampling strategies can affect simpler classifiers, making them less conservative, but show no significant effect on more complex classifiers.
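The sampling strategies and the validation scheme named in the abstract can be illustrated with a minimal sketch. It assumes plain random undersampling and random oversampling; the abstract does not say which variants were used. All function names (`random_undersample`, `random_oversample`, `repeated_holdout`) and the toy data are hypothetical and written only for illustration. The sketch resamples only the training portion of each split, which is a common precaution and not a detail confirmed by the source.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Map each label to the list of feature rows carrying that label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(X, y, rng):
    """Randomly drop rows until every class has as many rows as the smallest class."""
    groups = _group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    pairs = [(row, label)
             for label, rows in groups.items()
             for row in rng.sample(rows, target)]
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs]


def random_oversample(X, y, rng):
    """Randomly duplicate rows until every class has as many rows as the largest class."""
    groups = _group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    pairs = []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        pairs.extend((row, label) for row in rows + extra)
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs]


def repeated_holdout(n_samples, test_fraction, repeats, seed):
    """Yield (train_indices, test_indices) for independent random splits."""
    rng = random.Random(seed)
    n_test = int(round(n_samples * test_fraction))
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


# Toy imbalanced data (hypothetical): 10 positives, 90 negatives.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
y = [1] * 10 + [0] * 90

for train_idx, test_idx in repeated_holdout(len(X), test_fraction=0.2, repeats=3, seed=42):
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    # Resample the training portion only; the held-out part keeps its original balance.
    _, y_under = random_undersample(X_train, y_train, rng)
    _, y_over = random_oversample(X_train, y_train, rng)
    print(Counter(y_train), Counter(y_under), Counter(y_over))
```

A classifier would be fitted on each resampled training set and scored on the untouched held-out rows, and the scores would then be aggregated over the repeats.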

Notes

  1. MES (Multiple Event Statistic) is a significance metric derived, among other things, from the transit SNR, so that the greater a transit’s SNR, the greater its MES [15].

  2. Kepler’s KOI catalog has three possible classes for transit events. “CONFIRMED” is a transit that was identified as a planet candidate (PC) and later confirmed by another method; “FALSE POSITIVES” are transits confirmed to fall into one of the false positive categories from Sect. 1; and “CANDIDATE” are transits identified as PC by the pipeline but not yet confirmed by another method.

  3. The test was executed through the f_oneway function of the SciPy package [33].

  4. Calculated with the pairwise_tukeyhsd function of the statsmodels package [31], leading to AstroNet \(\times\) SIDRA \(t=-0.0561, p\ll 0.001\); AstroNet \(\times\) ExoplanetSVM \(t=-0.0908, p\ll 0.001\); SIDRA \(\times\) ExoplanetSVM \(t=-0.0347, p\ll 0.001\).
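The two functions named in notes 3 and 4 can be combined as in the following minimal sketch. The accuracy values are synthetic placeholders drawn from a random generator, not results from the paper. The group names (`model_a`, `model_b`, `model_c`), the sample size of 30 scores per group, and the 0.05 significance level are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Synthetic placeholder accuracies (NOT the paper's data): one array per model,
# standing in for the scores collected over repeated holdout runs.
rng = np.random.default_rng(0)
scores = {
    "model_a": rng.normal(0.70, 0.01, size=30),
    "model_b": rng.normal(0.75, 0.01, size=30),
    "model_c": rng.normal(0.80, 0.01, size=30),
}

# One-way ANOVA: tests whether at least one group mean differs from the others.
f_stat, p_value = f_oneway(*scores.values())
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")

# Tukey HSD post-hoc test: compares every pair of groups.
# It expects one flat array of values and a parallel array of group labels.
values = np.concatenate(list(scores.values()))
labels = np.repeat(list(scores.keys()), [len(v) for v in scores.values()])
tukey = pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05)
print(tukey.summary())
```

The summary table lists, for each pair of groups, the mean difference, an adjusted p-value, and whether the null hypothesis of equal means is rejected at the chosen level.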

References

  1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)

  2. Alshehhi, R., Rodenbeck, K., Gizon, L., Sreenivasan, K.R.: Detection of exomoons in simulated light curves with a regularized convolutional neural network. Astron. Astrophys. 640, A41 (2020). https://doi.org/10.1051/0004-6361/201937059

  3. Amin, R.A., et al.: Detection of exoplanet systems in kepler light curves using adaptive neuro-fuzzy system. In: 2018 International Conference on Intelligent Systems (IS), pp. 66–72. IEEE (2018)

  4. Ansdell, M., et al.: Scientific domain knowledge improves exoplanet transit classification with deep learning. Astrophys. J. 869(1), L7 (2018). https://doi.org/10.3847/2041-8213/aaf23b

  5. Armstrong, D.J., Gamper, J., Damoulas, T.: Exoplanet validation with machine learning: 50 new validated kepler planets (2020)

  6. Armstrong, D.J.: Automatic vetting of planet candidates from ground-based surveys: machine learning with NGTS. Monthly Not. Roy. Astron. Soc. 478(3), 4225–4237 (2018)

  7. Armstrong, D.J., Pollacco, D., Santerne, A.: Transit shapes and self organising maps as a tool for ranking planetary candidates: application to kepler and k2. Monthly Not. Roy. Astron. Soc., stw2881 (2016)

  8. Assembly, I.G.: Resolutions b5 and b6 on the definition of a planet in the solar system and pluto (2014)

  9. Battley, M.P., Pollacco, D., Armstrong, D.J.: A search for young exoplanets in sectors 1–5 of the tess full-frame images. Monthly Not. Roy. Astron. Soc. 496(2), 1197–1216 (2020)

  10. Boss, A.P., et al.: Working group on extrasolar planets. Proc. Int. Astron. Union 1(T26A), 183–186 (2005)

  11. Bugueno, M., Mena, F., Araya, M.: Refining exoplanet detection using supervised learning and feature engineering. In: 2018 XLIV Latin American Computer Conference (CLEI), pp. 278–287. IEEE (2018)

  12. Caceres, G.A., et al.: Autoregressive planet search: application to the kepler mission. Astron. J. 158(2), 58 (2019)

  13. Chaushev, A., et al.: Classifying exoplanet candidates with convolutional neural networks: application to the next generation transit survey. Monthly Not. Roy. Astron. Soc. 488(4), 5232–5250 (2019)

  14. Chintarungruangchai, P., Jiang, G.: Detecting exoplanet transits through machine-learning techniques with convolutional neural networks. Publ. Astron. Soc. Pac. 131(1000), 064502 (2019)

  15. Coughlin, J.L., et al.: Planetary candidates observed by kepler. vii. the first fully uniform catalog based on the entire 48-month data set (q1–q17 dr24). Astrophys. J. Suppl. Ser. 224(1), 12 (2016)

  16. Dattilo, A., et al.: Identifying exoplanets with deep learning. ii. two new super-earths uncovered by a neural network in k2 data. Astron. J. 157(5), 169 (2019)

  17. Grziwa, S., Pätzold, M.: Wavelet-based filter methods to detect small transiting planets in stellar light curves. arXiv preprint arXiv:1607.08417 (2016)

  18. Han, J., Pei, J., Kamber, M.: Data Mining: Concepts and Techniques. Elsevier, Amsterdam (2011)

  19. Hinners, T.A., Tat, K., Thorp, R.: Machine learning techniques for stellar light curve classification. Astron. J. 156(1), 7 (2018)

  20. Hippke, M., Heller, R.: Optimized transit detection algorithm to search for periodic transits of small planets. Astron. Astrophys. 623, A39 (2019)

  21. Jara-Maldonado, M., Alarcon-Aquino, V., Rosas-Romero, R., Starostenko, O., Ramirez-Cortes, J.M.: Transiting exoplanet discovery using machine learning techniques: a survey (2020)

  22. Jenkins, J.M., et al.: Overview of the kepler science processing pipeline. Astrophys. J. Lett. 713(2), L87 (2010)

  23. Jenkins, J.M., et al.: Auto-vetting transiting planet candidates identified by the kepler pipeline. Proc. Int. Astron. Union 8(S293), 94–99 (2012)

  24. Lemaître, G., Nogueira, F., Aridas, C.K.: Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(1), 559–563 (2017)

  25. McCauliff, S.D., et al.: Automatic classification of kepler planetary transit candidates. Astrophys. J. 806(1), 6 (2015)

  26. Mislis, D., Bachelet, E., Alsubai, K., Bramich, D., Parley, N.: Sidra: a blind algorithm for signal detection in photometric surveys. Monthly Not. Roy. Astron. Soc. 455(1), 626–633 (2016)

  27. Osborn, H.P., et al.: Rapid classification of tess planet candidates with convolutional neural networks. Astron. Astrophys. 633, A53 (2020)

  28. Pearson, K.A., Palafox, L., Griffith, C.A.: Searching for exoplanets using artificial intelligence. Monthly Not. Roy. Astron. Soc. 474(1), 478–491 (2018)

  29. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  30. Schanche, N., et al.: Machine-learning approaches to exoplanet transit detection and candidate validation in wide-field ground-based surveys. Monthly Not. Roy. Astron. Soc. 483(4), 5534–5547 (2019)

  31. Seabold, S., Perktold, J.: Statsmodels: econometric and statistical modeling with python. In: Proceedings of the 9th Python in Science Conference, vol. 57, p. 61. Austin, TX (2010)

  32. Shallue, C.J., Vanderburg, A.: Identifying exoplanets with deep learning: a five-planet resonant chain around kepler-80 and an eighth planet around kepler-90. Astron. J. 155(2), 94 (2018)

  33. Virtanen, P., et al.: Scipy 1.0: fundamental algorithms for scientific computing in python. Nat. Methods 17(3), 261–272 (2020)

  34. Weiss, L.M., Petigura, E.A.: The kepler peas in a pod pattern is astrophysical. Astrophys. J. Lett. 893(1), L1 (2020)

  35. Armstrong, D.J., Gamper, J., Damoulas, T.: Exoplanet validation with machine learning: 50 new validated kepler planets (2020)

  36. Yu, L., et al.: Identifying exoplanets with deep learning. iii. automated triage and vetting of tess candidates. Astron. J. 158(1), 25 (2019)

  37. Zucker, S., Giryes, R.: Shallow transits-deep learning. i. feasibility study of deep learning to detect periodic transits of exoplanets. Astron. J. 155(4), 147 (2018)

Author information

Corresponding author

Correspondence to Fernando Correia Braga.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Braga, F.C., Roman, N.T., Falceta-Gonçalves, D. (2022). The Effects of Under and Over Sampling in Exoplanet Transit Identification with Low Signal-to-Noise Ratio Data. In: Xavier-Junior, J.C., Rios, R.A. (eds) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol 13653. Springer, Cham. https://doi.org/10.1007/978-3-031-21686-2_8

  • DOI: https://doi.org/10.1007/978-3-031-21686-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21685-5

  • Online ISBN: 978-3-031-21686-2

  • eBook Packages: Computer Science, Computer Science (R0)
