The Effects of Under and Over Sampling in Exoplanet Transit Identification with Low Signal-to-Noise Ratio Data

Braga, Fernando Correia; Roman, Norton Trevisan; Falceta-Gonçalves, Diego

doi:10.1007/978-3-031-21686-2_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13653)

Included in the following conference series:

Brazilian Conference on Intelligent Systems

Abstract

This paper presents the results of experiments with undersampling and oversampling applied to machine learning classifiers used to identify exoplanet transits in low signal-to-noise ratio (SNR) data. We start with an overview of the most popular method for exoplanet detection, followed by an analysis of the Kepler Object of Interest (KOI) data set, an overview of the state-of-the-art machine learning models applied to this problem, and a discussion of how difficult it is to correctly identify exoplanets in low SNR data. We then briefly discuss our signal-to-noise ratio reduction procedure, used to generate the low SNR data for our experiments. Finally, we use our low SNR data set to train and evaluate some models in scenarios with no sampling strategy, with oversampling, and with undersampling, using repeated holdout validation. Results show that current classifiers can identify transits in low SNR data sets, with accuracy varying between 69% and 81%, and that sampling strategies can affect simpler classifiers, making them less conservative, but do not show significant effects on more complex classifiers.
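The evaluation scheme named in the abstract (no sampling, oversampling, and undersampling under repeated holdout) can be illustrated with a minimal sketch. This is not the authors' code: it uses plain random resampling from the Python standard library, and the helper names `holdout_split` and `resample`, the 20% test fraction, and the toy label vector are all assumptions made for illustration. The one design point it shows is that resampling is applied to the training split only, so the test split keeps the original class balance.

```python
import random
from collections import Counter


def holdout_split(labels, test_fraction, rng):
    """Shuffle the sample indices and split them into train and test lists."""
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))
    return indices[n_test:], indices[:n_test]


def resample(train_idx, labels, strategy, rng):
    """Balance the training indices by random over- or undersampling."""
    if strategy not in ("none", "over", "under"):
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "none":
        return list(train_idx)
    # Group the training indices by class label.
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)
    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)
    balanced = []
    for members in by_class.values():
        if strategy == "over":
            # Keep every sample, then draw extra copies with replacement.
            extra = [rng.choice(members) for _ in range(target - len(members))]
            balanced.extend(members + extra)
        else:
            # Keep a random subset, drawn without replacement.
            balanced.extend(rng.sample(members, target))
    return balanced


# Hypothetical toy labels: 20 transits (1) and 80 non-transits (0).
rng = random.Random(0)
labels = [1] * 20 + [0] * 80
for repeat in range(3):  # repeated holdout: a fresh random split each time
    train_idx, test_idx = holdout_split(labels, 0.2, rng)
    for strategy in ("none", "over", "under"):
        balanced = resample(train_idx, labels, strategy, rng)
        counts = Counter(labels[i] for i in balanced)
        print(repeat, strategy, dict(counts))
```

In a real experiment a classifier would be fitted on the resampled training indices and scored on `test_idx` in each repetition; the imbalanced-learn toolbox listed in the references offers ready-made samplers for the resampling step.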

Notes

1. MES is a significance metric derived, among other things, from the transit SNR, so that the greater a transit’s SNR, the greater its MES [15].
2. Kepler’s KOI catalog has three possible classes for transit events. “CONFIRMED” is a transit that was identified as a PC and later confirmed by another method; “FALSE POSITIVES” are transits confirmed to fall into one of the false positive categories from Sect. 1; and “CANDIDATE” are transits identified as PC by the pipeline but not yet confirmed by another method.
3. The test was executed through the f_oneway function of the SciPy package [33].
4. Calculated with the pairwise_tukeyhsd function of the statsmodels package [31], leading to AstroNet $\times$ SIDRA $t=-0.0561$, $p\ll 0.001$; AstroNet $\times$ ExoplanetSVM $t=-0.0908$, $p\ll 0.001$; SIDRA $\times$ ExoplanetSVM $t=-0.0347$, $p\ll 0.001$.
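Notes 3 and 4 refer to a one-way ANOVA followed by pairwise comparisons. As a rough illustration of what the first step measures, the sketch below computes the one-way ANOVA F statistic by hand in plain Python. The function name `one_way_anova_f`, the model names, and the accuracy values are hypothetical and are not taken from the paper; SciPy's `f_oneway` reports this same statistic together with its p-value.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical per-repetition accuracies for three classifiers (illustrative only).
accuracies = {
    "model_a": [0.70, 0.72, 0.71, 0.69],
    "model_b": [0.76, 0.75, 0.77, 0.74],
    "model_c": [0.80, 0.81, 0.79, 0.80],
}
f_stat = one_way_anova_f(list(accuracies.values()))
print(f"F = {f_stat:.2f}")
```

A large F value indicates that the group means differ by more than the within-group scatter would explain, which is what motivates a follow-up pairwise test such as Tukey's HSD.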

References

Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)
Alshehhi, R., Rodenbeck, K., Gizon, L., Sreenivasan, K.R.: Detection of exomoons in simulated light curves with a regularized convolutional neural network. Astron. Astrophys. 640, A41 (2020). https://doi.org/10.1051/0004-6361/201937059
Amin, R.A., et al.: Detection of exoplanet systems in Kepler light curves using adaptive neuro-fuzzy system. In: 2018 International Conference on Intelligent Systems (IS), pp. 66–72. IEEE (2018)
Ansdell, M., et al.: Scientific domain knowledge improves exoplanet transit classification with deep learning. Astrophys. J. 869(1), L7 (2018). https://doi.org/10.3847/2041-8213/aaf23b
Armstrong, D.J., Gamper, J., Damoulas, T.: Exoplanet validation with machine learning: 50 new validated Kepler planets (2020)
Armstrong, D.J.: Automatic vetting of planet candidates from ground-based surveys: machine learning with NGTS. Monthly Not. Roy. Astron. Soc. 478(3), 4225–4237 (2018)
Armstrong, D.J., Pollacco, D., Santerne, A.: Transit shapes and self organising maps as a tool for ranking planetary candidates: application to Kepler and K2. Monthly Not. Roy. Astron. Soc., stw2881 (2016)
IAU General Assembly: Resolutions B5 and B6 on the definition of a planet in the Solar System and Pluto (2014)
Battley, M.P., Pollacco, D., Armstrong, D.J.: A search for young exoplanets in sectors 1–5 of the TESS full-frame images. Monthly Not. Roy. Astron. Soc. 496(2), 1197–1216 (2020)
Boss, A.P., et al.: Working group on extrasolar planets. Proc. Int. Astron. Union 1(T26A), 183–186 (2005)
Bugueno, M., Mena, F., Araya, M.: Refining exoplanet detection using supervised learning and feature engineering. In: 2018 XLIV Latin American Computer Conference (CLEI), pp. 278–287. IEEE (2018)
Caceres, G.A., et al.: Autoregressive planet search: application to the Kepler mission. Astron. J. 158(2), 58 (2019)
Chaushev, A., et al.: Classifying exoplanet candidates with convolutional neural networks: application to the Next Generation Transit Survey. Monthly Not. Roy. Astron. Soc. 488(4), 5232–5250 (2019)
Chintarungruangchai, P., Jiang, G.: Detecting exoplanet transits through machine-learning techniques with convolutional neural networks. Publ. Astron. Soc. Pac. 131(1000), 064502 (2019)
Coughlin, J.L., et al.: Planetary candidates observed by Kepler. VII. The first fully uniform catalog based on the entire 48-month data set (Q1–Q17 DR24). Astrophys. J. Suppl. Ser. 224(1), 12 (2016)
Dattilo, A., et al.: Identifying exoplanets with deep learning. II. Two new super-Earths uncovered by a neural network in K2 data. Astron. J. 157(5), 169 (2019)
Grziwa, S., Pätzold, M.: Wavelet-based filter methods to detect small transiting planets in stellar light curves. arXiv preprint arXiv:1607.08417 (2016)
Han, J., Pei, J., Kamber, M.: Data Mining: Concepts and Techniques. Elsevier, Amsterdam (2011)
Hinners, T.A., Tat, K., Thorp, R.: Machine learning techniques for stellar light curve classification. Astron. J. 156(1), 7 (2018)
Hippke, M., Heller, R.: Optimized transit detection algorithm to search for periodic transits of small planets. Astron. Astrophys. 623, A39 (2019)
Jara-Maldonado, M., Alarcon-Aquino, V., Rosas-Romero, R., Starostenko, O., Ramirez-Cortes, J.M.: Transiting exoplanet discovery using machine learning techniques: a survey (2020)
Jenkins, J.M., et al.: Overview of the Kepler science processing pipeline. Astrophys. J. Lett. 713(2), L87 (2010)
Jenkins, J.M., et al.: Auto-vetting transiting planet candidates identified by the Kepler pipeline. Proc. Int. Astron. Union 8(S293), 94–99 (2012)
Lemaître, G., Nogueira, F., Aridas, C.K.: Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(1), 559–563 (2017)
McCauliff, S.D., et al.: Automatic classification of Kepler planetary transit candidates. Astrophys. J. 806(1), 6 (2015)
Mislis, D., Bachelet, E., Alsubai, K., Bramich, D., Parley, N.: SIDRA: a blind algorithm for signal detection in photometric surveys. Monthly Not. Roy. Astron. Soc. 455(1), 626–633 (2016)
Osborn, H.P., et al.: Rapid classification of TESS planet candidates with convolutional neural networks. Astron. Astrophys. 633, A53 (2020)
Pearson, K.A., Palafox, L., Griffith, C.A.: Searching for exoplanets using artificial intelligence. Monthly Not. Roy. Astron. Soc. 474(1), 478–491 (2018)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Schanche, N., et al.: Machine-learning approaches to exoplanet transit detection and candidate validation in wide-field ground-based surveys. Monthly Not. Roy. Astron. Soc. 483(4), 5534–5547 (2019)
Seabold, S., Perktold, J.: Statsmodels: econometric and statistical modeling with Python. In: Proceedings of the 9th Python in Science Conference, vol. 57, p. 61. Austin, TX (2010)
Shallue, C.J., Vanderburg, A.: Identifying exoplanets with deep learning: a five-planet resonant chain around Kepler-80 and an eighth planet around Kepler-90. Astron. J. 155(2), 94 (2018)
Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020)
Weiss, L.M., Petigura, E.A.: The Kepler peas in a pod pattern is astrophysical. Astrophys. J. Lett. 893(1), L1 (2020)
Yu, L., et al.: Identifying exoplanets with deep learning. III. Automated triage and vetting of TESS candidates. Astron. J. 158(1), 25 (2019)
Zucker, S., Giryes, R.: Shallow transits-deep learning. I. Feasibility study of deep learning to detect periodic transits of exoplanets. Astron. J. 155(4), 147 (2018)


Author information

Authors and Affiliations

Universidade de São Paulo, São Paulo, Brazil
Fernando Correia Braga, Norton Trevisan Roman & Diego Falceta-Gonçalves

Corresponding author

Correspondence to Fernando Correia Braga.

Editor information

Editors and Affiliations

Federal University of Rio Grande do Norte, Natal, Brazil
João Carlos Xavier-Junior
Federal University of Bahia, Salvador, Brazil
Ricardo Araújo Rios

About this paper

Cite this paper

Braga, F.C., Roman, N.T., Falceta-Gonçalves, D. (2022). The Effects of Under and Over Sampling in Exoplanet Transit Identification with Low Signal-to-Noise Ratio Data. In: Xavier-Junior, J.C., Rios, R.A. (eds) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol 13653. Springer, Cham. https://doi.org/10.1007/978-3-031-21686-2_8

DOI: https://doi.org/10.1007/978-3-031-21686-2_8
Published: 19 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21685-5
Online ISBN: 978-3-031-21686-2
eBook Packages: Computer Science, Computer Science (R0)
