Single Imputation Methods Applied to a Global Geothermal Database

Román-Flores Mariana Alelhí; Santamaría-Bonfil Guillermo; Díaz-González Lorena; Arroyo-Figueroa Gustavo

doi:10.1007/978-3-030-04491-6_14

Conference paper
First Online: 03 January 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11288)

Abstract

In the exploitation stage of a geothermal reservoir, estimating the bottomhole temperature (BHT) is essential for determining the available energy potential and the viability of exploiting it. The BHT can be measured directly, but this is very expensive, so statistical models used as virtual geothermometers are preferred. Geothermometers have been widely used to infer the temperature of deep geothermal reservoirs from the analysis of fluid samples collected at the surface from springs and exploration wells. Our procedure is based on an extensive geochemical database (n = 708) containing BHT measurements and the compositions of eight main elements in the geothermal fluid. Unfortunately, the database has missing values for some of the measured main-element compositions. Therefore, to take advantage of all this information when estimating the BHT, the missing values must first be imputed (completed).

In the present work, we compare imputation using the mean and the median, as well as stochastic regression and the support vector machine (SVM), to complete our data set of geochemical components. The results show that regression and SVM are superior to the mean and the median, since these methods obtained the smallest root mean square error (RMSE) and mean absolute error (MAE).
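
As an illustration of the kind of comparison described above, the following minimal sketch imputes artificially removed values with the mean, the median, and a stochastic regression, then scores each method with RMSE and MAE. It is not the authors' implementation: the data are synthetic (two correlated variables standing in for geochemical components), all names such as `fit_line` and `scores` are hypothetical, and the SVM imputer is omitted to keep the example dependency-free.

```python
import math
import random
import statistics


def rmse(true, pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))


def mae(true, pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual std. dev.)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = math.sqrt(sum(r * r for r in resid) / (len(xs) - 2))
    return a, b, sd


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data (assumption): y depends linearly on x plus Gaussian noise.
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Hide 20% of the y values completely at random; keep them as ground truth.
miss_idx = sorted(rng.sample(range(len(y)), 40))
hidden = set(miss_idx)
obs_x = [x[i] for i in range(len(y)) if i not in hidden]
obs_y = [y[i] for i in range(len(y)) if i not in hidden]
truth = [y[i] for i in miss_idx]

# Mean and median imputation: one constant fills every gap.
mean_imp = [statistics.fmean(obs_y)] * len(truth)
median_imp = [statistics.median(obs_y)] * len(truth)

# Stochastic regression: regression prediction plus a random residual draw.
a, b, sd = fit_line(obs_x, obs_y)
stoch_imp = [a + b * x[i] + rng.gauss(0.0, sd) for i in miss_idx]

scores = {
    "mean": (rmse(truth, mean_imp), mae(truth, mean_imp)),
    "median": (rmse(truth, median_imp), mae(truth, median_imp)),
    "stochastic regression": (rmse(truth, stoch_imp), mae(truth, stoch_imp)),
}
for name, (r, m) in scores.items():
    print(f"{name:>22}: RMSE={r:.3f}  MAE={m:.3f}")
```

The random residual is what distinguishes stochastic regression from plain regression imputation: it keeps the imputed values from lying exactly on the fitted line, at the cost of a somewhat larger pointwise error. An SVM-based imputer could be slotted into the same scoring loop, for example by training a support vector regressor on the observed rows.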

Author information

Authors and Affiliations

Posgrado en Optimización y Cómputo Aplicado, Universidad Autónoma del Estado de Morelos, Avenida Universidad 1001, Chamilpa, 62209, Cuernavaca, Morelos, Mexico
Román-Flores Mariana Alelhí
Instituto Nacional de Electricidad y Energías Limpias, Gerencia de Tecnologías de la Información, Reforma 113 Col. Palmira, 62490, Cuernavaca, Morelos, Mexico
Santamaría-Bonfil Guillermo
Departamento de Computación, Centro de Investigación en Ciencias, Instituto de Investigación en Ciencias Básicas Aplicadas, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Chamilpa, 62209, Cuernavaca, Morelos, Mexico
Díaz-González Lorena
Instituto Nacional de Electricidad y Energías Limpias, Av. Reforma # 113, Col. Palmira, 62490, Cuernavaca, Morelos, Mexico
Arroyo-Figueroa Gustavo

Corresponding author

Correspondence to Román-Flores Mariana Alelhí.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Ildar Batyrshin
Universidad Panamericana, Mexico City, Mexico
María de Lourdes Martínez-Villaseñor
Faculty of Engineering, Universidad Panamericana, Mexico City, Mexico
Hiram Eredín Ponce Espinosa

About this paper

Cite this paper

Alelhí, RF.M., Guillermo, SB., Lorena, DG., Gustavo, AF. (2018). Single Imputation Methods Applied to a Global Geothermal Database. In: Batyrshin, I., Martínez-Villaseñor, M., Ponce Espinosa, H. (eds) Advances in Soft Computing. MICAI 2018. Lecture Notes in Computer Science, vol 11288. Springer, Cham. https://doi.org/10.1007/978-3-030-04491-6_14

DOI: https://doi.org/10.1007/978-3-030-04491-6_14
Published: 03 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04490-9
Online ISBN: 978-3-030-04491-6
eBook Packages: Computer Science, Computer Science (R0)
