
Comparative study of machine learning algorithms in predicting asphaltene precipitation with a novel validation technique

  • Research
  • Earth Science Informatics

Abstract

Several thermodynamic models have been proposed in the literature to estimate the amount of asphaltene precipitation (AAP); however, they usually require many inputs, and the characterization of the samples is often not accurate enough. This paper compares the performance of four data-driven methods, namely Extreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), Least Squares – Support Vector Machine coupled with Particle Swarm Optimization (LSSVM-PSO), and Multilinear Regression (MLR), in predicting the AAP as a function of oil composition, API, SARA fractions, solvent molar mass, dilution ratio, pressure, and temperature. The dataset includes 1703 samples, 20% of which are used for testing. An innovative nested K-fold cross-validation is also proposed to tune the hyperparameters of the data-driven methods. The contributions of this work are a comparison of the performance of different data-driven methods for estimating the AAP and the introduction of a novel cross-validation technique. Data-driven results are compared with those of the Perturbed Chain – Statistical Associating Fluid Theory Equation of State. The results reveal that the data-driven methods, with the exception of the MLR, outperform the thermodynamic model, and that XGBoost performs best among the data-driven methods. Coefficients of determination of 99.57%, 98.96%, 98.17%, 85.23%, and 90.40% were achieved by the XGBoost, LSSVM-PSO, MLP, MLR, and the thermodynamic model, respectively. Finally, it is shown that the proposed nested K-fold cross-validation improved the generalization of the data-driven methods. The findings of this study can help engineers select reliable methods to estimate the AAP and improve their generalization using the nested K-fold cross-validation.
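The abstract does not spell out how the proposed nested K-fold variant works, so the following is only a minimal sketch of conventional nested K-fold cross-validation with a 20% hold-out test set, not the authors' procedure. Everything in it is an assumption made for illustration: the synthetic data (the study's dataset is confidential), scikit-learn's GradientBoostingRegressor standing in for XGBoost, the fold counts, and the small hyperparameter grid.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import (
    GridSearchCV,
    KFold,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in data (assumption): 300 samples, 8 features.
X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=0)

# Hold out 20% of the samples for final testing, mirroring the split
# ratio stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Inner folds choose hyperparameters; outer folds estimate how well the
# whole "tune, then fit" procedure generalizes. Fold counts are assumptions.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

# Hypothetical grid, kept small so the sketch runs quickly.
param_grid = {"max_depth": [2, 3], "learning_rate": [0.05, 0.1]}
search = GridSearchCV(
    GradientBoostingRegressor(n_estimators=50, random_state=0),
    param_grid,
    cv=inner_cv,
    scoring="r2",
)

# Nested CV: each outer training fold runs its own inner grid search,
# so the outer validation fold never influences the tuning.
outer_scores = cross_val_score(search, X_train, y_train, cv=outer_cv, scoring="r2")

# Refit the search on the full training set, then score once on the
# untouched test set using the coefficient of determination.
search.fit(X_train, y_train)
test_r2 = r2_score(y_test, search.predict(X_test))

print("outer-fold R^2 scores:", outer_scores)
print("selected hyperparameters:", search.best_params_)
print("hold-out R^2:", test_r2)
```

The outer-fold scores indicate how stable the tuned model is across resamples, while the hold-out score plays the role of the single test figure reported for each method.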

Data availability

Data is confidential and cannot be shared.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

Conceptualization: Jafar Khalighi, Alexey Cheremisin; Methodology: Jafar Khalighi; Formal analysis and investigation: Jafar Khalighi; Writing - original draft preparation: Jafar Khalighi; Writing - review and editing: Alexey Cheremisin; Resources: Jafar Khalighi, Alexey Cheremisin; Supervision: Alexey Cheremisin.

Corresponding author

Correspondence to Jafar Khalighi.

Ethics declarations

Conflicts of interest/competing interests

The authors declare no conflict/competing interest.

Additional information

Communicated by: H. Babaie

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khalighi, J., Cheremisin, A. Comparative study of machine learning algorithms in predicting asphaltene precipitation with a novel validation technique. Earth Sci Inform 16, 3097–3111 (2023). https://doi.org/10.1007/s12145-023-01075-8

  • DOI: https://doi.org/10.1007/s12145-023-01075-8
