
Random Forest Based Multiclass Classification Approach for Highly Skewed Particle Data

Published in: Journal of Scientific Computing

Abstract

Data used in particle physics analyses are inherently imbalanced: the events of interest are rare against a broad background. Such events can be isolated from the bulk through computationally intensive studies involving sophisticated analysis techniques. Classification algorithms provided by supervised machine learning (ML) offer an alternative to these classic techniques for interpreting skewed particle datasets, even in multi-particle-state analyses. In this study, the ground state of the bottomonium (\(\varUpsilon \)(1S)) and its excited states (\(\varUpsilon \)(2S) and \(\varUpsilon \)(3S)) were studied with a multiclass classification approach based on the random forest classifier (RFC), a novel ML application in particle analysis, combined with resampling techniques for dataset preprocessing and a modified weighting strategy. For this purpose, five widely used oversampling strategies and two hybrid strategies, which combine over- and undersampling, were adapted to the RFC. Moreover, a class-weighted RFC, the weighted random forest (WRF), was used in the analysis. Owing to the data structure, the performance of the applied models was evaluated with metrics derived from the confusion matrix. The results reveal that hybrid techniques implemented in the RFC are suitable for handling highly imbalanced classes. The G-mean and balanced accuracy (BAcc) scores of the upsilon states show that the model with the SMOTETomek strategy achieved the highest classification performance, around 90\(\%\), with high sensitivity, demonstrating the success of the multiclass classification application.



Acknowledgements

The author acknowledges the support from the Scientific and Technological Research Council of Turkey (TUBITAK) Project No. 119F302.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Serpil Yalcin Kuzu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Yalcin Kuzu, S. Random Forest Based Multiclass Classification Approach for Highly Skewed Particle Data. J Sci Comput 95, 21 (2023). https://doi.org/10.1007/s10915-023-02144-2

