Comparative Analysis of Normalizing Techniques Based on the Use of Classification Quality Criteria

  • Conference paper
Lecture Notes in Computational Intelligence and Decision Making (ISDMCI 2021)

Abstract

The paper presents a comparative analysis of several normalization techniques. The main criterion for evaluating each normalization method was the accuracy of classification performed on the normalized data. Four datasets of different types, downloaded from the UCI Machine Learning Repository, served as experimental data in the simulation. The normalization techniques available in the clusterSim package for R were applied to these data. The quality of each normalization procedure was assessed by classifying the normalized data and calculating the accuracy with which objects were assigned to classes. A multilayer perceptron neural network was used as the classifier at this step. The simulation results show that the normalization stage significantly influences classification accuracy and that the most suitable normalization method depends on the type of data; consequently, the normalization technique should be selected separately for each case.
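The evaluation idea in the abstract (normalize, classify, compare accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data are invented, the two normalization formulas are generic z-score and min-max scaling rather than calls to clusterSim, and a leave-one-out 1-nearest-neighbour rule stands in for the multilayer perceptron used in the paper. All function and variable names are hypothetical.

```python
from statistics import mean, stdev


def identity(column):
    """No normalization: return the column unchanged."""
    return list(column)


def zscore(column):
    """Standardization: (x - mean) / sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s if s else 0.0 for x in column]


def minmax(column):
    """Unitization to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]


def normalize(rows, method):
    """Apply a column-wise normalization method to a list of feature rows."""
    columns = [method(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def accuracy(y_true, y_pred):
    """Share of objects assigned to the correct class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def loo_1nn_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule.

    This is a stand-in classifier, not the multilayer perceptron of the paper.
    """
    predictions = []
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(
            others,
            key=lambda j: sum((a - b) ** 2 for a, b in zip(row, rows[j])),
        )
        predictions.append(labels[nearest])
    return accuracy(labels, predictions)


def compare(rows, labels, methods):
    """Return {method name: classification accuracy after normalization}."""
    return {
        name: loo_1nn_accuracy(normalize(rows, method), labels)
        for name, method in methods.items()
    }


# Invented toy data: the first feature has a large scale but carries no class
# information; the second has a small scale and separates the two classes.
ROWS = [(1000, 0.10), (2000, 0.20), (3000, 0.15),
        (1100, 0.90), (2100, 0.80), (3100, 0.85)]
LABELS = [0, 0, 0, 1, 1, 1]
METHODS = {"none": identity, "z-score": zscore, "min-max": minmax}

if __name__ == "__main__":
    for name, acc in compare(ROWS, LABELS, METHODS).items():
        print(f"{name}: {acc:.2f}")
```

On data like these, a distance-based classifier applied to the raw features is dominated by the large-scale column, which is one way normalization can change the resulting accuracy. Which method performs best on real data is an empirical question, as the abstract notes.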

Author information

Corresponding author

Correspondence to Oleksandr Mishkov.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mishkov, O., Zorin, K., Kovtoniuk, D., Dereko, V., Morgun, I. (2022). Comparative Analysis of Normalizing Techniques Based on the Use of Classification Quality Criteria. In: Babichev, S., Lytvynenko, V. (eds) Lecture Notes in Computational Intelligence and Decision Making. ISDMCI 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 77. Springer, Cham. https://doi.org/10.1007/978-3-030-82014-5_41
