Slicer: Feature Learning for Class Separability with Least-Squares Support Vector Machine Loss and COVID-19 Chest X-Ray Case Study

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2021)

Abstract

Datasets from real-world applications usually involve many variables, which makes them difficult to model with traditional classifiers. A variety of feature selection and extraction tools may help with the dimensionality problem, but most of them do not focus on the complexity of the classes. In this paper, a new autoencoder-based model for addressing class complexity in data is introduced, aiming to extract features that present classes in a more separable fashion, thus simplifying the classification task. This is achieved by combining the standard reconstruction error with a least-squares support vector machine loss function. The model is then applied to a practical use case, the classification of chest X-rays according to the presence of COVID-19, showing that learning features that increase linear class separability can boost classification performance. For this purpose, a specific convolutional autoencoder architecture has been designed and trained using the recently published COVIDGR dataset. The proposed model is evaluated by means of several traditional classifiers and metrics, in order to establish the improvements attributable to the extracted features. The advantages of using a feature learner and traditional classifiers are also discussed.
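
The combination of reconstruction error and least-squares support vector machine loss mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the weighting coefficient `alpha`, the penalty `gamma`, the use of mean squared error for reconstruction, and all function names are assumptions made for illustration. The LS-SVM term follows the usual formulation, 0.5·||w||² + 0.5·γ·Σ eᵢ² with eᵢ = 1 − yᵢ(w·zᵢ + b) and labels in {−1, +1}, evaluated on the encoded features zᵢ.

```python
from typing import Sequence


def reconstruction_error(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between one input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def ls_svm_loss(codes, labels, w, b, gamma=1.0):
    """LS-SVM objective on encoded features (labels must be -1 or +1).

    0.5 * ||w||^2 + 0.5 * gamma * sum_i (1 - y_i * (w . z_i + b))^2
    """
    regulariser = 0.5 * sum(wj * wj for wj in w)
    squared_errors = 0.0
    for z, y in zip(codes, labels):
        score = sum(wj * zj for wj, zj in zip(w, z)) + b
        squared_errors += (1.0 - y * score) ** 2
    return regulariser + 0.5 * gamma * squared_errors


def combined_loss(inputs, reconstructions, codes, labels, w, b,
                  alpha=1.0, gamma=1.0):
    """Average reconstruction error plus alpha times the LS-SVM loss.

    `alpha` is a hypothetical trade-off weight, not taken from the paper.
    """
    rec = sum(reconstruction_error(x, xh)
              for x, xh in zip(inputs, reconstructions)) / len(inputs)
    return rec + alpha * ls_svm_loss(codes, labels, w, b, gamma)


if __name__ == "__main__":
    # Two toy samples with perfect reconstruction and one-dimensional codes.
    inputs = [[1.0, 0.0], [0.0, 1.0]]
    reconstructions = [[1.0, 0.0], [0.0, 1.0]]
    codes = [[1.0], [-1.0]]
    labels = [1, -1]
    print(combined_loss(inputs, reconstructions, codes, labels, w=[1.0], b=0.0))
```

In an actual autoencoder the codes, reconstructions, `w` and `b` would all be produced by trainable layers and optimised jointly; the sketch only evaluates the objective for fixed values.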

Acknowledgments

D. Charte is supported by the Spanish Ministry of Science under the FPU National Program (Ref. FPU17/04069). F. Charte is supported by the Spanish Ministry of Science project PID2019-107793GB-I00/AEI/10.13039/501100011033. F. Herrera is supported by the Spanish Ministry of Science project PID2020-119478GB-I00 and the Andalusian Excellence project P18-FR-4961. This work is supported by the project COVID19RX-Ayudas Fundación BBVA a Equipos de Investigación Científica SARS-CoV-2 y COVID-19 2020.

Author information

Corresponding author

Correspondence to David Charte.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Charte, D., Sevillano-García, I., Lucena-González, M.J., Martín-Rodríguez, J.L., Charte, F., Herrera, F. (2021). Slicer: Feature Learning for Class Separability with Least-Squares Support Vector Machine Loss and COVID-19 Chest X-Ray Case Study. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science, vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_26

  • DOI: https://doi.org/10.1007/978-3-030-86271-8_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86270-1

  • Online ISBN: 978-3-030-86271-8

  • eBook Packages: Computer Science, Computer Science (R0)
