Stability Analysis of Supervised Decision Boundary Maps

  • Original Research
  • SN Computer Science

Abstract

Understanding how a machine learning classifier works is an important task in machine learning engineering. However, this is difficult to do for classifiers in general. We propose to leverage visualization methods for this task. To this end, we extend a recent technique called Decision Boundary Map (DBM), which graphically depicts how a classifier partitions its input data space into decision zones separated by decision boundaries. Our extension, Supervised Decision Boundary Map (SDBM), uses a supervised, GPU-accelerated technique that computes bidirectional mappings between the data and projection spaces to address several shortcomings of DBM, such as limited accuracy and speed. We present several experiments showing that SDBM generates results that are easier to interpret and far less prone to noise, and that it is significantly faster to compute than DBM, while maintaining the genericity and ease of use of DBM for any type of single-output classifier. Extending earlier work, we also show that SDBM is stable with respect to various types and amounts of changes to the training set used to construct the visualized classifiers. To our knowledge, this property has not been investigated for any comparable method for visualizing classifier decision maps, and it is essential for deploying such visualization methods to analyze real-world classification models.
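
The general idea of a decision boundary map, and of checking its stability under changes to the training set, can be illustrated with a minimal sketch. This is not the authors' implementation: PCA stands in for the supervised, GPU-accelerated bidirectional mapping, and the dataset (Iris), the classifier (logistic regression), the helper name decision_map, the 50×50 resolution, the 80% subsample, and the pixel-disagreement measure are all assumptions chosen for brevity. The projection is also kept fixed between the two maps, which is a simplification.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression


def decision_map(clf, proj, bounds, resolution=50):
    """Label each pixel of a 2D grid by classifying its inverse projection.

    Hypothetical helper: `proj` must offer inverse_transform (2D -> data space).
    """
    (x_min, x_max), (y_min, y_max) = bounds
    xs = np.linspace(x_min, x_max, resolution)
    ys = np.linspace(y_min, y_max, resolution)
    # One 2D point per pixel, row by row.
    grid = np.array([[x, y] for y in ys for x in xs])
    # Map pixels back to the data space, then ask the classifier for a label.
    labels = clf.predict(proj.inverse_transform(grid))
    return labels.reshape(resolution, resolution)


X, y = load_iris(return_X_y=True)

# Direct mapping (data -> 2D) and inverse mapping (2D -> data) via PCA.
proj = PCA(n_components=2).fit(X)
P = proj.transform(X)
bounds = ((P[:, 0].min(), P[:, 0].max()), (P[:, 1].min(), P[:, 1].max()))

# Map for a classifier trained on the full training set.
clf_full = LogisticRegression(max_iter=1000).fit(X, y)
map_full = decision_map(clf_full, proj, bounds)

# Map for a classifier trained on a random 80% subset of the training set.
rng = np.random.default_rng(0)
keep = rng.choice(len(X), size=int(0.8 * len(X)), replace=False)
clf_sub = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
map_sub = decision_map(clf_sub, proj, bounds)

# Simple (assumed) stability measure: fraction of pixels whose label changed.
changed = float(np.mean(map_full != map_sub))
print(f"Fraction of map pixels that changed label: {changed:.3f}")
```

Each entry of map_full is the class assigned to one pixel, so coloring the array by class shows the decision zones, with decision boundaries where neighboring pixels differ. A small value of changed suggests that the map varies little when the training set is perturbed in this way.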

Data Availability

Not applicable.

Code Availability

Our implementation and all code used in our experiments are publicly available at github.com/mespadoto/sdbm.

Funding

This study was financed in part by FAPESP grants 2015/22308-2, 2017/25835-9 and 2020/13275-1, and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001.

Author information

Correspondence to Mateus Espadoto.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances on Computer Vision, Imaging and Computer Graphics Theory and Applications” guest edited by Kadi Bouatouch, Augusto Sousa, Mounia Ziat and Helen Purchase.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Oliveira, A.A.A.M., Espadoto, M., Hirata Jr., R. et al. Stability Analysis of Supervised Decision Boundary Maps. SN COMPUT. SCI. 4, 226 (2023). https://doi.org/10.1007/s42979-022-01662-4

  • DOI: https://doi.org/10.1007/s42979-022-01662-4
