Geometrically interpretable Variance Hyper Rectangle learning for pattern classification

https://doi.org/10.1016/j.engappai.2022.105494

Highlights

  • VHR has strong geometric interpretability, making it highly reliable and trustworthy.

  • VHR can provide a clear range of values in each direction for a category of data.

  • VHR naturally supports incremental learning without any extra processing.

  • VHR delivers strong and stable performance, and is able to handle big data.

Abstract

Many current intrinsically interpretable machine learning models can only handle data that are linear, low-dimensional, and composed of relatively independent, often discrete attributes, while models capable of handling high-dimensional nonlinear data, such as deep learning, have very poor interpretability. Based on geometric characteristics, a new idea of accurately wrapping the data region with a minimum-volume geometry is proposed for pattern classification. The Variance Hyper Rectangle (VHR) model presented in this paper is a realization of this idea. The VHR model uses minimum-volume hyper rectangles, obtained through projection variance calculation, to wrap the regions occupied by each category of data, and hence has strong and clear geometric interpretability. In addition, the VHR model is well suited to large data volumes, as it approaches linear complexity in both time and space. Extensive qualitative and quantitative experiments on seven real-world data sets demonstrate that VHR outperforms state-of-the-art interpretable methods while running quickly.

Introduction

Machine learning refers to the process of applying algorithms to known training data to obtain an appropriate model, and then using that model to make judgments on new situations. Machine learning technology has been applied in many fields, especially in the processing of video, image, voice, text, sensor, and Internet behavior data. In many cases, the performance of a machine learning model is the most important factor. In application domains that require a high level of fairness and safety, such as transportation (Mirnig et al., 2018), health care (Vellido, 2020), law (Rudin and Ustun, 2018), finance (Gogas and Papadimitriou, 2021), and the military (Xue and Tong, 2019, Xue and Tong, 2020), the demands on machine learning models are stricter: the calculation process and the results of the model must be interpretable and trustworthy.

To meet these requirements, the machine learning community provides many highly interpretable learning models. The problem is that, although the current inherently interpretable machine learning models are mature in theory and practice, their performance is falling ever further behind the requirements of technological development in the key areas that emphasize interpretability. Specifically: first, the results are often not good enough, or not stable enough; second, they cannot cope with big data, and when the amount of data is large enough, the programs may crash.

To alleviate these problems, we propose a new strongly interpretable geometry-based learning model in this paper. The main contributions can be summarized as follows:

  • A new idea of wrapping data regions with geometry is introduced. It is observed that different categories of data are distributed in different regions, as confirmed in the experiments. This makes it possible to implement the wrapping idea and motivates more future work along this direction.

  • A new interpretable model is proposed for pattern classification in this paper that utilizes hyper rectangles to wrap data regions: the Variance Hyper Rectangle (VHR) model. The VHR model has strong geometric interpretability, making it much easier to understand and better able to gain the trust of algorithm users.

  • The VHR model is able to provide a clear range of values for a category of data in each dimension, which can serve as heuristics for further processing. Moreover, it makes it possible to identify quantitative characteristic differences between categories, so as to better understand them.

  • The VHR model naturally supports incremental learning. It is able to construct hyper rectangles for a new category of data in the existing feature space, which contains the already built hyper rectangles of other categories of data.
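
The per-dimension value ranges and the incremental-learning property claimed above can be illustrated with a minimal sketch. Note that the published VHR uses variance-aligned (rotated) minimum-volume rectangles; the axis-aligned class below (a hypothetical `AxisAlignedVHR`, not from the paper) only demonstrates these two properties:

```python
import numpy as np

class AxisAlignedVHR:
    """Minimal sketch: one axis-aligned hyper rectangle per category."""

    def __init__(self):
        self.boxes = {}  # category label -> (lower, upper) bounds, each shape (d,)

    def fit_category(self, label, X):
        # Adding a new category never touches the boxes already built,
        # which is what makes incremental learning natural.
        X = np.asarray(X, dtype=float)
        self.boxes[label] = (X.min(axis=0), X.max(axis=0))

    def ranges(self, label):
        # Clear range of values in each dimension for one category.
        lo, hi = self.boxes[label]
        return list(zip(lo.tolist(), hi.tolist()))

model = AxisAlignedVHR()
model.fit_category("A", [[0.0, 1.0], [2.0, 3.0]])
model.fit_category("B", [[5.0, 5.0], [6.0, 8.0]])  # incremental: box for "A" untouched
print(model.ranges("A"))  # [(0.0, 2.0), (1.0, 3.0)]
```

Because each category's rectangle depends only on that category's data, a new class can be added to the feature space without revisiting existing rectangles.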

Organization. The remainder of this paper is arranged as follows. Section 2 reviews related work. Section 3 introduces the principle of the VHR model proposed in this paper. Section 4 presents the VHR learning and classification algorithms. Section 5 discusses the characteristics and advantages of VHR. Section 6 presents a series of experiments testing the performance of the different measures. Section 7 concludes and provides some suggestions for future work.

Section snippets

Related work

Interpretable machine learning techniques can generally be grouped into two categories: intrinsic interpretability and post-hoc interpretability, depending on the time when the interpretability is obtained.

Principles

In this section, we first introduce some definitions utilized in the paper. Then we present the VHR approach. Table 1 summarizes the notations frequently used throughout the paper.

Method

We now describe how hyper rectangles are learned for a data set in the VHR model. The overall framework of VHR is shown in Fig. 2. The first step is to divide the data into several sub-data sets by a clustering algorithm so that each subset is convex. General clustering algorithms can be used in this step, so it is not discussed further in this paper. The second step is to construct a wrapping hyper rectangle for each split sub-data set. After that, the model is established and ready for some
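
The second step, wrapping one convex sub-data set, can be sketched as follows. This is an assumption about the construction based on the paper's description of "projection variance calculation": the rectangle's axes are taken as the eigenvectors of the cluster's sample covariance, and the extents are the min/max of the data projected onto each axis. The full VHR procedure may differ in detail:

```python
import numpy as np

def wrap_cluster(X):
    """Wrap one (assumed convex) sub-data set with a variance-aligned
    hyper rectangle: axes are eigenvectors of the sample covariance,
    extents are min/max of the data projected onto each axis."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    _, axes = np.linalg.eigh(cov)     # columns are the projection directions
    proj = (X - mean) @ axes          # coordinates in the rotated frame
    return mean, axes, proj.min(axis=0), proj.max(axis=0)

def contains(point, box, eps=1e-9):
    """Membership test: rotate the point into the box frame, check bounds."""
    mean, axes, lo, hi = box
    p = (np.asarray(point, dtype=float) - mean) @ axes
    return bool(np.all(p >= lo - eps) and np.all(p <= hi + eps))

# Toy cluster stretched along the diagonal: a variance-aligned rectangle
# hugs it far more tightly than an axis-aligned bounding box would.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
cluster = np.hstack([t, t + 0.05 * rng.normal(size=(200, 1))])
box = wrap_cluster(cluster)
print(all(contains(p, box) for p in cluster))  # True: every point is wrapped
```

By construction every training point of the cluster lies inside its rectangle, and aligning the axes with the directions of maximal and minimal projection variance is what keeps the rectangle's volume small for elongated clusters.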

Characteristics and advantages

– Geometric interpretability.

At present, there is no clear definition for quantitatively assessing the interpretability of machine learning models, and interpretability analysis from a geometric perspective has not appeared in previous literature. In view of this situation, this paper qualitatively defines the following three levels of geometric interpretability, ordered by strength:

  • (1)

    Level C: Clear geometric features are used in the calculation principle of

Experiment design and datasets

To prove the effectiveness of the VHR model, we designed a series of experiments, carried out from four aspects: (1) validation of the VHR model; (2) performance of the VHR-based classification algorithm; (3) VHR parameter settings; and (4) VHR performance in real applications.

The data used in the experiments are taken from data sets publicly available on the Internet. The basic information of them is listed in Table 3. Except for ORL (Cai, 2021) and

Conclusions and discussions

The experimental results justify the effectiveness of the VHR model: the hyper rectangles of the VHR model wrap the data region properly, neither smaller nor larger than the data region. This is the guarantee of good performance.

The biggest advantage of the VHR model is that it has both strong interpretability and good performance. Good performance ensures that VHR produces correct results, while strong interpretability makes the results reliable and trustworthy.

CRediT authorship contribution statement

Jie Sun: Data curation, Investigation, Writing – original draft. Huamao Gu: Conceptualization, Methodology, Writing – review & editing. Haoyu Peng: Software, Validation. Yili Fang: Software. Xun Wang: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Natural Science Foundation of Zhejiang Province of China (grant numbers LY20F030002, LTY21F020001), the National Science Foundation of China (grant numbers 92046002, 61976188, 61972353, 61976187), the Zhejiang Provincial Basic Public Welfare Research Project, China (grant number LGG20F020006), and the Science and Technology Program of Zhejiang Province, China (Key Research and Development Plan, grant number 2021C01120).

References (46)

  • Cai, D. (2021). Four face databases in matlab format.
  • Cao, H., et al. (2020). Learning explainable decision rules via maximum satisfiability. IEEE Access.
  • Dhebar, Y., et al. (2021). Interpretable rule discovery through bilevel optimization of split-rules of nonlinear decision trees for classification problems. IEEE Trans. Cybern.
  • Dong, A., et al. (2016). Semi-supervised SVM with extended hidden features. IEEE Trans. Cybern.
  • Fisher, A., et al. (2018). Model class reliance: Variable importance measures for any machine learning model class, from the Rashomon perspective.
  • Friedman, J.H., et al. (2008). Predictive learning via rule ensembles. Ann. Appl. Stat.
  • Gogas, P., et al. (2021). Machine learning in economics and finance. Comput. Econ.
  • Goldstein, A., et al. (2015). Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat.
  • Greenwell, B. (2017). pdp: An R package for constructing partial dependence plots. R J.
  • Hinton, G., et al. (2015). Distilling the knowledge in a neural network.
  • Huang, S., et al. (2021). Tuning-free ridge estimators for high-dimensional generalized linear models. Comput. Stat. Data Anal.
  • Lending Club (2018). Lending club loan data.
  • Letham, B., et al. (2015). Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. Ann. Appl. Stat.