
Neural Networks

Volume 21, Issues 2–3, March–April 2008, Pages 450–457

2008 Special Issue
Robust BMPM training based on second-order cone programming and its application in medical diagnosis

https://doi.org/10.1016/j.neunet.2007.12.051

Abstract

The Biased Minimax Probability Machine (BMPM) constructs a classifier for imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of the means and covariance matrices of the classes from the training samples, and achieves promising performance. In this paper, we develop a novel and critical extension of the BMPM training algorithm based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivation of the biased classification model and reformulate it into an SOCP problem, which can be solved efficiently with a guarantee of the global optimum. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme against traditional solutions on medical diagnosis tasks, where the objective is to improve the sensitivity (the accuracy of the more important class, e.g., the “ill” samples) rather than the overall classification accuracy. Empirical results show that our method handles imbalanced classification problems more effectively and robustly than traditional classification approaches and than the original Fractional Programming-based BMPM (BMPMFP).

Introduction

Classifiers are widely used in various disciplines, with applications in Information Retrieval (Peng and King, 2006a, Peng and King, 2006b), Bioinformatics (Huang et al., 2006b, Huang et al., 2004c), Text Categorization (Macskassy et al., 2001, Nigam et al., 1999), etc. In particular, biased classifiers, a special kind of classifier, seek to make the accuracy of the important class, rather than the overall accuracy, as high as possible, while keeping the accuracy of the less important class at an acceptable level. Recently, a novel biased classification model, the Biased Minimax Probability Machine (BMPM), was proposed; it provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of the means and covariance matrices of the classes from the training data, and achieves promising performance (Huang et al., 2004a, Huang et al., 2006a).

Applying machine learning techniques to medical diagnosis tasks has the advantage of saving time and reducing cost (Kononenko, 2001, West and West, 2000). Many different techniques have been applied to medical diagnosis in the machine learning literature, including the Naive Bayesian method (NB) (Langley, Iba, & Thompson, 1992), the k-Nearest Neighbor method (kNN) (Aha, Kibler, & Albert, 1991), the decision tree (Quinlan, 1993) and logistic regression (Jordan, 1995). Medical diagnosis based on machine learning techniques requires an inherent bias: the diagnosis should favor the positive identification of the “ill” class over the misidentification of the “healthy” class, since misdiagnosing an ill patient as healthy may delay therapy and aggravate the illness. Therefore, the objective in the identification task is not to improve the overall accuracy of the classification, but to improve the sensitivity (the accuracy of the “ill” class) while maintaining an acceptable specificity (the accuracy of the “healthy” class) (Grzymala-Busse, Goodwin, & Zhang, 2003). Some current methods adopt roundabout ways to impose a bias toward the important class, i.e., they utilize intermediate factors to influence the classification (Cardie and Nowe, 1997, Chawla et al., 2002, Kubat and Matwin, 1997, Maloof et al., 2004). However, it remains uncertain whether these methods can improve the classification performance systematically.
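For concreteness, sensitivity and specificity as used in this setting are the standard quantities below, stated with the “ill” class treated as positive (the confusion-matrix notation TP, FN, TN, FP is ours, not the paper's):

\[
\mathrm{sensitivity} = \frac{TP}{TP + FN}, \qquad
\mathrm{specificity} = \frac{TN}{TN + FP}.
\]

A biased classifier therefore aims to maximize sensitivity while keeping specificity above a prescribed acceptable level.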

In this paper, by employing the Biased Minimax Probability Machine (BMPM), we deal with this issue in a more elegant way and directly achieve the objective of appropriate medical diagnosis. We extend the original BMPM model of Huang et al. (2006b) and propose a new training algorithm to tackle the complexity and accuracy issues in the BMPM learning task. The model is transformed into a Second-Order Cone Programming (SOCP) problem instead of a Fractional Programming (FP) one (Peng & King, 2007). Under this proposed framework, the imbalanced classification problem can be modelled and solved efficiently. Moreover, we apply the model to biomedical diagnosis problems in this work.
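To make the target problem class concrete, the sketch below shows a generic second-order cone program: a linear objective minimized subject to constraints of the form ||A_i x + b_i||_2 <= c_i^T x + d_i, which interior-point solvers solve to global optimality. This is only an illustration of the problem class, not the paper's formulation; the solver (CVXPY), the random problem data and all variable names are our assumptions.

    # Minimal generic SOCP sketch (illustrative only; CVXPY and the random
    # problem data are assumptions, not part of the paper).
    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 5, 3                      # dimension and number of cone constraints
    f = rng.standard_normal(n)       # linear objective coefficients
    x = cp.Variable(n)

    constraints = [cp.norm(x, 2) <= 10.0]   # keep the feasible set bounded
    for _ in range(m):
        A = rng.standard_normal((4, n))
        b = rng.standard_normal(4)
        c = rng.standard_normal(n)
        d = 5.0
        # ||A x + b||_2 <= c^T x + d  -- a second-order cone constraint
        constraints.append(cp.SOC(c @ x + d, A @ x + b))

    prob = cp.Problem(cp.Minimize(f @ x), constraints)
    prob.solve()
    print(prob.status, prob.value)

Because every such program is convex, any solution returned is the global optimum, which is the property the SOCP-based training scheme relies on.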

The rest of this paper is organized as follows. Section 2 reviews the concept of the Biased Minimax Probability Machine (BMPM) and related work. Section 3 presents a robust learning algorithm for BMPM based on Second-Order Cone Programming. Section 4 reports the results of our empirical study of the derived learning scheme. Conclusions and future work are given in Section 5.

Biased minimax probability machine

In this section, we present the biased minimax framework, designed to achieve the goal of imbalanced classification. We first introduce and define the linear Biased Minimax Probability Machine (BMPM) model. We then review optimization solutions to the linear version of the BMPM model.
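As a reference for the discussion below, the linear BMPM model of Huang et al. (2004a, 2006a) can be stated as follows; the notation is reconstructed here and may differ slightly from the full text. Let x and y denote the important (“ill”) and less important (“healthy”) classes, with means and covariance matrices (x̄, Σ_x) and (ȳ, Σ_y) estimated from the training data:

\[
\max_{\alpha,\ \beta,\ b,\ \mathbf{w} \neq \mathbf{0}} \ \alpha
\quad \text{s.t.} \quad
\inf_{\mathbf{x} \sim (\bar{\mathbf{x}}, \Sigma_x)} \Pr\{\mathbf{w}^\top \mathbf{x} \ge b\} \ge \alpha, \qquad
\inf_{\mathbf{y} \sim (\bar{\mathbf{y}}, \Sigma_y)} \Pr\{\mathbf{w}^\top \mathbf{y} \le b\} \ge \beta, \qquad
\beta \ge \beta_0,
\]

where α and β are worst-case lower bounds on the accuracies of the two classes (the sensitivity and specificity, respectively) and β₀ is the prescribed acceptable specificity. A future point z is classified into the important class if w^T z ≥ b.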

Motivation

The Biased Minimax Probability Machine (BMPM) has been extensively studied as a state-of-the-art learning technique in various areas, such as bioinformatics (Huang et al., 2006b, Huang et al., 2004c), information retrieval (Peng and King, 2006a, Peng and King, 2006b) and statistical learning (Huang, Yang, King, & Lyu, 2004b). Most recent studies on BMPM are based on a Fractional Programming formulation (which we call BMPMFP) that can be solved by the Rosen gradient method. However, the …
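The snippet above breaks off here. For context, a standard route from the probabilistic constraints of BMPM to a tractable program, used in the MPM literature of Lanckriet et al. and consistent with the SOCP view taken in this paper (the exact reformulation is in the full text), rests on the worst-case Chebyshev bound:

\[
\inf_{\mathbf{x} \sim (\bar{\mathbf{x}}, \Sigma_x)} \Pr\{\mathbf{w}^\top \mathbf{x} \ge b\} \ge \alpha
\;\;\Longleftrightarrow\;\;
\mathbf{w}^\top \bar{\mathbf{x}} - b \ \ge\ \kappa(\alpha)\, \sqrt{\mathbf{w}^\top \Sigma_x \mathbf{w}},
\qquad \kappa(\alpha) = \sqrt{\tfrac{\alpha}{1-\alpha}} .
\]

Applying this bound to both classes, with the specificity bound held at its prescribed level β₀, turns the BMPM constraints into

\[
\mathbf{w}^\top \bar{\mathbf{x}} - b \ \ge\ \kappa(\alpha)\, \bigl\| \Sigma_x^{1/2} \mathbf{w} \bigr\|_2,
\qquad
b - \mathbf{w}^\top \bar{\mathbf{y}} \ \ge\ \kappa(\beta_0)\, \bigl\| \Sigma_y^{1/2} \mathbf{w} \bigr\|_2 ,
\]

which, for any fixed α, are second-order cone constraints in (w, b). The largest feasible α can then be located, e.g., by bisection over α, with each step an SOCP feasibility problem solved to global optimality by an interior-point method; this is the kind of reformulation the BMPMSOCP scheme exploits.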

Experimental results

In this section we discuss the experimental evaluation of our proposed biased learning algorithm in comparison with several state-of-the-art approaches. For a consistent evaluation, we conduct our empirical comparisons on two standard medical diagnosis datasets: the breast-cancer dataset and the heart disease dataset. The traditional algorithms compared in this paper are the NB classifier, the kNN method, and the Minimax Probability Machine (MPM), along with the two BMPM variants, BMPMSOCP and BMPMFP.

Conclusion and future work

The computational complexity of our method for the Biased Minimax Probability Machine (BMPM) is comparable to that of the quadratic program one has to solve for the Support Vector Machine (SVM) and the Minimax Probability Machine (MPM). While we have viewed this model from the viewpoint of a convex optimization problem, we believe there is much to gain from exploiting analogies to the SVM and developing specialized optimization procedures for our model. Another direction that we are currently …

Acknowledgments

The authors thank G.R.G. Lanckriet for providing the Matlab source code of the MPM on the web, and Kaizhu Huang and Haiqin Yang for the FP-based BMPM code. The work described in this paper is supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CUHK4235/04E) and is affiliated with the VIEW Technologies Lab and the Microsoft-CUHK Joint Laboratory for Human-centric Computing and Interface Technologies.

References (27)

  • Huang, K., et al. (2006). Imbalanced learning with a biased minimax probability machine. IEEE Transactions on Systems, Man and Cybernetics (Part B).
  • Huang, K., et al. (2006). Maximizing sensitivity in medical diagnosis using biased minimax probability machine. IEEE Transactions on Biomedical Engineering.
  • Huang, K., Yang, H., King, I., Lyu, M., & Chan, L. (2004c). Biased minimax probability machine for medical diagnosis....

An abbreviated version of some portions of this article appeared in Peng and King (2007) as part of the IJCNN 2007 Conference Proceedings, published under IEEE copyright.
