A robust outlier control framework for classification designed with family of homotopy loss function
Introduction
Noise introduced at the data collection stage refers to data whose recorded values deviate from the true underlying attributes. Such noisy data may cause overfitting and a decline in generalization ability. For classification learning, a dataset is usually composed of two parts: attributes and category labels. The quality of the attributes indicates whether they accurately describe the samples, and the quality of the category labels reflects whether each sample is assigned to the correct category. In classification, samples with similar attributes are assigned the same category label. Accordingly, noise can be divided into two types, feature noise and label noise, depending on whether it occurs in the attributes or in the labels (Feng, Yang, Huang, Mehrkanoon, & Suykens, 2016).
In this work, we concentrate on mitigating the impact of label noise on the model, a problem that many researchers have studied. Generally speaking, existing methods can be roughly divided into two branches: one deals with the noisy data and outliers directly, while the other builds robust classifiers that aim to reduce their effect on the model.
For the first branch, three main directions are discussed: clustering, ensemble learning and graph embedding. These methods are used to detect noisy data and outliers. For noise detection, cluster analysis is often used to eliminate samples that lie far from those of the correct class, thereby achieving a robust effect (Christy et al., 2015, Du et al., 2016, Hautamaki et al., 2004, Maiywan and Kashyap, 2002). Ensemble learning is another robust learning technique in which the data are divided into several parts and multiple base classifiers are constructed and combined to complete the learning task. If the outliers are concentrated in a certain base learner, that learner may perform poorly, and the ensemble system removes base learners with poor performance during integration. Corresponding algorithms include boosting (Galar, Fernandez, Barrenechea, Bustince, & Herrera, 2012), bagging (Breiman, 1996) and AdaBoost (Freund & Schapire, 1996). Graph embedding (Goyal and Ferrara, 2018, Sheng, 1973, Zhang et al., 2013) is also an effective way to resist noisy data and outliers: by defining an intrinsic graph, penalty graphs and the corresponding edge weight matrices, sample points with low similarity (noise points and outliers) are detected and assigned small weights, which yields robustness. However, methods based on noise detection also increase the complexity of the overall algorithm.
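To make the cluster-based filtering idea concrete, the following minimal Python sketch removes the samples farthest from their cluster centres before training; the cluster count, quantile threshold and the helper name `cluster_filter` are illustrative assumptions, not the procedure of any specific cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_filter(X, y, n_clusters=10, keep_quantile=0.95):
    """Drop samples that lie far from their assigned cluster centre
    (illustrative sketch of cluster-based outlier removal)."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    # Distance from each sample to the centre of its own cluster.
    d = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    mask = d <= np.quantile(d, keep_quantile)  # keep the closest 95%
    return X[mask], y[mask]
```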
To reduce the impact of noise, building a robust classifier is an alternative way to resist adversarial perturbations. Here we also discuss three directions: convolutional neural networks, classifiers induced by robust statistics and classifiers induced by robust loss functions, the last of which is most relevant to our work. The convolutional neural network (Gu et al., 2018, Krizhevsky et al., 2012, Tran et al., 2018, Zhang et al., 2019) is a robust network topology: neurons in a convolutional layer are connected to local receptive fields, i.e. small areas of the input samples, and this structure can tolerate noisy inputs. However, convolutional neural networks need a large number of training samples in practice and may still overfit. The squared loss function is most appropriate for data following a normal distribution, whereas noisy data and outliers lie far from the correct class; the resulting mean estimate can be unreliable, since even a single outlier may heavily affect the final decision function. Robust statistics are therefore used to suppress noise, such as the median (Ma, 2011), quantiles (Xu, Zhang, Jiang, Huang, & He, 2015), data divergence (Zhang, Jiang, & Chai, 2010) and the maximum correntropy criterion (Du et al., 2018, Liu et al., 2007).
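The fragility of the squared loss can be seen in one line: its minimizer is the mean, which a single outlier can drag arbitrarily far, whereas the median (the minimizer of the absolute loss) barely moves. A small numeric illustration:

```python
import numpy as np

clean = np.array([1.0, 1.1, 0.9, 1.05, 0.95])
noisy = np.append(clean, 100.0)            # one gross outlier

# Mean = minimizer of the squared loss; median = minimizer of the absolute loss.
print(np.mean(clean), np.mean(noisy))      # ~1.0 vs 17.5
print(np.median(clean), np.median(noisy))  # ~1.0 vs ~1.025
```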
Building robust classifiers from robust loss functions is an effective way to mitigate the impact of noise, and it is the approach most closely related to our work. In the following, we introduce robust classifiers modeled by different kinds of robust loss functions. In practical applications, because noise varies in intensity and type, it is difficult to detect all anomalies directly through robust optimization algorithms based on noise detection. To reduce the effect of outliers on the classifier, a series of truncated loss functions are often used in the literature to induce robust classifiers, such as the Ramp loss, also known as the truncated hinge loss (Gimpel and Smith, 2012, Huang et al., 2014, Wu and Liu, 2007), the truncated least squares loss (Wang & Zhong, 2014) and the truncated logistic loss (Park & Liu, 2011). Classifiers induced by these truncated losses intuitively restrict the upper bound of the penalty assigned to outliers. However, they are neither convex nor smooth, which makes the models difficult to solve. Two important remedies are generally pursued. The first is to design effective algorithms such as the re-weighted least squares algorithm (Perez-Cruz, Navia-Vazquez, Alarcon-Diana, & Artes-Rodriguez, 2000), the concave–convex procedure (CCCP) (Yuille & Rangarajan, 2003) and the outlier path (OP) algorithm (Suzumura, Ogawa, Sugiyama, Karasuyama, & Takeuchi, 2015). The other is to smooth the loss function to reduce algorithmic complexity (Lee and Mangasarian, 2001, Wang et al., 2007), as in the Geman–Reynolds loss (Yu, Aslan, & Schuurmans, 2012), the Geman–McClure loss (Geman, 1987), the lncosh loss (Karal, 2017) and the general robust loss (Barron, 2017). In 2016, Feng et al. (2016) gave two families of non-convex and smooth classification losses: correntropy-based and logarithm-based. Singh and Principe (2010) claim that correntropy (Liu et al., 2007), as a good similarity measure, can be used as a robust loss function; based on this, Singh, Pokharel, and Principe (2014) and Xu, Cao, Hu, and Principe (2016) design the C-loss and the rescaled hinge loss, respectively. Another robust loss function can be found in Chen, Zhou, Chen, Shao, and Gu (2017).
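As a concrete example of truncation, the Ramp (truncated hinge) loss caps the hinge penalty at a cut-off level so that one gross outlier cannot dominate the objective; a minimal sketch, with the cut-off value s = -1 chosen only for illustration:

```python
import numpy as np

def hinge(u):
    """Hinge loss on the margin u = y * f(x)."""
    return np.maximum(0.0, 1.0 - u)

def ramp(u, s=-1.0):
    """Truncated hinge (Ramp) loss: the penalty is capped at 1 - s."""
    return np.minimum(hinge(u), 1.0 - s)

margins = np.array([2.0, 0.5, -3.0])  # well classified, marginal, outlier
print(hinge(margins))                 # [0.  0.5 4. ]
print(ramp(margins))                  # [0.  0.5 2. ]  <- outlier penalty capped
```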
However, from another point of view, practitioners must choose the cut-off level of a truncated loss function carefully, and introducing this cut-off parameter also adds computational or tuning cost. On the other hand, loss functions based on the squared loss and the absolute loss are particularly effective for Gaussian noise and Laplace noise, respectively, whereas in applications the noise type is usually unknown and tends to be more complex and uncontrollable; it is therefore meaningful to design a more flexible loss function to handle it. In addition, Barron (2017) claims that smooth non-convex losses are “plug and play”. Changing the model is expensive when practitioners have already spent time, manpower and material resources tuning its parameters only to find the result unsatisfactory. Thus, it is more effective to provide practitioners with a superset that contains different types of loss functions. Motivated by this, this work aims to find a simple and feasible way to establish a superset of loss functions controlled by a homotopy parameter, so that practitioners can “achieve once and for all”.
In this work, we present a two-parameter loss function in which one parameter is a scale parameter and the other enables practitioners to search for the optimal penalty function class, so that the performance of the final classifier can be improved. In topology, such penalty function classes are called homotopic, and the continuous deformation between them is called a homotopy. The main contributions of this work can be summarized as follows:
(1) A homotopy loss is proposed to continuously explore a wider family of loss functions for practitioners. The $\ell_1$-$\ell_2$ loss, logarithmic loss, Geman–Reynolds loss (Yu et al., 2012), Geman–McClure loss (Geman, 1987) and correntropy-based loss (Yang, Ren, Wang, & Dong, 2017) are all special cases of the homotopy loss. The Fisher consistency of the homotopy loss is proved, which ensures that classifiers induced by this loss yield the Bayes decision boundary asymptotically. Furthermore, a re-weighted least squares algorithm (sketched after this list) is used to obtain an approximate optimal solution, and the resulting algorithms converge globally.
(2) We analyze the robustness of the proposed homotopy loss from different perspectives: M-estimation (Koltchinskii, 1997) and adversarial perturbations. In particular, to further establish robustness to adversarial perturbations, we present a new evaluation criterion that measures robustness quantitatively and provide an upper bound that ensures the validity of this measure.
(3) The proposed robust LSSVM and ELM models are implemented on various datasets with different noise intensities. Compared with traditional methods, experimental results on real-world datasets show that the proposed models have good anti-interference ability against outliers.
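Since this preview does not reproduce the exact form of the homotopy loss, the following sketch illustrates the re-weighted least squares idea of contribution (1) using the correntropy-based (Welsch) loss, one of the stated special cases, on a ridge-regularized linear model; the function names and parameter values are assumptions for illustration only.

```python
import numpy as np

def welsch_weight(r, sigma=1.0):
    """IRLS weight induced by the correntropy-based (Welsch) loss:
    large residuals receive exponentially small weights."""
    return np.exp(-(r / sigma) ** 2)

def irls_fit(X, y, lam=1e-2, sigma=1.0, iters=20):
    """Re-weighted least squares for a linear model (illustrative sketch).
    Each iteration solves a weighted ridge problem; outliers are damped."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        r = y - X @ w                    # current residuals
        v = welsch_weight(r, sigma)      # per-sample weights in (0, 1]
        A = X.T @ (v[:, None] * X) + lam * np.eye(d)
        w = np.linalg.solve(A, X.T @ (v * y))
    return w
```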
The remainder of this paper is organized as follows. In Section 2, we give a brief overview of the LSSVM and ELM models. A new robust homotopy loss function, derived from the $\ell_1$-$\ell_2$ loss and satisfying Fisher consistency, is presented in Section 3 together with its properties. Section 4 gives the new robust classification frameworks with the homotopy loss, namely the robust LSSVM and robust ELM models, and the corresponding algorithms. Section 5 presents the analysis of robustness to adversarial perturbations and its upper bound. Numerical experiments and parameter analysis are given in Section 6 to illustrate the validity of the proposed models, and conclusions are given in Section 7.
Section snippets
Background
In this section, we briefly introduce the LSSVM (Feng et al., 2016) and ELM (Huang et al., 2006, Yang and Zhang, 2016) models, two popular machine learning methods.
For a binary classification problem in an $n$-dimensional Euclidean space, suppose that the training set $T=\{(x_i, y_i)\}_{i=1}^{m}$ consists of $m$ labeled samples, where $x_i \in \mathbb{R}^n$ is the input vector whose components represent the features and $y_i \in \{-1, +1\}$ is the label of sample $x_i$.
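For orientation, an ELM fixes a random hidden layer and learns only the output weights by regularized least squares; below is a minimal sketch for the binary setting above, with the hidden size, activation and regularization chosen arbitrarily for illustration:

```python
import numpy as np

def elm_train(X, y, n_hidden=100, lam=1e-2, seed=0):
    """Minimal ELM: random input weights stay fixed; the output weights
    beta are obtained in closed form by regularized least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.sign(np.tanh(X @ W + b) @ beta)    # predicted labels in {-1, +1}
```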
A robust homotopy loss function
In this section, we design a robust loss function based on the $\ell_1$-$\ell_2$ loss. It is well known that the $\ell_1$ loss is more robust than the $\ell_2$ loss: as the residual increases, the $\ell_1$ loss grows more slowly than the $\ell_2$ loss, so the influence of outliers on the classifier is weakened when the $\ell_1$ loss is used. We first give the definition of a proper loss function as defined in Shewhart and Wilks (2006):
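The growth-rate comparison can be stated in one line: for a residual $u$,
$$\ell_2(u) = u^2,\qquad \ell_1(u) = |u|,\qquad \ell_2'(u) = 2u \ \ (\text{unbounded}),\qquad |\ell_1'(u)| = 1 \ \ (\text{bounded}),$$
so under the $\ell_1$ loss the influence of a single sample on the fit is bounded no matter how far it lies from the decision boundary.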
Definition 3.1 A function $\ell(\cdot)$ is called proper as a loss function if it satisfies the following conditions: C1:
Robust classification framework with homotopy loss
In this section, we present two robust classification frameworks with homotopy loss.
Analysis of robustness to adversarial perturbations
In the previous sections, we illustrated the robustness of the homotopy loss from the viewpoint of M-estimation. Motivated by a recent contribution (Fawzi, Fawzi, & Frossard, 2018), this section mainly examines the robustness of classifiers to adversarial perturbations quantitatively. When a noise point is present, the classification hyperplane moves toward it, so such classifiers are sensitive to noise.
Define the minimum distance that is needed to switch one
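For a linear classifier $f(x) = w^{\top}x + b$, the minimum $\ell_2$ perturbation needed to switch the predicted label of $x$ has the closed form $|f(x)|/\|w\|$, the distance from $x$ to the hyperplane; the snippet below computes it. This is the standard geometric fact, not the paper's full criterion, which this preview does not reproduce.

```python
import numpy as np

def min_flip_distance(x, w, b):
    """Smallest l2 perturbation that changes the sign of f(x) = w.x + b:
    the Euclidean distance from x to the separating hyperplane."""
    return abs(w @ x + b) / np.linalg.norm(w)

w, b = np.array([3.0, 4.0]), -1.0
x = np.array([1.0, 1.0])
print(min_flip_distance(x, w, b))  # |3 + 4 - 1| / 5 = 1.2
```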
Numerical experiments
To illustrate the validity of the established model with the new homotopy loss function, numerical experiments are presented in this section.
In the first experiment, we compare the ELM with the proposed homotopy loss (HELM for short), the ELM with the Laplace-kernel-based homotopy loss (LKELM for short), the ELM with the general robust loss (Barron, 2017), the ELM with the C-loss (Singh et al., 2014) (CELM for short), the ELM with the lncosh loss (Karal, 2017) (LnELM for short) and the classical ELM (ELM for short) and
Conclusion
In this investigation, we have presented a two-parameter loss function that unifies a number of robust loss functions and generalizes many existing one-parameter robust losses: the $\ell_1$-$\ell_2$ loss, logarithmic loss, Geman–Reynolds loss, Geman–McClure loss and correntropy-based loss. The proposed homotopy loss is thus more convenient for practitioners to operate and much more flexible than classical loss functions. The Fisher consistency of the proposed homotopy loss is proved to
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 11471010 and 11271367). Moreover, the authors thank the referees and editor for their constructive comments to improve the paper.
References (51)
- Christy et al. (2015). Cluster based outlier detection algorithm for healthcare data. Procedia Computer Science.
- Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters.
- Gu et al. (2018). Recent advances in convolutional neural networks. Pattern Recognition.
- Huang et al. (2006). Extreme learning machine: Theory and applications. Neurocomputing.
- Karal (2017). Maximum likelihood optimal and robust Support Vector Regression with lncosh loss function. Neural Networks.
- Park & Liu (2011). Robust penalized logistic regression with truncated loss functions. The Canadian Journal of Statistics / Revue Canadienne de Statistique.
- Singh, Pokharel, & Principe (2014). The C-loss function for pattern classification. Pattern Recognition.
- Tran et al. (2018). Improving efficiency in convolutional neural networks with multilinear filters. Neural Networks.
- Wang & Zhong (2014). Robust non-convex least squares loss function for regression with outliers.
- Xu, Zhang, Jiang, Huang, & He (2015). Weighted quantile regression via support vector machine. Expert Systems with Applications.
- Yang & Zhang (2016). A sparse extreme learning machine framework by continuous optimization algorithms and its application in pattern recognition, Vol. 53.
- Zhang, Jiang, & Chai (2010). Penalized Bregman divergence for large-dimensional regression and classification. Biometrika.
- Zhang et al. (2019). Recent advances in convolutional neural network acceleration. Neurocomputing.
- Barron (2017). A more general robust loss function.
- Breiman (1996). Bagging predictors. Machine Learning.
- Supervised multiview feature selection exploring homogeneity and heterogeneity with $\ell_{1,2}$-norm and automatic view generation. IEEE Transactions on Geoscience and Remote Sensing.
- Du et al. (2018). Robust graph-based semisupervised learning for noisy labeled data via maximum correntropy criterion. IEEE Transactions on Cybernetics.
- Du et al. (2016). Novel clustering-based approach for local outlier detection. In: Computer Communications Workshops.
- Fawzi, Fawzi, & Frossard (2018). Analysis of classifiers' robustness to adversarial perturbations. Machine Learning.
- Feng, Yang, Huang, Mehrkanoon, & Suykens (2016). Robust support vector machines for classification with nonconvex and smooth losses. Neural Computation.
- Freund & Schapire (1996). Experiments with a new boosting algorithm. In: Thirteenth International Conference on Machine Learning.
- Galar, Fernandez, Barrenechea, Bustince, & Herrera (2012). A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C.
- Geman (1987). Statistical methods for tomographic image reconstruction.
- Gimpel & Smith (2012). Structured ramp loss minimization for machine translation. In: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.