Parallel orthogonal deep neural network

doi:10.1016/j.neunet.2021.03.002

Neural Networks

Volume 140, August 2021, Pages 167-183

https://doi.org/10.1016/j.neunet.2021.03.002 Get rights and content

Under a Creative Commons license

open access

Highlights

•
We propose a novel deep learning ensemble method with the enforced diversity mechanism.
•
We introduce a new approach for analyzing ensemble diversity, taking into account model confidence.
•
We perform experimental evaluation and show improved classification performance.
•
We demonstrate an efficient way of building deep learning ensembles with end-to-end architecture.

Abstract

Ensemble learning methods combine multiple models to improve performance by exploiting their diversity. The success of these approaches relies heavily on the dissimilarity of the base models forming the ensemble. This diversity can be achieved in many ways, with well-known examples including bagging and boosting.

It is the diversity of the models within an ensemble that allows the ensemble to correct the errors made by its members, and consequently leads to higher classification or regression performance. A mistake made by a base model can only be rectified if other members behave differently on that particular instance, and provide the aggregator with enough information to make an informed decision. On the contrary, lack of diversity not only lowers model performance, but also wastes computational resources. Nevertheless, in the current state of the art ensemble approaches, there is no guarantee on the level of diversity achieved, and no mechanism ensuring that each member will learn a different decision boundary from the others.

In this paper, we propose a parallel orthogonal deep learning architecture in which diversity is enforced by design, through imposing an orthogonality constraint. Multiple deep neural networks are created, parallel to each other. At each parallel layer, the outputs of different base models are subject to Gram–Schmidt orthogonalization. We demonstrate that this approach leads to a high level of diversity from two perspectives. First, the models make different errors on different parts of feature space, and second, they exhibit different levels of uncertainty in their decisions. Experimental results confirm the benefits of the proposed method, compared to standard deep learning models and well-known ensemble methods, in terms of diversity and, as a result, classification performance.

Keywords

Ensemble learning

Deep learning

Orthogonalization

Diversity

Uncertainty