Learning Two-Layer ReLU Networks Is Nearly as Easy as Learning Linear Classifiers on Separable Data

Abstract:

Neural networks with non-linear rectified linear unit (ReLU) activation functions have demonstrated remarkable performance in many fields. It has been observed that a sufficiently wide and/or deep ReLU network can accurately fit the training data, with a small generalization error on the testing data. Nevertheless, existing analytical results on provably training ReLU networks are mostly limited to over-parameterized cases, or they require assumptions on the data distribution. In this paper, training a two-layer ReLU network for binary classification of linearly separable data is revisited. Adopting the hinge loss as the classification criterion yields a non-convex objective function with infinitely many local minima and saddle points. Instead, a modified loss is proposed which enables (stochastic) gradient descent to attain a globally optimal solution. Enticingly, the solution found is globally optimal for the hinge loss too. In addition, an upper bound on the number of iterations required to find a global minimum is derived. To ensure generalization performance, a convex max-margin formulation for two-layer ReLU network classifiers is discussed. Connections between the sought max-margin ReLU network and the max-margin support vector machine are drawn. Finally, an algorithm-dependent theoretical quantification of the generalization performance is developed using classical compression bounds. Numerical tests using synthetic and real data validate the analytical results.

Published in: IEEE Transactions on Signal Processing (Volume: 69)

Page(s): 4416 - 4427

Date of Publication: 07 July 2021

DOI: 10.1109/TSP.2021.3094911
