Competitive Cross-Entropy Loss: A Study on Training Single-Layer Neural Networks for Solving Nonlinearly Separable Classification Problems

Abstract

After Minsky and Papert (Perceptrons, MIT Press, Cambridge, 1969) showed that perceptrons cannot solve nonlinearly separable problems, this result was for several decades misinterpreted as an inherent weakness common to all single-layer neural networks. The introduction of the backpropagation algorithm reinforced this misinterpretation, since its success on nonlinearly separable problems came through the training of multilayer neural networks. Recently, Conaway and Kurtz (Neural Comput 29(3):861–866, 2017) proposed a single-layer network in which the number of output units for each class equals the number of input units, and showed that it could solve some nonlinearly separable problems. They used the mean squared error (MSE) between the input units and the output units of the actual class as the objective function for training the network. Their method could solve the XOR and M&S’81 problems, but it did no better than random guessing on the 3-bit parity problem. In this paper, we use a soft competitive approach to generalize the cross-entropy (CE) loss, a widely accepted criterion for multiclass classification, to networks that have several output units per class, calling the resulting measure the competitive cross-entropy (CCE) loss. In contrast to Conaway and Kurtz (2017), in our method the number of output units for each class can be chosen arbitrarily. We show that the proposed method successfully solves the 3-bit parity problem, in addition to the XOR and M&S’81 problems. Furthermore, we perform experiments on several multiclass classification datasets, comparing a single-layer network trained with the proposed CCE loss against LVQ, linear SVM, a single-layer network trained with the CE loss, and the method of Conaway and Kurtz (2017). The results show that the CCE loss performs remarkably better than existing algorithms for training single-layer neural networks.
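
Only the abstract is available in this preview, so the paper's exact formulation is not shown here. The sketch below illustrates one plausible reading of a competitive cross-entropy loss, assuming the soft competition is realized by summing softmax probabilities over each class's output units; all names and the small stabilizing constant are illustrative, not the author's code.

    # Minimal NumPy sketch of a competitive cross-entropy (CCE) loss under the
    # assumptions stated above: C classes, K output units per class, a softmax
    # over all C*K units, class probability = sum of its units' probabilities.
    import numpy as np

    def cce_loss(logits, y, num_classes, units_per_class):
        """logits: raw outputs of the single layer, shape (C*K,); y: true class index."""
        z = logits - logits.max()                    # shift for numerical stability
        p = np.exp(z) / np.exp(z).sum()              # softmax over all C*K units
        p = p.reshape(num_classes, units_per_class)  # group units by class
        return -np.log(p[y].sum() + 1e-12)           # negative log class probability

    # Example: 10 classes with 6 units per class, i.e. 60 output units (cf. note 2).
    rng = np.random.default_rng(0)
    print(cce_loss(rng.normal(size=60), y=3, num_classes=10, units_per_class=6))

Under this reading, setting units_per_class to 1 recovers the ordinary CE loss, which is consistent with the abstract's claims that CCE generalizes CE and that the number of output units per class can be chosen arbitrarily.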

Notes

  1. These datasets can be downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets.

  2. Note that while in this experiment the proposed method has 60 output neurons, the number of output neurons for the method of Conaway and Kurtz [4] is 7840.
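
For intuition, these counts are consistent with a 784-input, 10-class setup (e.g. MNIST); the arithmetic below is our reconstruction, as the experimental details are in the full text.

    # Hypothetical reconstruction of the output-unit counts in note 2,
    # assuming 784 input units and 10 classes (the dataset is not named above).
    inputs, classes = 784, 10
    conaway_kurtz_units = inputs * classes  # one output unit per input per class -> 7840
    cce_units = 6 * classes                 # CCE with, e.g., 6 freely chosen units per class -> 60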

References

  1. Bagarello F, Cinà M, Gargano F (2017) Projector operators in clustering. Math Methods Appl Sci 40(1):49–59

  2. Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford

  3. Chan TH, Jia K, Gao S, Lu J, Zeng Z, Ma Y (2015) PCANet: a simple deep learning baseline for image classification? IEEE Trans Image Process 24(12):5017–5032

  4. Conaway N, Kurtz KJ (2017) Solving nonlinearly separable classifications in a single-layer neural network. Neural Comput 29(3):861–866

  5. Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7:551–585

  6. Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874

  7. Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158

  8. Kohonen T (1995) Learning vector quantization. In: Self-organizing maps. Springer, pp 175–189

  9. Kohonen T, Hynninen J, Kangas J, Laaksonen J, Torkkola K (1996) LVQ_PAK: the learning vector quantization program package. Technical report, Laboratory of Computer and Information Science, Rakentajanaukio 2 C, 1991–1992

  10. Martín-del Brío B (1996) A dot product neuron for hardware implementation of competitive networks. IEEE Trans Neural Netw 7(2):529–532

  11. Medin DL, Schwanenflugel PJ (1981) Linear separability in classification learning. J Exp Psychol Hum Learn Mem 7(5):355

  12. Mensink T, Verbeek J, Perronnin F, Csurka G (2013) Distance-based image classification: generalizing to new classes at near-zero cost. IEEE Trans Pattern Anal Mach Intell 35(11):2624–2637

  13. Minsky M, Papert S (1969) Perceptrons. MIT Press, Cambridge

  14. Rosasco L, De Vito E, Caponnetto A, Piana M, Verri A (2004) Are loss functions all the same? Neural Comput 16(5):1063–1076

  15. Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386

  16. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–538

  17. Siomau M (2014) A quantum model for autonomous learning automata. Quantum Inf Process 13(5):1211–1221

  18. Urcid G, Ritter GX, Iancu L (2004) Single layer morphological perceptron solution to the n-bit parity problem. In: Iberoamerican congress on pattern recognition. Springer, pp 171–178

  19. Zhu G, Lin L, Jiang Y (2017) Resolve XOR problem in a single layer neural network. In: IWACIII 2017, 5th international workshop on advanced computational intelligence and intelligent informatics. Fuji Technology Press

Author information

Correspondence to Kamaledin Ghiasi-Shirazi.


Cite this article

Ghiasi-Shirazi, K. Competitive Cross-Entropy Loss: A Study on Training Single-Layer Neural Networks for Solving Nonlinearly Separable Classification Problems. Neural Process Lett 50, 1115–1122 (2019). https://doi.org/10.1007/s11063-018-9906-5
