Growing Self-Organizing Maps for Nonlinear Time-Varying Function Approximation

Abstract

Function approximation may be described as the task of modeling the input-output relation and thereby yielding an estimate of the true output value. In many domains, an ideal learning algorithm needs to approximate nonlinear time-varying functions from a high-dimensional input space and avoid problems caused by irrelevant or redundant input data. Therefore, the method has to meet three requirements, namely, it must: allow incremental learning to deal with changing functions and changing input distributions; keep the computational cost low; and achieve accurate estimates. In this paper, we explore different approaches to function approximation based on the Local Adaptive Receptive Fields Self-Organizing Map (LARFSOM). Each local model is built from the output associated with the winning node and the difference vector between the input vector and the weight vector. These models are combined by a weighted sum to yield the final approximation. The topology is adapted in a self-organizing way, and the weight vectors are adjusted by a modified unsupervised learning algorithm for supervised problems. Experiments were carried out on synthetic and real-world datasets. Experimental results indicate that the proposed approaches perform competitively against Support Vector Regression (SVR) and can improve approximation accuracy and reduce computational cost compared with locally weighted interpolation (LWI), a state-of-the-art interpolating algorithm for self-organizing maps.


Notes

  1. In this paper, incremental learning means that the weights of a learning system are updated without degrading previous knowledge, and three issues are considered: (1) a limited-memory scenario in which a new sample is discarded after it is learned and cannot be reused, since storing and reusing previous training data can be very inefficient in a real-time environment [34]; (2) the input and output distributions are unknown; (3) these distributions can be time-varying [39].

  2. The calculations of the required derivatives are described in “Appendix A”.

  3. The calculations of the required derivatives are described in “Appendix B”.

  4. The calculations of the required derivatives are described in “Appendix C”.

  5. The same noisy function was employed in previous works to evaluate LWIGNG [17] and LWPR [45].

  6. The random rotation matrix R is computed with the command qr(randn(10)), which is available in Matlab, Octave, and Julia (a NumPy equivalent is sketched after these notes).

  7. The datasets can be downloaded from http://www.dcc.fc.up.pt/~ltorgo/Regression/DataSets.html. This site also gives more details about each dataset.

  8. LWI takes 150 times and 5 times the time spent by LWxD to train and to execute, respectively.
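
For readers working outside Matlab, Octave, or Julia, the sketch below reproduces the construction of Note 6 in Python with NumPy. It is an illustration only, assuming that the orthogonal factor Q of the QR decomposition is the intended rotation matrix.

```python
import numpy as np

# Sketch (not the authors' code): take the orthogonal factor Q of the QR
# decomposition of a 10x10 matrix with i.i.d. standard normal entries,
# mirroring the qr(randn(10)) construction mentioned in Note 6.
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.standard_normal((10, 10)))

# Q is orthogonal up to numerical precision, so it acts as a random rotation.
assert np.allclose(Q @ Q.T, np.eye(10))
```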

References

  1. Al-Musaylh MS, Deo RC, Adamowski JF, Li Y (2018) Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia. Adv Eng Inform 35:1–16. https://doi.org/10.1016/j.aei.2017.11.002. http://www.sciencedirect.com/science/article/pii/S1474034617301477

  2. Araújo AFR, Costa DC (2009) Local adaptive receptive field self-organizing map for image color segmentation. Image Vis Comput 27(9):1229–1239

  3. Araújo AFR, Rêgo R (2013) Self-organizing maps with a time-varying structure. ACM Comput Surv (CSUR) 46(1):7:1–7:38

  4. Aupetit M (2006) Learning topology with the generative Gaussian graph and the EM algorithm. In: Advances in neural information processing systems, pp 83–90

  5. Aupetit M, Couturier P, Massotte P (2000) Function approximation with continuous self-organizing maps using neighboring influence interpolation. In: Proc. neural computation (NC’2000), pp 23–26

  6. Aupetit M, Couturier P, Massotte P (2001) Induced Voronoï kernels for principal manifolds approximation. In: Advances in self-organising maps. Springer, pp 73–80

  7. Bache K, Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml

  8. Barreto GA (2007) Time series prediction with the self-organizing map: a review. In: Hammer B, Hitzler P (eds) Perspectives of neural-symbolic integration, vol 77. Studies in Computational Intelligence. Springer, Berlin, pp 135–158

  9. Barreto GA, Araújo AFR (2004) Identification and control of dynamical systems using the self-organizing map. IEEE Trans Neural Netw 15(5):1244–1259

  10. Basak D, Pal S, Patranabis DC (2007) Support vector regression. Neural Inf Process Lett Rev 11(10):203–224

  11. Campos MM, Carpenter GA (2000) Building adaptive basis functions with a continuous self-organizing map. Neural Process Lett 11(1):59–78. https://doi.org/10.1023/A:1009622004201

  12. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2: 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

  13. Chávez E, Navarro G, Baeza-Yates R, Marroquín JL (2001) Searching in metric spaces. ACM Comput Surv (CSUR) 33(3):273–321. https://doi.org/10.1145/502807.502808

  14. Cho J, Principe JC, Erdogmus D, Motter MA (2006) Modeling and inverse controller design for an unmanned aerial vehicle based on the self-organizing map. IEEE Trans Neural Netw 17(2):445–460

  15. Cuadros-Vargas E, Romero RF, Obermayer K (2003) Speeding up algorithms of SOM family for large and high dimensional databases. In: Proc. workshop on self organizing maps (WSOM’03), pp 167–172

  16. Díaz-Vico D, Torres-Barrán A, Omari A, Dorronsoro JR (2017) Deep neural networks for wind and solar energy prediction. Neural Process Lett 46(3):829–844. https://doi.org/10.1007/s11063-017-9613-7

  17. Flentge F (2006) Locally weighted interpolating growing neural gas. IEEE Trans Neural Netw 17(6):1382–1393

  18. Fritzke B (1995) Incremental learning of local linear mappings. In: Proc. International conference on artificial neural networks (ICANN), pp 217–222

  19. Gaillard P, Aupetit M, Govaert G (2008) Learning topology of a labeled data set with the supervised generative Gaussian graph. Neurocomputing 71(7):1283–1299. https://doi.org/10.1016/j.neucom.2007.12.028. http://www.sciencedirect.com/science/article/pii/S0925231208000635

  20. Göppert J, Rosenstiel W (1995) Interpolation in SOM: improved generalization by iterative methods. In: Soulié FF, Gallinari P (eds) Proc. international conference on artificial neural networks (ICANN), vol II, pp 69–74. EC2, Nanterre, France

  21. Göppert J, Rosentiel W (1997) The continuous interpolating self-organizing map. Neural Process Lett 5(3):185–192. https://doi.org/10.1023/A:1009694727439

  22. Hartono P, Hollensen P, Trappenberg T (2015) Learning-regulated context relevant topographical map. IEEE Trans Neural Netw Learn Syst 26(10):2323–2335. https://doi.org/10.1109/TNNLS.2014.2379275

  23. Hecht T, Lefort M, Gepperth A (2015) Using self-organizing maps for regression: the importance of the output function. In: Proc. European symposium on artificial neural networks (ESANN). Bruges, Belgium. https://hal.archives-ouvertes.fr/hal-01251011

  24. Heskes T (1999) Energy functions for self-organizing maps. In: Oja E, Kaski S (eds) Kohonen maps, pp 303–315. Elsevier Science B.V., Amsterdam. https://doi.org/10.1016/B978-044450270-4/50024-3

  25. Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D, Hadsell R (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526. https://doi.org/10.1073/pnas.1611835114. https://www.pnas.org/content/114/13/3521

  26. Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69

  27. Kohonen T (2013) Essentials of the self-organizing map. Neural Netw 37:52–65. http://www.sciencedirect.com/science/article/pii/S0893608012002596

  28. Lawrence S, Tsoi AC, Back AD (1996) Function approximation with neural networks and local methods: bias, variance and smoothness. In: Proc. Australian conference on neural networks (ACNN), vol 1621

  29. Li S, Fang H, Liu X (2018) Parameter optimization of support vector regression based on sine cosine algorithm. Expert Syst Appl 91:63–77. https://doi.org/10.1016/j.eswa.2017.08.038. http://www.sciencedirect.com/science/article/pii/S0957417417305833

  30. Li Z, Hoiem D (2018) Learning without forgetting. IEEE Trans Pattern Anal Mach Intell 40(12):2935–2947. https://doi.org/10.1109/TPAMI.2017.2773081

  31. Lomonaco V, Maltoni D (2017) Core50: a new dataset and benchmark for continuous object recognition. In: Levine S, Vanhoucke V, Goldberg K (eds) Proc. annual conference on robot learning, Proceedings of machine learning research, vol 78, pp 17–26. PMLR. http://proceedings.mlr.press/v78/lomonaco17a.html

  32. Ludwig L, Kessler W, Göppert J, Rosenstiel W (1995) SOM with topological interpolation for the prediction of interference spectra. In: Proc. engineering applications of neural networks (EANN), pp 379–389. Helsinki, Finland

  33. Maltoni D, Lomonaco V (2019) Continuous learning in single-incremental-task scenarios. Neural Netw 116:56–73. https://doi.org/10.1016/j.neunet.2019.03.010. http://www.sciencedirect.com/science/article/pii/S0893608019300838

  34. Parisi GI, Kemker R, Part JL, Kanan C, Wermter S (2019) Continual lifelong learning with neural networks: a review. Neural Netw 113:54–71. https://doi.org/10.1016/j.neunet.2019.01.012. http://www.sciencedirect.com/science/article/pii/S0893608019300231

  35. Principe JC, Wang L, Motter MA (1998) Local dynamic modeling with self-organizing maps and applications to nonlinear system identification and control. Proc IEEE 86(11):2240–2258

  36. Rasmussen CE, Neal RM, Hinton G, van Camp D, Revow M, Ghahramani Z, Kustra R, Tibshirani R (1996) Delve data for evaluating learning in valid experiments. http://www.cs.toronto.edu/~delve

  37. Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks. arXiv:1606.04671

  38. Salcedo-Sanz S, Rojo-Álvarez JL, Martínez-Ramón M, Camps-Valls G (2014) Support vector machines in engineering: an overview. Wiley Interdiscip Rev Data Min Knowl Discov 4(3):234–267. https://doi.org/10.1002/widm.1125

  39. Schaal S, Atkeson CG (1998) Constructive incremental learning from only local information. Neural Comput 10(8):2047–2084. https://doi.org/10.1162/089976698300016963

  40. Schölkopf B, Smola AJ, Williamson RC, Bartlett PL (2000) New support vector algorithms. Neural Comput 12(5):1207–1245. https://doi.org/10.1162/089976600300015565

  41. Shepard D (1968) A two-dimensional interpolation function for irregularly-spaced data. In: Proc. ACM national conference. ACM ’68, pp 517–524. ACM, New York, NY, USA. https://doi.org/10.1145/800186.810616

  42. Sibson R (1981) A brief description of natural neighbour interpolation. Interpreting multivariate data, pp 21–36

  43. Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222

  44. Thampi G, Principe JC, Cho J, Motter M (2002) Adaptive inverse control using SOM based multiple models. In: Proc. Portuguese conference automatic control (CONTROLO), pp 278–282

  45. Vijayakumar S, D’Souza A, Schaal S (2005) Incremental online learning in high dimensions. Neural Comput 17(12):2602–2634. https://doi.org/10.1162/089976605774320557

  46. Walter J, Ritter H (1996) Rapid learning with parametrized self-organizing maps. Neurocomputing 12(2):131–153

  47. Wang J, Li Y (2018) Short-term wind speed prediction using signal preprocessing technique and evolutionary support vector regression. Neural Process Lett 48(2):1043–1061. https://doi.org/10.1007/s11063-017-9766-4

  48. Zenke F, Poole B, Ganguli S (2017) Continual learning through synaptic intelligence. In: Proc. International Conference on Machine Learning - Volume 70, ICML’17, pp. 3987–3995. JMLR.org. http://dl.acm.org/citation.cfm?id=3305890.3306093

Acknowledgements

The authors would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for financial support.

Corresponding author

Correspondence to Paulo H. M. Ferreira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Derivatives for Learning Rule of Difference Technique

For the vector \(\varvec{k}_{s}\) of the winner \(n_{s}\):

$$\begin{aligned} \frac{\partial \widetilde{M}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{s}} = \frac{\partial }{\partial \varvec{k}_{s}}\left( w^{out}_{s} + \varvec{k}_{s} \cdot \varvec{d}_{s} \right) = \varvec{d}_{s} \end{aligned}$$
(22)
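
To illustrate how this derivative is used in practice, the following sketch takes one stochastic gradient step on a squared-error loss for the difference-technique local model \(\widetilde{M}_{s}(\varvec{\xi }^{in}) = w^{out}_{s} + \varvec{k}_{s} \cdot \varvec{d}_{s}\). The squared-error loss and the learning rate eta are assumptions made for illustration, not necessarily the exact update used in the paper.

```python
import numpy as np

def difference_estimate(xi, w_in_s, w_out_s, k_s):
    """Local estimate of the winner n_s: M_s(xi) = w_out_s + k_s . d_s,
    where d_s = xi - w_in_s is the difference between input and winner weight."""
    d_s = xi - w_in_s
    return w_out_s + k_s @ d_s, d_s

def sgd_step_k_s(xi, y, w_in_s, w_out_s, k_s, eta=0.01):
    """One gradient step on 0.5 * (y - M_s(xi))**2; by Eq. (22), dM_s/dk_s = d_s."""
    m_s, d_s = difference_estimate(xi, w_in_s, w_out_s, k_s)
    return k_s + eta * (y - m_s) * d_s
```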

Appendix B: Derivatives for Learning Rule of Extended Difference Technique

For the vectors \(\varvec{k}_{s,j}\) of the winner \(n_{s}\):

$$\begin{aligned} \frac{\partial \widetilde{M}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{s,j}} = \frac{\partial }{\partial \varvec{k}_{s,j}}\left( w^{out}_{s} + \sum _{k = 0}^{ |N_{s}| } \varvec{k}_{s,k} \cdot \varvec{d}_{s,k} \right) = \sum _{k = 0}^{|N_{s}|} \frac{\partial \varvec{k}_{s,k}}{\partial \varvec{k}_{s,j}} \cdot \varvec{d}_{s,k} \end{aligned}$$
(23)

since

$$\begin{aligned} \frac{\partial \varvec{k}_{s,k}}{\partial \varvec{k}_{s,j}} = \left\{ \begin{array}{ll} \varvec{I} &{}\quad k = j \\ \varvec{0} &{}\quad k \ne j \end{array}\right. \end{aligned}$$
(24)

then

$$\begin{aligned} \frac{\partial \widetilde{M}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{s,j}} = \varvec{d}_{s,j} \end{aligned}$$
(25)
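
The same kind of update can be sketched for the extended difference technique, where the estimate of the winner also uses the difference vectors to its topological neighbours. The array layout (row 0 for the winner, following rows for its neighbours) and the squared-error update are illustrative assumptions.

```python
import numpy as np

def extended_difference_estimate(xi, w_out_s, W_in, K_s):
    """M_s(xi) = w_out_s + sum_k K_s[k] . d_{s,k}, where row 0 of W_in holds the
    winner's weight vector and rows 1..|N_s| those of its neighbours."""
    D = xi - W_in  # one difference vector per node, shape (|N_s|+1, dim)
    return w_out_s + float(np.sum(K_s * D)), D

def sgd_step_K_s(xi, y, w_out_s, W_in, K_s, eta=0.01):
    """One gradient step on the squared error; by Eqs. (23)-(25), dM_s/dK_s[j] = d_{s,j}."""
    m_s, D = extended_difference_estimate(xi, w_out_s, W_in, K_s)
    return K_s + eta * (y - m_s) * D
```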

Appendix C: Derivatives for Learning Rule of Locally Weighted Extended Difference Technique

To calculate the derivative \(\frac{\partial \widetilde{F}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}}\), we need to determine \(\frac{\partial \widetilde{M}_{(s,k)}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}}\):

$$\begin{aligned} \frac{\partial \widetilde{F}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}} = \frac{\partial }{\partial \varvec{k}_{i}}\left( \sum _{k = 0}^{|N_{s}|} m_{s,k}(\varvec{\xi }^{in})\widetilde{M}_{(s,k)}(\varvec{\xi }^{in}) \right) = \sum _{k = 0}^{|N_{s}|} m_{s,k}(\varvec{\xi }^{in}) \frac{\partial \widetilde{M}_{(s,k)}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}} \end{aligned}$$
(26)

We can distinguish two cases (a sketch combining them follows the list):

  1. Case: \(n_{i} = n_{(s,k)}\)

    $$\begin{aligned} \frac{\partial \widetilde{M}_{(s,k)}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}} = \frac{\partial }{\partial \varvec{k}_{(s,k),j}} \left( w^{out}_{(s,k)} + \sum _{l = 0}^{ |N_{(s,k)}| } \varvec{k}_{(s,k),l} \cdot \varvec{d}_{(s,k),l} \right) = \sum _{l = 0}^{ |N_{(s,k)}| } \frac{\partial \varvec{k}_{(s,k),l}}{\partial \varvec{k}_{(s,k),j}} \cdot \varvec{d}_{(s,k),l} \end{aligned}$$
    (27)

    since

    $$\begin{aligned} \frac{\partial \varvec{k}_{(s,k),l}}{\partial \varvec{k}_{(s,k),j}} = \left\{ \begin{array}{ll} \varvec{I} &{}\quad l = j \\ \varvec{0} &{}\quad l \ne j \end{array}\right. \end{aligned}$$
    (28)

    then

    $$\begin{aligned} \frac{\partial \widetilde{M}_{(s,k)}(\varvec{\xi }^{in})}{\partial \varvec{k}_{(s,k),j}} = \varvec{d}_{(s,k),j} \end{aligned}$$
    (29)
  2. Case: \(n_{i} \ne n_{(s,k)}\)

    $$\begin{aligned} \frac{\partial \widetilde{F}_{s}(\varvec{\xi }^{in})}{\partial \varvec{k}_{i}} = \varvec{0} \end{aligned}$$
    (30)

    since

    $$\begin{aligned} \frac{\partial \varvec{k}_{(s,k),l}}{\partial \varvec{k}_{i}} = \varvec{0} \end{aligned}$$
    (31)
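
Combining the two cases, only the coefficient vectors of nodes in the winner's neighbourhood receive a non-zero gradient, each scaled by its weighting factor \(m_{s,k}(\varvec{\xi }^{in})\). The sketch below illustrates this; the squared-error loss and the array layout are illustrative assumptions.

```python
import numpy as np

def lwxd_estimate(weights_m, local_estimates):
    """F_s(xi) = sum_k m_{s,k}(xi) * M_{(s,k)}(xi)  (Eq. 26)."""
    return float(np.dot(weights_m, local_estimates))

def lwxd_update_directions(y, weights_m, local_estimates, local_diffs):
    """Update directions (negative gradient of 0.5 * (y - F_s)**2) for the
    coefficient vectors of each node n_(s,k) in the winner's neighbourhood.
    By Eqs. (26)-(29), dF_s/dk_{(s,k),j} = m_{s,k} * d_{(s,k),j}; coefficients of
    nodes outside the neighbourhood get zero gradient (Eqs. 30-31)."""
    err = y - lwxd_estimate(weights_m, local_estimates)
    return [err * m_k * D_k for m_k, D_k in zip(weights_m, local_diffs)]
```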

About this article

Cite this article

Ferreira, P.H.M., Araújo, A.F.R. Growing Self-Organizing Maps for Nonlinear Time-Varying Function Approximation. Neural Process Lett 51, 1689–1714 (2020). https://doi.org/10.1007/s11063-019-10168-9
