Abstract
In this paper, we propose a new information-theoretic method called “double enhancement learning,” which unifies two types of enhancement: self-enhancement and information enhancement. Self-enhancement learning was developed to create targets spontaneously within a network, and its performance has proven comparable with that of conventional competitive learning and self-organizing maps. To improve this performance, we incorporate information on input variables into the self-enhancement framework. That information is computed by information enhancement, in which a specific input variable is used to enhance the competitive unit outputs, and it is then used to train the network with self-enhancement learning. We applied the method to three problems: an artificial data set, a student survey, and a voting-attitude problem. In all three problems, double enhancement learning significantly decreased quantization errors. Topographic errors were relatively higher, but the smallest topographic errors were also obtained with double enhancement learning. In addition, the U-matrices for all three problems showed explicit boundaries reflecting the importance of the input variables.
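To make the two steps described above concrete, the following is a minimal sketch of how information enhancement might score an input variable. It is written under stated assumptions rather than taken from the paper's equations: the Gaussian form of the competitive unit outputs, the enhancement gain `eps`, the uniform distribution over samples, and all function names are illustrative.

```python
import numpy as np

def enhanced_outputs(X, W, k, sigma=1.0, eps=5.0):
    """p(j|s; k): competitive unit outputs when input variable k is enhanced.

    X: (n_samples, n_vars) inputs; W: (n_units, n_vars) unit weights.
    The squared distance along variable k is amplified by `eps`, so the
    units respond more selectively to that variable (an assumption, not
    the paper's exact formulation).
    """
    gain = np.ones(X.shape[1])
    gain[k] = eps
    d = ((X[:, None, :] - W[None, :, :]) ** 2 * gain).sum(axis=2)
    p = np.exp(-d / (2.0 * sigma ** 2))
    return p / p.sum(axis=1, keepdims=True)

def variable_importance(X, W, k, **kw):
    """Mutual information between samples and units when variable k is
    enhanced; a larger value marks a more important variable."""
    p_js = enhanced_outputs(X, W, k, **kw)   # p(j|s; k)
    p_j = p_js.mean(axis=0)                  # p(j; k) under uniform p(s)
    return (p_js * np.log((p_js + 1e-12) / (p_j + 1e-12))).sum(axis=1).mean()
```

Quantization and topographic errors, used to evaluate the resulting maps, are standard self-organizing map quality measures. A common definition is sketched below (mean distance to the best-matching unit, and the fraction of samples whose two best-matching units are not adjacent on the map lattice); the adjacency test and the toy usage at the end are again illustrative.

```python
def quantization_error(X, W):
    """Mean Euclidean distance from each sample to its best-matching unit."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return d.min(axis=1).mean()

def topographic_error(X, W, grid):
    """Fraction of samples whose two best-matching units are not lattice
    neighbours; `grid` holds each unit's (row, col) map coordinates."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    best2 = np.argsort(d, axis=1)[:, :2]
    hop = np.abs(grid[best2[:, 0]] - grid[best2[:, 1]]).max(axis=1)
    return (hop > 1).mean()

# Toy usage on random data and a hypothetical 3x3 map.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
W = rng.normal(size=(9, 4))
grid = np.array([(r, c) for r in range(3) for c in range(3)])
scores = [variable_importance(X, W, k) for k in range(X.shape[1])]
print(scores, quantization_error(X, W), topographic_error(X, W, grid))
```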
Cite this article
Kamimura, R. Double enhancement learning for explicit internal representations: unifying self-enhancement and information enhancement to incorporate information on input variables. Appl Intell 36, 834–856 (2012). https://doi.org/10.1007/s10489-011-0300-5