Abstract
In this study, we propose a new information-theoretic method in which the comprehensibility of a network is progressively improved over the course of learning. Comprehensibility is defined using the mutual information between competitive units and input patterns; when it is maximized, the most simplified network configurations are expected to emerge. Comprehensibility is first defined for the competitive units, and the comprehensibility of the input units is then measured by examining the comprehensibility of the competitive units while paying special attention to individual input units. The parameters that control the values of comprehensibility are explicitly determined so as to maximize the comprehensibility of both the competitive units and the input units. For ease of reproducibility, we applied the method to two problems from the well-known machine learning database, namely, the Senate problem and the cancer problem. In both experiments, every type of comprehensibility was improved, and fidelity measures such as the quantization error also improved.
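The comprehensibility measure above is built on the mutual information between competitive units and input patterns. As a rough illustration only (the function name, the Gaussian activation, and the `beta` parameter are our assumptions, not the paper's exact formulation), mutual information over uniformly weighted input patterns can be sketched as:

```python
import numpy as np

def competitive_mutual_info(distances, beta=1.0):
    """Mutual information between competitive units and input patterns.

    `distances` is an (S, M) array of distances between S input patterns
    and M competitive units. Firing probabilities p(j|s) are obtained by
    normalizing a Gaussian activation whose width is controlled by `beta`
    (an illustrative assumption, not the paper's exact parameterization).
    """
    # p(j|s): normalized unit activations for each input pattern
    act = np.exp(-beta * distances ** 2)
    p_j_given_s = act / act.sum(axis=1, keepdims=True)
    # p(j): average firing probability of unit j over all input patterns
    p_j = p_j_given_s.mean(axis=0)
    # I = (1/S) * sum_s sum_j p(j|s) * log(p(j|s) / p(j))
    return np.mean(np.sum(p_j_given_s * np.log(p_j_given_s / p_j), axis=1))
```

When every unit responds equally to every pattern, the measure is zero; when each pattern fires a distinct unit almost exclusively, it approaches its upper bound log M, which is why maximizing it yields sharply differentiated (and hence simpler) unit responses.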












References
Torkkola K (2003) Feature extraction by non-parametric mutual information maximization. J Mach Learn Res 3:1415–1438
Alexander JA, Mozer MC (1999) Template-based procedures for neural network interpretation. Neural Netw 12:479–498
Andrews R, Diederich J, Tickle AB (1995) Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowl Based Syst 8(6):373–389
Kahramanli H, Allahverdi N (2009) Rule extraction from trained adaptive networks using artificial immune systems. Expert Syst Appl 36:1513–1522
Towell GG, Shavlik JW (1993) Extracting refined rules from knowledge-based neural networks. Mach Learn 13:71–101
Tsukimoto H (2000) Extracting rules from trained neural networks. IEEE Trans Neural Netw 11(2):377–389
Garcez ASd, Broda K, Gabbay D (2001) Symbolic knowledge extraction from trained neural networks: a sound approach. Artif Intell 125:155–207
Barakat N, Diederich J (2005) Eclectic rule-extraction from support vector machines. Int J Comput Intell 2(1):59–62
Kohonen T (1990) The self-organizing map. Proc IEEE 78(9):1464–1480
Kohonen T (1995) Self-organizing maps. Springer, Berlin
Kamimura R, Kamimura T, Shultz TR (2001) Information theoretic competitive learning and linguistic rule acquisition. Trans Jpn Soc Artif Intell 16(2):287–298
Kamimura R (2003) Information theoretic competitive learning in self-adaptive multi-layered networks. Conn Sci 13(4):323–347
Kamimura R (2003) Information-theoretic competitive learning with inverse Euclidean distance output units. Neural Process Lett 18:163–184
Kamimura R, Nakanishi S (1995) Hidden information maximization for feature detection and rule discovery. Network 6:577–622
Kamimura R (1998) Minimizing α-information for generalization and interpretation. Algorithmica 22:173–197
Kamimura R, Kamimura T, Uchida O (2001) Flexible feature discovery and structural information. Conn Sci 13(4):323–347
Kamimura R (2003) Progressive feature extraction by greedy network-growing algorithm. Complex Syst 14(2):127–153
Kamimura R (2007) Information loss to extract distinctive features in competitive learning. In: Proceedings of IEEE conference on systems, man, and cybernetics, pp 1217–1222
Kamimura R (2008) Feature detection and information loss in competitive learning. In: Proceedings of the international conference on soft computing and intelligent systems and the international symposium on advanced intelligent systems (SCIS and ISIS2008), pp 1144–1148
Kamimura R (2008) Conditional information and information loss for flexible feature extraction. In: Proceedings of the international joint conference on neural networks (IJCNN2008), pp 2047–2083
Kamimura R (2008) Feature discovery by enhancement and relaxation of competitive units. In: Intelligent data engineering and automated learning-IDEAL2008 (LNCS), vol LNCS5326. Springer, Berlin, pp 148–155
Kamimura R (2009) Enhancing and relaxing competitive units for feature discovery. Neural Process Lett 30(1):37–57
Kohonen T (1988) Self-organization and associative memory. Springer, New York
Vesanto J (1999) SOM-based data visualization methods. Intell Data Anal 3:111–126
Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford
Sammon JW (1969) A nonlinear mapping for data structure analysis. IEEE Trans Comput C-18(5):401–409
Ultsch A (2003) U*-matrix: a tool to visualize clusters in high dimensional data. Technical report 36, Department of Computer Science, University of Marburg
Ultsch A (2003) Maps for the visualization of high-dimensional data spaces. In: Proceedings of the 4th workshop on self-organizing maps, pp 225–230
Himberg J (1998) Enhancing the SOM based data visualization by linking different data projections. In: Proceedings of the international symposium on intelligent data engineering and learning (IDEAL), pp 427–434
Villmann T, Merenyi E (2001) Extensions and modifications of the Kohonen-SOM and applications in remote sensing image analysis. In: Seiffert U, Jain LC (eds) Self-organizing maps: recent advances and applications. Springer, Berlin, pp 121–145
Himberg J (2000) A SOM based cluster visualization and its application for false colouring. In: Proceedings of the international joint conference on neural networks, pp 69–74
Kaski S, Venna J, Kohonen T (2000) Coloring that reveals cluster structures in multivariate data. Aust J Intell Inf Process Syst 6:82–88
Yin H (2002) ViSOM-a novel method for multivariate data projection and structure visualization. IEEE Trans Neural Netw 13(1):237–243
Xu L, Xu Y, Chow TW (2010) PolSOM: a new method for multidimensional data visualization. Pattern Recogn 43:1668–1675
Su M-C, Chang H-T (2001) A new model of self-organizing neural networks and its application in data projection. IEEE Trans Neural Netw 12(1):153–158
Villmann T, Claussen JC (2006) Magnification control in self-organizing maps and neural gas. Neural Comput 18:446–469
Bauer HU, Der R, Herrmann M (1996) Controlling the magnification factor of self-organizing feature maps. Neural Comput 8(4):757–771
Merenyi E, Jain A, Villmann T (2007) Explicit magnification control of self-organizing maps for forbidden data. IEEE Trans Neural Netw 18(3):786–797
Merenyi E, Jain A (2004) Forbidden magnification? II. In: Proceedings of 12th European symposium on artificial neural networks, pp 57–62
Linsker R (1988) Self-organization in a perceptual network. Computer 21(3):105–117
Linsker R (1989) How to generate ordered maps by maximizing the mutual information between input and output signals. Neural Comput 1:402–411
Linsker R (1992) Local synaptic rules suffice to maximize mutual information in a linear network. Neural Comput 4:691–702
Linsker R (2005) Improved local learning rule for information maximization and related applications. Neural Netw 18:261–265
Kamimura R (2003) Teacher-directed learning: information-theoretic competitive learning in supervised multi-layered networks. Conn Sci 15:117–140
Kamimura R, Kamimura T, Takeuchi H (2002) Greedy information acquisition algorithm: a new information theoretic approach to dynamic information acquisition in neural networks. Conn Sci 14(2):137–162
Anderson JR (1980) Cognitive psychology and its implications. Worth Publishers, New York
Korsten NJH, Fragopanagos N, Hartle M, Taylor N, Taylor JG (2006) Attention as a controller. Neural Netw 19:1408–1421
Hamker FH, Zirnsak M (2006) V4 receptive field dynamics as predicted by a systems-level model of visual attention using feedback from the frontal eye field. Neural Netw 19:1371–1382
Lanyon LJ, Denham SL (2006) A model of active visual search with object-based attention guiding scan paths. Neural Netw 19:873–897
Rolls ET, Deco G (2006) Attention in natural scenes: neurophysiological and computational bases. Neural Netw 19:1383–1394
Taylor JG, Fragopanagos NF (2005) The interaction of attention and emotion. Neural Netw 18:353–369
Newman J, Baars BJ, Cho SB (1997) A neural global workspace model for conscious attention. Neural Netw 10(7):1195–1206
Kilmer W (1996) Global inhibition for selecting modes of attention. Neural Netw 9(4):567–573
Lanyon LJ, Denham SL (2004) A biased competition computational model of spatial and object-based attention mediating active visual search. Neurocomputing 58–60:655–662
Vesanto J, Himberg J, Alhoniemi E, Parhankangas J (2000) SOM toolbox for Matlab. Technical report, Laboratory of Computer and Information Science, Helsinki University of Technology
Kiviluoto K (1996) Topology preservation in self-organizing maps. In: Proceedings of the IEEE international conference on neural networks, pp 294–299
Oja M, Serber GO, Blomberg J, Kaski S (2005) Self-organizing map-based discovery and visualization of human endogenous retroviral sequence groups. Int J Neural Syst 15(3):163–179
Kaski S, Nikkila J, Oja M, Venna J, Toronen P, Castren E (2003) Trustworthiness and metrics in visualizing similarity of gene expression. BMC Bioinforma 4:48
Venna J, Kaski S (2001) Neighborhood preservation in nonlinear projection methods: an experimental study. In: Lecture Notes in Computer Science, vol 2130, pp 485–491
Pölzlbauer G (2004) Survey and comparison of quality measures for self-organizing maps. In: Proceedings of the fifth workshop on data analysis (WDA04), pp 67–82
Nikkila J, Toronen P, Kaski S, Venna J, Castren E, Wong G (2002) Analysis and visualization of gene expression data using self-organizing maps. Neural Netw 15:953–966
Vathy-Fogarassy A, Werner-Stark A, Gal B, Abonyi J (2007) Visualization of topological representing networks. In: Lecture Notes in Computer Science (IDEAL2007), vol 4881, pp 557–566
Himberg J (2007) From insights to innovation: data mining, visualization, and user interfaces. Dissertation, Helsinki University of Technology
Venna J (2007) Dimensionality reduction for visual exploration of similarity structures. Dissertation, Helsinki University of Technology
Lee JA, Verleysen M (2008) Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods. In: JMLR workshop and conference proceedings, vol 4, pp 21–35
Vesanto J, Himberg J, Alhoniemi E, Parhankangas J (2000) SOM toolbox for Matlab 5. Technical report A57, Helsinki University of Technology
Zhong M, Georgiopoulos M, Anagnostopoulos GC (2008) A k-norm pruning algorithm for decision tree classifiers based on error rate estimation. Mach Learn 71:55–88
Esposito F, Malerba D, Semeraro G (1997) A comparative analysis of methods for pruning decision trees. IEEE Trans Pattern Anal Mach Intell 19(5):476–491
Romesburg HC (1984) Cluster analysis for researchers. Krieger Publishing Company, Florida
Frank A, Asuncion A (2010) UCI machine learning repository. School of Information and Computer Sciences, University of California, Irvine. http://archive.ics.uci.edu/ml
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Acknowledgments
The author is very grateful to the editor and two reviewers for their valuable comments.
Appendix: Senate data
In Sect. 3.2, we applied the method to the voting records of US congressmen on 19 environmental bills [69]. Table 5 shows the data, where the first 8 congressmen are Democrats and the remaining 7 (congressmen 9 to 15) are Republicans. In the table, 1, 0, and 0.5 represent yes, no, and undecided, respectively.
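For readers reconstructing the experiment, the encoding described above can be sketched as follows (the function and dictionary names are illustrative, not taken from the paper):

```python
# Encode yes/no/undecided votes as 1, 0, and 0.5, following the
# convention used for the Senate data in Table 5.
VOTE_CODE = {"yes": 1.0, "no": 0.0, "undecided": 0.5}

def encode_votes(votes):
    """Map a congressman's list of vote strings to numeric codes."""
    return [VOTE_CODE[v] for v in votes]
```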
Cite this article
Kamimura, R. Repeated comprehensibility maximization in competitive learning. Neural Comput & Applic 22, 911–932 (2013). https://doi.org/10.1007/s00521-011-0785-1