2006 Special Issue
Spherical self-organizing map using efficient indexed geodesic data structure
Introduction
The Self-Organizing Map (SOM) introduced by Kohonen (2001) is a popular tool for high-dimensional data analysis and visualization. Its applications can be found in various research areas such as social science (Lin, White, & Buzydlowski, 2003), web-document mining (Lagus, Kaski, & Kohonen, 2004) and robotics (Ritter, Martinetz, & Schulten, 1992). In a conventional SOM, the neighborhood relationship is defined by a two-dimensional rectangular or hexagonal lattice. During the training phase, all neurons compete with each other for the input signals. The winner and its neighbors update their connection weights. Ideally, at the end of the training, all samples are equally represented by neurons and the neighboring units in the grid tend to model similar regions of the data space. However, the grid units at the boundary of the SOM have fewer neighbors than the units inside the map, so they have fewer chances of being updated. This often leads to the “border effect”: weight vectors of these units “collapse to the center of the input space” (Sarle, 2002). Several approaches have been suggested to solve the problem, such as the heuristic weighting rule method by Kohonen (2001) and local-linear smoothing by Wand and Jones (1995). Aside from these mathematical solutions, a straightforward method is to remove the boundaries from the grid by implementing the SOM on a torus (Ito et al., 2000; Li et al., 1993) or a sphere (Boudjemaï et al., 2003; Hirokazu et al., 2005; Nakatsuka and Oyabu, 2003; Sangole and Knopf, 2003).
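The competitive update and the border effect described above can be illustrated with a minimal sketch. It assumes a 4-connected rectangular lattice and a fixed learning rate applied only to the winner and its immediate neighbors, which is a simplification of the usual decaying neighborhood kernel. The names `grid_neighbors`, `train_step`, `rate` and `wrap` are hypothetical and do not come from the paper or from SOM_PAK.

```python
import math

def grid_neighbors(r, c, rows, cols, wrap=False):
    """Immediate neighbors of unit (r, c) on a 4-connected rectangular lattice.

    With wrap=True the opposite edges are glued together (a torus), so no
    unit sits on a boundary. Assumes rows >= 3 and cols >= 3.
    """
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if wrap:
            result.append((rr % rows, cc % cols))
        elif 0 <= rr < rows and 0 <= cc < cols:
            result.append((rr, cc))
    return result

def train_step(weights, x, rows, cols, rate=0.5, wrap=False):
    """One simplified training step: the winner and its neighbors move toward x."""
    # Competition: the unit whose weight vector is closest to the input wins.
    winner = min(weights, key=lambda u: math.dist(weights[u], x))
    # Cooperation: the winner and its immediate neighbors are updated.
    for u in set([winner] + grid_neighbors(*winner, rows, cols, wrap)):
        weights[u] = [w + rate * (xi - w) for w, xi in zip(weights[u], x)]
    return winner

if __name__ == "__main__":
    rows = cols = 4
    # Border effect: corner and edge units have fewer neighbors than interior ones.
    print(len(grid_neighbors(0, 0, rows, cols)))             # corner unit
    print(len(grid_neighbors(1, 1, rows, cols)))             # interior unit
    print(len(grid_neighbors(0, 0, rows, cols, wrap=True)))  # same corner on a torus
```

On the flat lattice a corner unit is reached through only two neighbors, whereas with `wrap=True` every unit has four, which is the motivation for boundary-free lattices such as the torus or the sphere.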
Compared to a toroidal SOM, the spherical SOM is more visually effective. Firstly, the area associated with each neuron varies significantly over the surface of a torus: it is larger around the outer circle and compressed near the inner circle. For a SOM, it is desirable to have all neurons receive equal geometrical treatment. Secondly, the toroidal SOM fails to provide an intuitively readable map. People are usually more familiar with maps generated from a sphere, such as world maps, than they are with maps based on a torus.
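The variation in area on a torus can be quantified with a short sketch. It assumes that neurons are placed on a regular grid in the two torus angles, so that each neuron owns a small cell of fixed angular size. Under that assumption the cell area follows from the standard torus area element r(R + r cos θ) dθ dφ, where R is the major radius, r the minor radius and θ the angle around the tube. The function names are hypothetical.

```python
import math

def torus_cell_area(R, r, theta, d_theta, d_phi):
    """Approximate area of a small cell of angular size d_theta x d_phi.

    theta = 0 is the outer circle of the torus, theta = pi the inner circle.
    """
    return r * (R + r * math.cos(theta)) * d_theta * d_phi

def outer_to_inner_ratio(R, r):
    """Ratio between the largest (outer) and smallest (inner) cell areas."""
    return (R + r) / (R - r)

if __name__ == "__main__":
    R, r = 2.0, 1.0
    step = 2 * math.pi / 32  # hypothetical 32 x 32 angular grid
    outer = torus_cell_area(R, r, 0.0, step, step)
    inner = torus_cell_area(R, r, math.pi, step, step)
    print(outer / inner, outer_to_inner_ratio(R, r))
```

For R = 2 and r = 1 a cell on the outer circle is three times larger than one on the inner circle, and the ratio grows without bound as r approaches R.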
Several spherical SOMs have been implemented and applied to different types of data sets. Ritter (1999) first suggested using a tessellated platonic polyhedron as a lattice, and pointed out that the spherical SOM would be very suitable for data with underlying directional structures. Sangole and Knopf (2003) employed a spherical SOM in three-dimensional (3D) immersive virtual reality environments for interactive data analysis. Boudjemaï et al. (2003) applied the spherical SOM to 3D object modeling. While these applications have been successful, existing data structures for geodesic domes are either not space efficient or slow at finding neighbors. In this paper, we present a 2D data structure to store the spherical lattice. It reduces the overhead of maintaining the spherical grid structure. Finding the immediate neighbors of a vertex (neuron) is efficient because it reduces to indexing into a 2D array. The data structure also supports fast dome tessellation (increasing the number of neurons). We call a spherical SOM using this data structure a GeoSOM.
The remainder of this paper is organized as follows. Section 2 reviews existing techniques for spherical SOMs using lattices of geodesic domes. Details of our data structure are presented in Section 3. Section 4 describes the interface that we developed to visualize the GeoSOM. In Section 5, we test our data structure on two data sets. Results are compared with the corresponding 2D SOMs. In Section 6, GeoSOM is compared analytically with the existing spherical SOMs. Conclusions and future work are presented in Section 7.
Section snippets
A brief discussion of geodesic domes
For a 2D SOM, a hexagonal lattice is preferable to a rectangular one as it is more uniform: every grid unit has the same number of immediate neighbors and the distances between a unit and its immediate neighbors are the same. However, such uniformity cannot be achieved on the sphere except for the five platonic polyhedra — tetrahedron, cube, octahedron, icosahedron and dodecahedron (Pugh, 1976). These polyhedra can be further tessellated into different frequencies — the geodesic domes. The
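The limits of uniformity on a tessellated icosahedron can be checked with a small sketch. It assumes midpoint subdivision, in which each triangle is split into four and the new vertices are pushed onto the unit sphere, so the frequency doubles at every step. This is one common way to build a geodesic dome and is not necessarily the tessellation scheme used in the paper. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def normalize(v):
    """Scale a 3D point onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def icosahedron():
    """Return (vertices, faces) of an icosahedron inscribed in the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2
    verts = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            verts += [normalize(p) for p in ((0.0, a, b), (a, b, 0.0), (b, 0.0, a))]

    def adjacent(i, j):
        # Edge length is about 1.05; the next larger vertex distance is about 1.70.
        return math.dist(verts[i], verts[j]) < 1.2

    # Every triple of mutually adjacent vertices is one of the 20 faces.
    faces = [t for t in combinations(range(12), 3)
             if all(adjacent(i, j) for i, j in combinations(t, 2))]
    return verts, faces

def subdivide(verts, faces):
    """Split each triangle into four, sharing midpoints between adjacent faces."""
    verts = list(verts)
    cache = {}

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            m = tuple((p + q) / 2 for p, q in zip(verts[i], verts[j]))
            verts.append(normalize(m))
            cache[key] = len(verts) - 1
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
    return verts, new_faces

def degree_histogram(faces):
    """Map 'number of immediate neighbors' to 'number of vertices with that count'."""
    edges = {tuple(sorted(e)) for f in faces for e in combinations(f, 2)}
    degree = Counter(v for e in edges for v in e)
    return Counter(degree.values())

if __name__ == "__main__":
    verts, faces = icosahedron()
    for _ in range(3):
        print(len(verts), dict(degree_histogram(faces)))
        verts, faces = subdivide(verts, faces)
```

At frequency f the dome has 10f² + 2 vertices (12, 42, 162 for f = 1, 2, 4). The 12 original icosahedron vertices always keep five neighbors while every other vertex has six, so the lattice is close to, but not exactly, as uniform as a planar hexagonal grid.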
Opening the icosahedron-based geodesic dome
An icosahedron-based geodesic dome can be opened onto the 2D plane. Fig. 3 illustrates the process. Note that, for clarity, we did not project the vertices onto the surface of the sphere. Thicker lines are the original edges of the icosahedron. The 12 vertices of an icosahedron can be grouped into six pairs. Vertices in each pair are opposite to each other on the sphere and can be considered as two poles, such as vertices A and C. The edges marked with colors indicate where the dome is cut
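The grouping of the 12 icosahedron vertices into six pairs of poles can be verified numerically. The following sketch assumes the standard coordinates (0, ±1, ±φ) and their cyclic permutations, scaled to the unit sphere. The vertex indices it produces are arbitrary and do not correspond to the labels A and C in Fig. 3. The function names are hypothetical.

```python
import math
from itertools import combinations

def icosahedron_vertices():
    """The 12 vertices of an icosahedron inscribed in the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2
    scale = math.sqrt(1 + phi * phi)
    verts = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            for p in ((0.0, a, b), (a, b, 0.0), (b, 0.0, a)):
                verts.append(tuple(c / scale for c in p))
    return verts

def antipodal_pairs(verts, tol=1e-9):
    """Pairs of unit vectors that point in exactly opposite directions."""
    pairs = []
    for i, j in combinations(range(len(verts)), 2):
        dot = sum(p * q for p, q in zip(verts[i], verts[j]))
        if dot < -1 + tol:  # a dot product of -1 means the vertices are opposite
            pairs.append((i, j))
    return pairs

if __name__ == "__main__":
    for i, j in antipodal_pairs(icosahedron_vertices()):
        print(i, j)
```

Each vertex appears in exactly one pair, so any of the six pairs can serve as the two poles when the dome is cut open.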
Projecting the GeoSOM onto 2D plane
Usually, a spherical SOM is visualized as a 3D spherical object (see Fig. 6). The surface of the sphere is deformed to reflect the variation in weight vectors. Since our flat screens cannot show the front and back parts of a spherical object at the same time, interfaces such as rotation by mouse are provided so that the users can view every part of the SOM. However, it is still difficult for the user to maintain an entire picture of the map. We implemented an interface to project the spherical
Experiments
In this section, we conduct two sets of experiments on GeoSOMs using a synthetic data set and a breast cancer data set. The experiments are carried out on a Pentium IV 3.0 GHz workstation with 1 GB of memory. Results are compared with the corresponding 2D hexagonal SOMs generated by SOM_PAK (Kohonen, Hynninen, Kangas, & Laaksonen, 1996). Parameters in training the two types of SOMs are set to be the same:
- Sizes of the GeoSOM and the 2D SOM are chosen to be as close as possible.
- Before the
Comparing GeoSOM to other spherical SOMs
From the above experiments, we can see that the GeoSOM reduces data distortion by up to two thirds. This reduction is achieved because we use geodesic domes to remove the SOM’s border effect. As mentioned earlier, several researchers have also implemented spherical SOMs using icosahedron-based geodesic domes (Boudjemaï et al., 2003; Nakatsuka and Oyabu, 2003; Sangole and Knopf, 2003). These SOMs are expected to have the same data distortion as the GeoSOM.
The difference between the
Conclusion and future work
It has been demonstrated that a spherical SOM can effectively remove the border effect of the 2D SOM and reveal more information about the high-dimensional data. We introduced a method to organize the neurons of such a SOM using a 2D rectangular grid structure, which reduces the overhead required to maintain the neurons’ neighborhood relationships. Experimental results show that the speed of a GeoSOM is comparable to the speed of a 2D SOM generated by SOM_PAK. Because of the grid topology, GeoSOM reduces
References (22)
- Self-organizing maps: Optimization approaches
- Lagus, Kaski, & Kohonen (2004). Mining massive document collections by the WEBSOM method. Information Sciences
- Lin, White, & Buzydlowski (2003). Real-time author co-citation mapping for online searching. Information Processing and Management: an International Journal
- Sangole & Knopf (2003). Visualization of randomly ordered numeric data sets using spherical self-organizing feature maps. Computers & Graphics
- Boudjemaï, F., Enberg, P. B., & Postaire, J.-G. (2003). Self organizing spherical map architecture for 3D object...
- et al. (1989). The world in perspective: A directory of world map projections
- et al. (1999). Readings in information visualization: Using vision to think
- (1975). Explorations in the geometry of thinking
- Hirokazu, N., Altaf-Ul, A. M., Ken, K., Kotaro, M., & Shigehiko, K. (2005). Spherical SOM with arbitrary number of...
- Hoffmann, G. (2002). Sphere tessellation by icosahedron...
- The characteristics of the torus self organizing map