Elsevier

Neural Networks

Volume 14, Issue 2, March 2001, Pages 175-183
Contributed article
Two methods for encoding clusters

https://doi.org/10.1016/S0893-6080(00)00096-4

Abstract

This paper presents two methods for generating numerical codes that represent clusters of R^n while preserving various topological properties of the data space. Such codes are useful for networks whose input, or possibly output, consists of unordered sets of points. The first method is the better one from a theoretical point of view, while the second is more usable in practice for large clusters.

Introduction

There are applications in which the input, or possibly the output, of a neural network is not a point but a cluster, that is, an unordered finite set of points of R^n. A frequent difficulty is to find an appropriate representation for such an input, given that there is no natural order on the points of R^n for n>1. Any permutation of a cluster's points is a priori equivalent to any other, but distinct permutations yield distinct input vectors, and learning the equivalence of permutations is a very complex problem, since a set of m points has m! possible permutations. One can always impose an artificial order on the points of R^n, for example by using a Peano–Hilbert scan. Unfortunately, this type of procedure never preserves the topology of the data space: a small variation of the data points can lead to a large variation in the representation. This causes major difficulties when learning regular functions on the space of such representations. A very similar problem occurs with pseudo-sequences of points, that is, when data points are ordered with respect to a natural variable such as a time coordinate, but the underlying process that generates them is not actually sequential, or is a random mixture of several sequences. In this case, small and possibly random variations of the time coordinates can change the order of the points, resulting in large variations of the input representation.
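The instability of an artificial order can be illustrated with a small sketch. It uses a plain lexicographic sort (by x, then by y) as a stand-in for a Peano–Hilbert scan, which is an assumption made for brevity and not the procedure discussed in the paper; the function name `lexicographic_code` and the sample points are likewise hypothetical.

```python
import numpy as np

def lexicographic_code(points):
    """Flatten a cluster after sorting its points by x, then by y."""
    pts = np.asarray(points, dtype=float)
    # np.lexsort treats the LAST key as the primary one: x first, then y.
    order = np.lexsort((pts[:, 1], pts[:, 0]))
    return pts[order].ravel()

cluster = np.array([[0.0, 0.0], [1.0, 5.0], [1.0 + 1e-9, -5.0]])
shuffled = cluster[[2, 0, 1]]

# Sorting removes the dependence on the order in which points are listed ...
assert np.array_equal(lexicographic_code(cluster), lexicographic_code(shuffled))

# ... but moving one point by 2e-9 swaps two rows of the sorted cluster,
# so the flattened representation changes by a large amount.
moved = cluster.copy()
moved[2, 0] -= 2e-9
jump = np.max(np.abs(lexicographic_code(moved) - lexicographic_code(cluster)))
```

Here `jump` is of the order of the spread of the cluster in y, even though the point itself moved by only 2e-9, which is the kind of discontinuity that makes such representations hard to learn from.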

The problem we address here is to find a mapping from the set of clusters of R^n to a set of real vectors (or matrices), referred to as ‘cluster codes’, such that (1) any cluster has a unique code, (2) distinct clusters have distinct codes, and (3) any continuous movement of the points in a cluster corresponds to a continuous variation of the code components. Such a mapping would be appropriate for encoding input clusters. If the application also requires decoding output cluster codes, we further need that (4) the mapping is invertible (which implies (2)). A mapping with these four properties is in general a homeomorphism between the data space and the code space. However, one must bear in mind that the code space can be a special subset of R^k, which is not as simple as R^k itself. To date, we know of no solution that can be applied to all practical problems. However, there are various solutions for various families of problems, two of which are presented hereafter. Note that the problem addressed here is to find a finite, exact representation of the data, in a format which ensures that the representations of different clusters can be compared. This problem is different from (and much less studied in the literature than) that of approximating the data distribution by a probability density (Husmeier and Taylor, 1998; Specht, 1990; Traven, 1991) or an attractor (Barnsley, Academic Press; Diaconis and Freedman, 1999). It is in some way related to the ‘encoding problem’ in neural self-associators (Rumelhart, Hinton & Williams, 1986). However, such encoders require prior learning, and they cannot guarantee property (1), since a permutation of a cluster's points generally results in a change of the internal representation.

First method: polynomial encoding of clusters

For encoding a cluster in R or R^2 only, one can consider a real or complex polynomial of the form $P(z)=\prod_{j=1}^{m}(z-z_j)$, which has exactly m real or complex roots, depending on the dimension. These roots are the z_j, which are the coordinates of the cluster's points. Since the product in real or complex algebra is commutative and associative, P(z) does not depend on the order in which the m roots are taken into account. As a consequence, P(z) is unequivocally associated with the unordered
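A minimal sketch of this idea for clusters in R^2, assuming that the code is taken to be the vector of non-leading coefficients of P (the excerpt does not state this explicitly) and that each point (x, y) is identified with the complex number x + iy. The function names `encode_cluster` and `decode_cluster` are hypothetical; `np.poly` and `np.roots` are standard NumPy routines.

```python
import numpy as np

def encode_cluster(points):
    """Return the m non-leading coefficients of P(z) = prod_j (z - z_j)."""
    pts = np.asarray(points, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]          # identify R^2 with the complex plane
    # np.poly builds the monic polynomial with the given roots;
    # the leading coefficient is always 1, so it carries no information.
    return np.poly(z)[1:].astype(complex)

def decode_cluster(code):
    """Recover the points as the roots of the monic polynomial."""
    roots = np.roots(np.concatenate(([1.0], code)))
    return np.column_stack((roots.real, roots.imag))

cluster = np.array([[0.5, 0.0], [1.0, 2.0], [-3.0, 1.0]])
code = encode_cluster(cluster)

# The code does not depend on the order in which the points are listed.
same = np.allclose(code, encode_cluster(cluster[::-1]))

# Decoding returns the same points, in an arbitrary order.
recovered = decode_cluster(code)
```

Numerical root finding becomes ill-conditioned as m grows, so a sketch like this is only practical for small clusters.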

Second method: cluster codes depending on a separating variable

This method satisfies requirements (1) and (3) for any cluster, and requirements (2) and (4) in most cases, but not all. Its advantages are that the generated code is concise (nm real coefficients) and that decoding, when possible, is very simple. Moreover, a simple inspection of a code vector immediately shows whether or not decoding is possible. This method is particularly well suited to encoding and decoding pseudo-sequences of points. A possible application field is the encoding of sets of

Conclusion

Two methods for encoding clusters were presented. In applications, cluster codes can be used for comparisons or as arguments of various functions (spline functions, radial basis functions, etc.), and decodable codes can also be used as the output of approximation processes. The first method is the better one from a theoretical point of view, since it satisfies all the requirements specified in the Introduction, for all clusters. This method has limited applications in practice since, due to its relative

Acknowledgements

This work was partially supported by a grant from Ministère de l'Education Nationale, de la Recherche et de la Technologie—ACI ‘Cognitique’ (1999, #90).
