A unifying model involving a categorical and/or dimensional reduction for multimode data
Introduction
Data that involve one or more sets of entities (or modes) with a large number of elements (experimental units, variables, time points, and so on) pose a major challenge for the data analyst. This is even more the case if the data pertain to more than two modes, that is, if they are multiway multimode in nature. The complexity of the information present in such data may be tremendous. In order to grasp it properly, the data analyst may wish to subject one or more of the data modes to a (simultaneous) reduction. Reduction is here to be understood either in a categorical sense, in that the elements of the reduced mode are grouped into a small number of clusters (which may or may not overlap, and which may or may not cover the full mode), or in a dimensional sense, in that the elements of the reduced mode are represented as points in a low-dimensional space. A simultaneous reduction can further be purely categorical, that is, categorical for all reduced modes; purely dimensional, that is, dimensional for all reduced modes; or hybrid, that is, categorical for some of the reduced modes and dimensional for the others.
Purely categorical reduction models can be amply found in the clustering domain, examples including one-mode partitioning models (such as k-means type models and all kinds of one-mode mixture models, e.g., McLachlan and Chang, 2004), two-mode clustering (or biclustering) models (such as two-mode hierarchical and additive clustering models, Furnas, 1980, Gaul and Schader, 1996, and two-mode hierarchical classes models, De Boeck and Rosenberg, 1988, Van Mechelen et al., 1995), as well as their multimode generalizations (e.g., Ceulemans and Van Mechelen, 2005, Eckes and Orlik, 1994).
Pure dimension reduction models can be amply found in the domain of component and factor analysis, examples including the standard two-mode principal component model and its multimode generalizations (such as PARAFAC/CANDECOMP and the family of N-mode Tucker models, e.g., Kroonenberg, 1983). Examples of hybrid models include various projection pursuit type clustering methods (e.g., Bock, 1987, Vichi and Kiers, 2001), cluster differences scaling (Heiser and Groenen, 1997), and cluster unfolding (De Soete and Heiser, 1993).
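The contrast between the two kinds of reduction can be made concrete with a small sketch (not part of the paper; an illustrative NumPy example with made-up data): a k-means style partition reduces the object mode categorically, whereas principal components obtained from the SVD reduce the same mode dimensionally.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))  # 20 objects x 5 variables (two-way two-mode data)

# Categorical reduction of the object mode: a k-means style partition.
# Each object is assigned to exactly one of K non-overlapping clusters.
def kmeans(X, K, n_iter=50):
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(0)
    return labels

labels = kmeans(X, K=3)          # one cluster index per object

# Dimensional reduction of the object mode: principal components via the SVD.
# Each object is represented as a point in a low-dimensional (here 2-D) space.
U, s, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
scores = U[:, :2] * s[:2]        # 20 x 2 component scores
```

A hybrid reduction, in the sense of the paper, would combine both kinds of representation across different modes of the same data set.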
The family of categorical and dimensional reduction models for multimode data is clearly very large. Moreover, it is also fairly heterogeneous, both in terms of the mathematical structures implied by the different models and in terms of the principles and methods used in the associated data analysis. In the present paper, we contribute to a clarification of this situation by introducing a unifying model that encompasses a broad range of (existing as well as to-be-developed) discrete, continuous, and hybrid reduction models as special cases. The proposed unifying model considerably extends the already very broad CANDCLUS and MUMCLUS models of Carroll and Chaturvedi (1995); the extension includes a much broader family of decomposition functions beyond (generalized) Cartesian products, room for various types of modeling constraints, and room for a possible addition of distributional assumptions. An analysis of the objective or loss function associated with the unifying model further leads to two generic algorithmic strategies, the possibilities and limitations of which are the object of a subsequent discussion.
The remainder of this paper is organized as follows: In Section 2 we will introduce the type of data under study, along with a few associated concepts. In Section 3 we will introduce our unifying reduction model. The associated objective or loss function will be dealt with in Section 4 and the algorithmics in Section 5. Section 6 will present a general discussion.
Section snippets
Data
Data arrays can have different conceptual structures. In order to typify the various cases, Carroll and Arabie (1980) introduced some terminology (which in turn relies on work by Tucker, 1964). In this terminology, a data set is conceived as a mapping D from a Cartesian product of N sets to some (typically univariate) domain Y: for any N-tuple in that product, a value from Y is recorded. The total number N of constituent (possibly
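The mapping view of a data array can be sketched as follows (an illustrative example, not from the paper; the mode names and sizes are made up): a three-way three-mode array stored as a NumPy array is equivalent to a mapping from the Cartesian product of the three index sets to the reals.

```python
import itertools
import numpy as np

# Three index sets (modes): persons, items, occasions.
persons   = ["p1", "p2"]
items     = ["i1", "i2", "i3"]
occasions = ["t1", "t2"]

rng = np.random.default_rng(1)
values = rng.normal(size=(2, 3, 2))   # the data array D

# The same data viewed as a mapping D from the Cartesian product
# persons x items x occasions to the (univariate) domain of reals:
# every N-tuple receives exactly one recorded value.
D = {
    (p, i, t): values[a, b, c]
    for (a, p), (b, i), (c, t) in itertools.product(
        enumerate(persons), enumerate(items), enumerate(occasions))
}
```

Here N = 3 and all three sets are distinct, so the array is three-way three-mode in the Carroll and Arabie (1980) sense; a two-mode array would reuse one set for two of the ways.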
Model
Assume a real-valued N-way N-mode data array D (i.e., a mapping from a Cartesian product of N sets to the reals). The unifying reduction model we propose for D includes a deterministic heart and optional additional stochastic assumptions. We will now introduce both in turn. Subsequently, we will discuss how various existing reduction models show up as special cases of the unifying model.
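For a three-way array, a deterministic heart of the Tucker type can be sketched as follows (an illustrative NumPy example with assumed sizes; the matrices A, B, C and the core W are hypothetical): each mode is reduced by a component or membership matrix, and a linking core array W combines the reductions into reconstructed entries.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 6, 5, 4        # sizes of the three modes
P, Q, R = 2, 3, 2        # numbers of dimensions/clusters per reduced mode

A = rng.normal(size=(I, P))              # dimensional reduction of mode 1
B = (rng.random((J, Q)) < 0.5) * 1.0     # categorical (binary membership) reduction of mode 2
C = rng.normal(size=(K, R))              # dimensional reduction of mode 3
W = rng.normal(size=(P, Q, R))           # linking core array

# Reconstructed model entries: m_ijk = sum_pqr a_ip * b_jq * c_kr * w_pqr
M = np.einsum('ip,jq,kr,pqr->ijk', A, B, C, W)
```

Because B is a binary membership matrix while A and C are real-valued, this particular sketch is a hybrid reduction; making all three matrices real-valued would give a purely dimensional (Tucker3-like) model, and making all three binary a purely categorical one.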
Criterion to be optimized in the data analysis
In the deterministic case, the objective or loss function l to be minimized in the data analysis will typically be of the least squares type, that is, a sum of squared discrepancies between the data entries and the corresponding reconstructed model entries.
In the stochastic case, the objective function to be maximized will be the likelihood. In this regard, it may be useful to note that for models of real-valued data with independent, identically distributed normal error terms, maximizing the likelihood is equivalent to minimizing the least squares loss function.
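The equivalence between the two criteria can be checked numerically (an illustrative sketch with made-up arrays, assuming i.i.d. normal errors with known variance): the Gaussian log-likelihood differs from minus the least squares loss only by terms that do not depend on the model.

```python
import numpy as np

rng = np.random.default_rng(3)
D = rng.normal(size=(4, 3, 2))           # data array
M = rng.normal(size=(4, 3, 2))           # model-reconstructed array

# Least squares loss: sum of squared discrepancies over all entries.
l = ((D - M) ** 2).sum()

# Gaussian log-likelihood with i.i.d. N(0, sigma^2) errors: up to terms that
# do not depend on M, it equals -l / (2 sigma^2), so maximizing the likelihood
# over the model is the same as minimizing the least squares loss.
sigma = 1.0
n = D.size
loglik = -0.5 * n * np.log(2 * np.pi * sigma**2) - l / (2 * sigma**2)
```

Since sigma and n are fixed for a given data set, any model that decreases l increases loglik, and vice versa.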
Two propositions
A first proposition focuses on the core array W.
Proposition 1. If W is real-valued and unconstrained, the decomposition equals a generalized Cartesian product (3), and the loss function equals (15), then the conditionally optimal W, given the component matrices, can be expressed as a closed-form function of those component matrices.
Proof. The conditionally optimal W can be considered a set of regression weights in the prediction of the vectorized data on the basis of predictor vectors that
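The regression view in this proof can be sketched concretely for a three-way Tucker-type decomposition (an illustrative NumPy example; the sizes and matrices are assumed, not taken from the paper): the predictor matrix is the Kronecker product of the component matrices, and the conditionally optimal core follows from ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K = 6, 5, 4
P, Q, R = 2, 3, 2
A = rng.normal(size=(I, P))
B = rng.normal(size=(J, Q))
C = rng.normal(size=(K, R))
D = rng.normal(size=(I, J, K))           # data array

# Predictor matrix: each column is the vectorized outer product of one
# triple of components, i.e. the Kronecker product of A, B, and C.
Z = np.kron(np.kron(A, B), C)            # (I*J*K) x (P*Q*R)

# Conditionally optimal core: regression of the vectorized data on these
# predictors, which has a closed-form (ordinary least squares) solution.
w, *_ = np.linalg.lstsq(Z, D.ravel(), rcond=None)
W = w.reshape(P, Q, R)

# Check: the residual of the reconstruction with this W is orthogonal
# to every predictor column, as required at the least squares optimum.
M = np.einsum('ip,jq,kr,pqr->ijk', A, B, C, W)
resid = (D - M).ravel()
```

With row-major vectorization, `D.ravel()` lines the entries up with the rows of `np.kron(np.kron(A, B), C)`, which is what makes the regression formulation exact here.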
Discussion
In this paper we introduced a novel unifying model for multimode data based on two key components: (a) the elements of each of the modes involved in the data are reduced either to points in a low-dimensional space or to elements of a limited set of (possibly overlapping) clusters, and (b) the connection between the dimensions and clusters to which each of the data modes is reduced is captured by a linking array. The reduction for each of the modes is further such that the coordinates or cluster
Acknowledgments
Work on this paper has been supported by the Fund for Scientific Research—Flanders (project G.0146.06) and by the Research Fund of K.U. Leuven (GOA/2005/04 and EF/05/007). The authors gratefully acknowledge Henk Kiers and Eva Ceulemans for their useful comments on a previous version of this manuscript.
References (32)
- et al., Three-mode partitioning, Comput. Stat. Data Anal. (2006)
- et al., Factorial k-means analysis for two-way data, Comput. Stat. Data Anal. (2001)
- Simultaneous clustering of objects and variables
- On the interface between cluster analysis, principal component analysis, and multidimensional scaling
- Multi-way analysis in the food industry: models, algorithms and applications, unpublished doctoral dissertation (1998)
- et al., Multidimensional scaling, Annu. Rev. Psychol. (1980)
- et al., A general approach to clustering and multidimensional scaling of two-way, three-way, or higher-way data
- et al., Hierarchical classes models for three-way three-mode binary data: interrelations and model selection, Psychometrika (2005)
- et al., Tucker3 hierarchical classes analysis, Psychometrika (2003)
- et al., Adapting the formal to the substantive: constrained Tucker3-HICLAS, J. Classification (2004)
- A Theory of Data
- Hierarchical classes: model and data analysis, Psychometrika
- Three-way metric unfolding via alternating weighted least squares, Psychometrika
- A latent class unfolding model for analyzing single stimulus preference ratings, Psychometrika
- Three-mode hierarchical cluster analysis of three-way three-mode data
- Objects and their features: the metric analysis of two-class data, unpublished doctoral dissertation
Cited by (10)
A Framework for Low-Level Data Fusion
2019, Data Handling in Science and Technology
Citation excerpt: "Subsequently, we describe a few existing examples of our generic proposal. The first ingredient of our framework is a submodel for each data block as described in more detail elsewhere [28]. This submodel is made of two parts: quantifications of the modes per data block and association rules that define how these quantifications can be combined to model each block."
A generic linked-mode decomposition model for data fusion
2010, Chemometrics and Intelligent Laboratory Systems
Citation excerpt: "An extension to the N-way N′-mode case is rather straightforward and will be briefly touched upon below.) The submodel for data block B is subsumed by a unifying model as proposed by Ref. [14]. The heart of this unifying model is deterministic in nature; yet, optionally, the deterministic heart can be extended with a stochastic error model to represent discrepancies between the actual entries in the data and the corresponding reconstructed entries in the deterministic heart of the model (for one possible general procedure to build a stochastic extension of a deterministic model, see Ref. [15])."
Simultaneous analysis of coupled data blocks differing in size: A comparison of two weighting schemes
2009, Computational Statistics and Data Analysis
Citation excerpt: "From a data-analytic viewpoint, this implies that a global model is needed in which the different data blocks, one block per piece of information, are analyzed simultaneously. In the present paper, in this regard, global models will be considered that consist of different submodels, one for each data block, with each submodel implying a (dimensional or categorical) quantification of all modes of the corresponding data block (Van Mechelen and Schepers, 2007). Further, only global models, consisting of different submodels, will be considered in which each common mode of the coupled data is represented by a single quantification, which is the same for all submodels of the global model that mode belongs to."
Algorithms for additive clustering of rectangular data tables
2008, Computational Statistics and Data Analysis
Block clustering with Bernoulli mixture models: Comparison of different approaches
2008, Computational Statistics and Data Analysis
Citation excerpt: "These procedures differ in the patterns they seek, the types of data they apply to, and the assumptions on which they rest. In particular we should mention the work of Hartigan (1975), Bock (1979), Garcia and Proth (1986), Marchotorchino (1987), Govaert (1983, 1984, 1995), Arabie and Hubert (1990), Duffy and Quiroz (1991) and Mechelen and Schepers (2007), all of whom have proposed algorithms dedicated to different kinds of matrices. In recent years block clustering has become an important challenge in data mining."
Statistical Learning Methods Including Dimensionality Reduction
2007, Computational Statistics and Data Analysis