A joint latent factor analyzer and functional subspace model for clustering multivariate functional data

Statistics and Computing

Abstract

We introduce a model-based approach for clustering multivariate functional data observations. We utilize theoretical results regarding a surrogate density on the truncated Karhunen–Loève expansions along with a direct sum specification of the functional space to define a matrix normal distribution on functional principal components. This formulation allows for individual parsimonious modelling of the function space and coefficient space of the univariate components of the multivariate functional observations in the form of a subspace projection and latent factor analyzers, respectively. The approach facilitates interpretation at both the full multivariate level and the component level, which is of specific interest when the component functions have clear meaning. We derive an AECM algorithm for fitting the model, and discuss appropriate initialization strategies, convergence and model selection criteria. We demonstrate the model’s applicability through simulation and two data analyses on observations that have many functional components.

Author information

Corresponding author

Correspondence to Alex Sharp.


Appendices

Countries included in the energy sector analysis

We gather complete data on the following 97 countries (Tables 11, 12):

Table 11 Countries used in the energy sector analysis, sorted by geographical location and listed in alphabetical order

The following table lists the countries assigned to each group in alphabetical order.

Table 12 Countries used in the energy sector analysis sorted by the best BIC model grouping and listed in alphabetical order

Code for scraping pitch data

An R function for scraping the pitch data from MLBAM’s Statcast database is given in figure a.
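
Since that listing is rendered only as a figure, the following is a minimal sketch of what such a scraper could look like; it is written for illustration and is not the authors’ code. The Baseball Savant CSV endpoint, query parameters, and example pitcher ID are assumptions about the public search interface rather than details given in the paper.

```r
# Illustrative sketch only: download pitch-level Statcast data for one pitcher
# and date range as a data frame. The CSV endpoint and query parameters below
# are assumptions based on the public Baseball Savant search page.
scrape_pitches <- function(pitcher_id, start_date, end_date) {
  base_url <- "https://baseballsavant.mlb.com/statcast_search/csv"
  query <- paste0(
    "?all=true&type=details&player_type=pitcher",
    "&pitchers_lookup%5B%5D=", pitcher_id,
    "&game_date_gt=", start_date,
    "&game_date_lt=", end_date
  )
  utils::read.csv(paste0(base_url, query), stringsAsFactors = FALSE)
}

# Example call (hypothetical MLBAM pitcher ID and dates):
# pitches <- scrape_pitches(543037, "2019-04-01", "2019-09-30")
```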

Parameter specification for the model selection and parameter recovery simulation

In our parameter recovery simulation, we specify three simulation parameters whose values we vary across the different implementations; every other model parameter is held fixed across these implementations. In this section we give a brief overview of how these parameters were generated. A clever specification for these parameters eluded us, so we instead generated them randomly. The generation process was the same for each group and proceeded in the following manner. The mean matrix \(\varvec{M}_g^\star \) was generated from a matrix normal distribution specified as \(\mathscr {N}_{p\times d_g}(\varvec{0}, \varvec{I}_p, 4\varvec{I}_{d_g})\). Next, we created a \(2p \times p\) matrix filled with iid samples from a standard normal distribution, estimated the covariance matrix of these data, and set \(\varvec{\varLambda }_{1g}\) to be the first \(q_g\) eigenvectors of the corresponding spectral decomposition. Let \(\mathscr {U}_1 \sim \text {Unif}(50,100)\) and \(\mathscr {U}_2 \sim \text {Unif}(0.5,5)\) be two uniform random variables. Let \(\varvec{\varOmega }_g\) be a \(d_g \times d_g\) diagonal matrix with diagonal elements composed of iid draws from \(\mathscr {U}_1\), and let \(\eta _g\) be a single draw from \(\mathscr {U}_2\). We then constructed a b-dimensional diagonal matrix \(\varvec{\varDelta }_g\) by specifying the diagonal to be \(\varvec{\varOmega }_g\) followed by \(b-d_g\) copies of \(\eta _g\), and set,

$$\begin{aligned} \varvec{\varOmega }_{2g}&= \left| \varvec{\varDelta }_g\right| ^{-1/b} \varvec{\varOmega }_g, \quad \text {and,}\\ \eta _{2g}&= \left| \varvec{\varDelta }_g\right| ^{-1/b} \eta _g. \end{aligned}$$

The associated matrix of eigenvectors, \(\varvec{\varGamma }_{2g}\), was generated randomly from the uniform distribution over the \(b \times b\) orthogonal matrices. This completes the parameter generation process.
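
As a concrete illustration, the R snippet below sketches this generation process for a single group. The dimensions \(p\), \(b\), \(d_g\) and \(q_g\) are placeholder values, and the code illustrates the procedure described above rather than reproducing the implementation used in the paper.

```r
# Sketch of the random parameter generation for one group (toy dimensions).
set.seed(1)
p <- 10; b <- 15; d_g <- 3; q_g <- 2

# Mean matrix M_g* ~ N_{p x d_g}(0, I_p, 4 I_{d_g}): entries are iid N(0, 4).
M_star <- matrix(rnorm(p * d_g, sd = 2), nrow = p, ncol = d_g)

# Loadings Lambda_1g: leading q_g eigenvectors of the sample covariance of a
# (2p x p) matrix of iid standard normal draws.
X <- matrix(rnorm(2 * p * p), nrow = 2 * p, ncol = p)
Lambda_1g <- eigen(cov(X))$vectors[, 1:q_g, drop = FALSE]

# Omega_g (d_g x d_g diagonal, Unif(50, 100) entries) and eta_g (Unif(0.5, 5)).
omega_g <- runif(d_g, 50, 100)
eta_g   <- runif(1, 0.5, 5)

# Delta_g: b-dimensional diagonal of Omega_g followed by copies of eta_g;
# rescaling by |Delta_g|^(-1/b) makes the rescaled diagonal matrix have unit
# determinant, giving Omega_2g and eta_2g.
delta_g   <- c(omega_g, rep(eta_g, b - d_g))
scale_fac <- prod(delta_g)^(-1 / b)
Omega_2g  <- diag(scale_fac * omega_g, d_g)
eta_2g    <- scale_fac * eta_g

# Gamma_2g: a random b x b orthogonal matrix via the QR decomposition of a
# Gaussian matrix (a sign correction on R's diagonal would make it exactly
# Haar-uniform).
Gamma_2g <- qr.Q(qr(matrix(rnorm(b * b), b, b)))
```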

Parameter specification for comparative analysis II

As mentioned in Sect. 4.4, MFSF and the funHDDC model overlap when we specify our factor loadings and specific variances to have the form \(\varvec{\varLambda }_{1g} = \varvec{\varGamma }_{1g}(\varvec{\varOmega }_{1g} - \eta _{1g}\varvec{I}_{q_g})^\frac{1}{2}\) and \(\varvec{\varXi }_{1g}=\eta _{1g}\varvec{I}_p\), respectively, where \(\varvec{\varOmega }_{1g}\) is a diagonal \(q_g \times q_g\) matrix and \(\eta _{1g}\) is a positive real number smaller than every entry of \(\varvec{\varOmega }_{1g}\). Under this specification, \(\varvec{\Sigma }_{1g} = \varvec{\varLambda }_{1g}\varvec{\varLambda }_{1g}^\top + \varvec{\varXi }_{1g} = \varvec{\varGamma }_{1g}\varvec{\varDelta }_{1g}\varvec{\varGamma }_{1g}^\top \), where \(\varvec{\varDelta }_{1g}\) has the subspace clustering form given in Eq. (28) with \(\varvec{\varOmega }_{1g}\) in place of \(\varvec{\varOmega }_{g}\) and \(\eta _{1g}\) replacing \(\eta _g\). Hence, \(\varvec{\Sigma }_{1g}\) has both the factor analyzer and the subspace clustering form, and the resulting group covariance matrix then has the subspace clustering form with \(\varvec{\varOmega }_g\) given by Eq. (31) and \(\eta _g\) given by \(\eta _{1g}\eta _{2g}\). Under this parameter specification, MFSF and funHDDC therefore overlap.
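
This equality of the two forms can be checked numerically. The R snippet below, a minimal sketch with toy dimensions and arbitrary values for \(\varvec{\varGamma }_{1g}\), \(\varvec{\varOmega }_{1g}\) and \(\eta _{1g}\), constructs \(\varvec{\Sigma }_{1g}\) from the factor-analyzer parametrization and confirms that its leading \(q_g\) eigenvalues equal the entries of \(\varvec{\varOmega }_{1g}\) while the trailing \(p-q_g\) eigenvalues all equal \(\eta _{1g}\); it is an illustrative check rather than code from the paper.

```r
# Numeric check of the overlap specification (toy dimensions).
set.seed(2)
p <- 6; q <- 2

Gamma_1 <- qr.Q(qr(matrix(rnorm(p * p), p, p)))[, 1:q]  # orthonormal columns
omega_1 <- sort(runif(q, 2, 5), decreasing = TRUE)      # diagonal of Omega_1
eta_1   <- 0.5                                          # smaller than all omega_1

# Factor-analyzer form: Sigma_1 = Lambda_1 Lambda_1' + eta_1 I_p,
# with Lambda_1 = Gamma_1 (Omega_1 - eta_1 I_q)^{1/2}.
Lambda_1 <- Gamma_1 %*% diag(sqrt(omega_1 - eta_1), q)
Sigma_1  <- Lambda_1 %*% t(Lambda_1) + eta_1 * diag(p)

# Subspace-clustering form: the leading q eigenvalues equal omega_1 and the
# trailing p - q eigenvalues all equal eta_1.
round(eigen(Sigma_1)$values, 6)
```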

For our comparative analysis, this model specification serves as the basis for M3, the setting in which the parameter specification satisfies the requirements of both the funHDDC algorithm and MFSF. With this starting point, we devise a way to deterministically perturb these parameters so that they satisfy only one of the competing models rather than both. To do this, we need to identify a defining characteristic of each model that is not important for the other. For the funHDDC model, that characteristic is the constant value of the trailing eigenvalues, while for MFSF it is the presence of the Kronecker product form. We begin with the former. All subsequent discussion pertains to a particular, but arbitrary, group g of the model, and we hence drop the subscript g in the remainder.

The overlap model is characterized by the dual specification of a latent factor model and a latent subspace model through \(\varvec{\Sigma }_{1}\). However, as noted in Sect. 4.4, when \(\varvec{\Sigma }_{1}\) does not exhibit the latent subspace form, the funHDDC model no longer holds. Hence, our goal is to find a transformation of \(\varvec{\Sigma }_{1}\) that weakens or removes its latent subspace structure while preserving its latent factor structure. One obvious way to do this is to alter the specific variances \(\varvec{\varXi }_1\). In particular, the latent subspace structure requires \(\varvec{\varXi }_1\) to be spherical, so altering its diagonal values so that they differ from one another causes \(\varvec{\Sigma }_1\) to deviate from this model. To this end, let \(\{\delta _{1i}\}\) denote the trailing \(p - q\) eigenvalues of \(\varvec{\Sigma }_{1}\). Under the latent subspace model, \(\delta _{1i} = \eta _1\) for each i. We let the following linear relationship define the \(\delta _{1i}\),

$$\begin{aligned} \delta _{1i} = \eta _1 + a \left[ \frac{\omega _{1q} - \eta _1}{p - q -1} \right] (p - q - i ), \quad i =1,2,..., p-q, \end{aligned}$$

where \(\omega _{1q}\) is the smallest element of \(\varvec{\varOmega }_1\), and a is a value between 0 and 1. When \(a=0\), we recover the latent subspace structure, while \(a=1\) results in the \(\delta _{1i}\) being equally spaced along the line between \(\omega _{1q}\) and \(\eta _1\). A graphical example of how this changes the eigenvalues of \(\varvec{\Sigma }_{1}\) for different values of a is provided in Fig. 10.

Fig. 10

A depiction of the trailing eigenvalues for different values of the parameter a. We see that as a increases, the eigenvalues move up like a drawbridge. This eliminates the subspace structure from \(\varvec{\Sigma }_1\)

In this figure, \(q=2\), and hence the first two points plotted correspond to \(\varvec{\varOmega }_1\). The case \(a=0\), which corresponds to the black line, gives a constant line at \(\eta _1\), recovering the latent subspace structure. As a increases, the eigenvalues are lifted above \(\eta _1\) at different rates, causing them to take different values and eliminating the latent subspace structure. For our simulation, the value \(a=0.5\) corresponds to M4 and the value \(a=1\) corresponds to M5.
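
A minimal R sketch of this perturbation is given below; the values of \(p\), \(q\), \(\eta _1\) and \(\omega _{1q}\) are placeholders chosen only to show how the trailing eigenvalues change with a.

```r
# "Drawbridge" perturbation of the trailing eigenvalues of Sigma_1:
# a = 0 keeps them constant at eta_1 (M3), a = 0.5 gives M4, and a = 1
# spaces them evenly between omega_1q and eta_1 (M5).
trailing_eigenvalues <- function(a, eta_1, omega_1q, p, q) {
  i <- seq_len(p - q)
  eta_1 + a * (omega_1q - eta_1) / (p - q - 1) * (p - q - i)
}

# Example with p = 8, q = 2, eta_1 = 0.5, omega_1q = 3 (columns: a = 0, 0.5, 1).
sapply(c(0, 0.5, 1), trailing_eigenvalues,
       eta_1 = 0.5, omega_1q = 3, p = 8, q = 2)
```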

A central component of MFSF is the assumption that the model covariance matrix is formed as the Kronecker product of two lower dimensional matrices. Under parameter specifications that overlap with the funHDDC model, this gives the \(\varvec{\varOmega }\) matrix the form in Eq. (28). From this structure, we see that \(\varvec{\varOmega }\) will always have repeated eigenvalues, owing to the terms involving \(\eta _1\varvec{\varOmega }_2\) and \(\eta _2\varvec{\varOmega }_1\). This property is a direct result of the Kronecker product assumption; hence, if we transform \(\varvec{\varOmega }\) so that no repeated values appear, the resulting model will no longer satisfy the MFSF modelling assumptions. Note that under the funHDDC assumptions, \(\varvec{\varOmega }\) is arbitrary (aside from being diagonal with nonnegative entries), so these assumptions will still be satisfied.

Let \((\omega _{ij})\) be the sorted vector of the repeated eigenvalues of \(\varvec{\varOmega }\) under M3, where i indexes the unique eigenvalues, and j indexes the repetitions of each, and let \(\{\omega _i\}\) be the corresponding set of unique values. We specify the relationship between the eigenvalues using a linear model. If \(\omega _i\) belongs to \(\eta _1\varvec{\varOmega }_2\) then,

$$\begin{aligned} \omega _{ij} = \omega _{i} + a\left[ \frac{\omega _{i-1} - \omega _i}{p-q}\right] (p-q-j), \end{aligned}$$

where a is again some value between 0 and 1. If \(\omega _i\) belongs to \(\eta _2\varvec{\varOmega }_1\), then,

$$\begin{aligned} \omega _{ij} = \omega _{i} + a\left[ \frac{\omega _{i-1} - \omega _i}{b-d}\right] (b-d-j). \end{aligned}$$

This approach works in exactly the same manner as the previous one. By increasing a, we raise each set of repeated eigenvalues like a drawbridge towards the preceding eigenvalue. Doing so removes all repetitions among the eigenvalues, and hence removes the Kronecker structure. Setting \(a=0.5\) corresponds to model M2 and \(a=1\) corresponds to model M1.
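
The R sketch below illustrates this second perturbation on a single block of repeated eigenvalues; the block size and eigenvalue values are placeholders rather than those used in the simulation.

```r
# Lift one block of repeated eigenvalues towards the preceding unique
# eigenvalue: at a = 0 the block stays at its M3 value omega_i, and as a
# increases the repetitions spread out, removing the Kronecker structure.
lift_block <- function(a, omega_prev, omega_i, n_rep) {
  j <- seq_len(n_rep)
  omega_i + a * (omega_prev - omega_i) / n_rep * (n_rep - j)
}

# A block of p - q = 4 repetitions of omega_i = 2 below omega_{i-1} = 5
# (columns: a = 0, 0.5, 1).
sapply(c(0, 0.5, 1), lift_block, omega_prev = 5, omega_i = 2, n_rep = 4)
```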

In our study, for each of the 12 scenarios, we generated one parameter set according to the specification M3 and then modified it according to the rules described above to obtain the parameters for models M1–M5. Lacking a clever way of choosing the particular values of these parameters ourselves, we resorted to generating them randomly. Generation proceeded in the following manner. The mean matrix for the first group, denoted by \(\varvec{M}_1\), was generated from \(\mathscr {N}_{p\times b}(\varvec{0}, \varvec{I}_p, \varvec{I}_b)\), the standard matrix normal distribution. The mean matrix of the second group, denoted by \(\varvec{M}_2\), was obtained by adding to \(\varvec{M}_1\) a random pb-dimensional vector of length \(\rho \), so that \(\left\Vert \varvec{M}_1 - \varvec{M}_2\right\Vert = \rho \), where \(\rho \) is specified by each of the experimental conditions. Define the random variables \(\mathscr {U}_1 \sim \text {Unif}(5,5.5)\) and \(\mathscr {U}_2 \sim \text {Unif}(0.5,5)\). Let \(\varvec{\varOmega }_g^\star \) be a \(q_g \times q_g\) diagonal matrix with elements composed of iid draws from \(\mathscr {U}_1\), and let \(\eta _g^\star \) be a single draw from \(\mathscr {U}_2\). Construct a p-dimensional diagonal matrix \(\varvec{\varDelta }_g^\star \) from these by specifying the diagonal as \(\varvec{\varOmega }_g^\star \) followed by \(p-q_g\) copies of \(\eta _g^\star \). We then set,

$$\begin{aligned} \varvec{\varOmega }_{1g}&= \left| \varvec{\varDelta }_g^\star \right| ^{-1/p} \varvec{\varOmega }_g^\star , \quad \text {and,}\\ \eta _{1g}&= \left| \varvec{\varDelta }_g^\star \right| ^{-1/p} \eta _g^\star \end{aligned}$$

from which we can then construct \(\varvec{\varGamma }_{1g}\) and \(\varvec{\varXi }_{1g}\). The parameter \(\varvec{\varDelta }_{2g}\) is generated similarly, but with p replaced by b and \(q_g\) replaced with \(d_g\). For simplicity, we specify that the eigenvector matrix is equal to the identity for each group. In each of the 12 experimental conditions, the hyperparameters \(q_g\) and \(d_g\) are set to 2 and 3, respectively. This results in a value of k for the funHDDC model of 34 in the low-dimensional settings and 114 in the high-dimensional settings.
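
As an illustration of the group-mean construction described above, the R snippet below generates \(\varvec{M}_1\) and shifts it by a random direction rescaled to length \(\rho \), so that \(\left\Vert \varvec{M}_1 - \varvec{M}_2\right\Vert = \rho \); the dimensions and the value of \(\rho \) are placeholders, not the settings used in the study.

```r
# Sketch of the group-mean construction (toy dimensions and separation).
set.seed(3)
p <- 5; b <- 4; rho <- 2

M_1 <- matrix(rnorm(p * b), nrow = p, ncol = b)  # standard matrix normal

shift <- rnorm(p * b)                            # random direction in R^{pb}
shift <- rho * shift / sqrt(sum(shift^2))        # rescale to length rho
M_2   <- M_1 + matrix(shift, nrow = p, ncol = b)

sqrt(sum((M_1 - M_2)^2))                         # equals rho up to rounding
```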

About this article

Cite this article

Sharp, A., Browne, R. A joint latent factor analyzer and functional subspace model for clustering multivariate functional data. Stat Comput 32, 68 (2022). https://doi.org/10.1007/s11222-022-10128-9
