Abstract
In this work, we consider the problem of learning regression models from a finite set of functional objects. In particular, we introduce a novel framework for learning a Gaussian process model on the space of Strictly Non-decreasing Distribution Functions (SNDF). Gaussian processes (GPs) provide powerful tools for non-parametric regression and uncertainty quantification on vector spaces. Here, we equip the SNDF space with a Riemannian structure and learn a GP model indexed by SNDF. This formulation makes it possible to define an appropriate covariance function, extending the Matérn family of covariance functions. We also show how the full Gaussian process methodology, namely covariance parameter estimation and prediction, can be carried out on the SNDF space. The proposed method is evaluated on multiple simulations and validated on real-world data.
The authors thank the ANITI program (Artificial and Natural Intelligence Toulouse Institute) and the ANR Project RISCOPE (Risk-based system for coastal flooding early warning). J.M. Loubes acknowledges funding from DEEL-IRT and C. Samir acknowledges funding from CNRS Prime.
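To make the covariance construction concrete, here is a minimal numerical sketch (not the authors' implementation): distributions are represented by densities \(f_i\) on a common grid, mapped to square-root densities \(\phi_i = \sqrt{f_i}\) on the unit sphere of \(L^2([0,1])\), and a Matérn kernel is applied to \(\Vert \log _{1}( \phi _i) - \log _{1}( \phi _j) \Vert\), the covariance form used in the proofs below. The function names, the grid, and the Matérn 5/2 choice are illustrative assumptions.

```python
import numpy as np

def sphere_log_at_one(phi, dt):
    """Log map at the constant function 1 on the unit sphere of L^2([0, 1]),
    applied to a square-root density phi sampled on a regular grid."""
    inner = np.clip(np.sum(phi) * dt, -1.0, 1.0)   # <1, phi>_{L^2}
    theta = np.arccos(inner)                        # geodesic distance to 1
    if theta < 1e-12:
        return phi - 1.0
    return (theta / np.sin(theta)) * (phi - inner)

def matern_52(d, sigma2=1.0, ell=1.0):
    """Matérn 5/2 covariance evaluated at a tangent-space distance d."""
    r = np.sqrt(5.0) * d / ell
    return sigma2 * (1.0 + r + r ** 2 / 3.0) * np.exp(-r)

def sndf_cov_matrix(densities, dt, sigma2=1.0, ell=1.0):
    """K[i, j] = k(|| log_1(phi_i) - log_1(phi_j) ||_{L^2}), phi_i = sqrt(f_i)."""
    logs = [sphere_log_at_one(np.sqrt(f), dt) for f in densities]
    n = len(logs)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            d = np.sqrt(np.sum((logs[i] - logs[j]) ** 2) * dt)
            K[i, j] = K[j, i] = matern_52(d, sigma2, ell)
    return K

# Usage: two Beta densities on a grid; by Proposition 1 the resulting matrix
# is a valid covariance matrix, so its eigenvalues are non-negative.
grid = np.linspace(0.0, 1.0, 501)
dt = grid[1] - grid[0]
f1 = 6.0 * grid * (1.0 - grid)         # Beta(2, 2) density
f2 = 12.0 * grid ** 2 * (1.0 - grid)   # Beta(3, 2) density
K = sndf_cov_matrix([f1, f2], dt, sigma2=1.0, ell=0.5)
print(np.linalg.eigvalsh(K))
```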
A Proofs
Proof (of Proposition 1).
Let \(F_1,\dots ,F_n\) in \( {{\mathcal {F}}}\). For \(i=1,\ldots ,n\), let \(g_i = \log _{1}( \phi _i )\). Consider the matrix \(\tilde{C}=( \langle g_i,g_j \rangle )_{i,j}\). This matrix is a Gram matrix in \({\mathbb {R}}^{n \times n}\), hence there exist a non-negative diagonal matrix D and an orthogonal matrix P such that \( \tilde{C} = P D P^t. \)
Let \(e_1,\dots ,e_n\) be the canonical basis of \({\mathbb {R}}^n\). Then \( e_i^t \tilde{C} e_j= u_i^t u_j \) where \(u_i^t=e_i^t P D^{1/2}\). Note that the \(u_i\)'s are vectors in \({\mathbb {R}}^n\) that depend on \(F_1,\dots ,F_n\). We get that \( \Vert g_i - g_j \Vert ^2 = e_i^t \tilde{C} e_i - 2 e_i^t \tilde{C} e_j + e_j^t \tilde{C} e_j = \Vert u_i - u_j \Vert ^2, \) and so for any \(F_1,\dots ,F_n\) in \( {{\mathcal {F}}}\) there are \(u_1,\dots ,u_n\) in \({\mathbb {R}}^n\) such that \( \Vert \log _{1}( \phi _i) - \log _{1}( \phi _j) \Vert = \Vert u_i - u_j \Vert \) for all \(i,j=1,\dots ,n\).
So any covariance matrix that can be written as \([ K(\Vert \log _{1}( \phi _i) - \log _{1}( \phi _j) \Vert ) ]_{i,j}\) can be seen as a covariance matrix \([ K(\Vert u_i-u_j\Vert ) ]_{i,j}\) on \({\mathbb {R}}^n\) and inherits its properties. The invertibility and non-negativity of this covariance matrix entail the invertibility and non-negativity of the first one, which proves the result.
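This factorization argument can be checked numerically. The following sketch is illustrative only: it uses randomly generated vectors as stand-ins for the discretized tangent elements \(g_i\), and verifies that the spectral factorization \(\tilde{C} = P D P^t\) yields vectors \(u_i\) with matching inner products and pairwise distances.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1.0 / 200
# Hypothetical stand-ins for the tangent elements g_i, discretized on [0, 1].
g = rng.normal(size=(5, 200))

# Gram matrix C~_{ij} = <g_i, g_j>_{L^2}.
C = g @ g.T * dt

# Spectral factorization C~ = P D P^t (D >= 0 since C~ is a Gram matrix).
eigvals, P = np.linalg.eigh(C)
D_sqrt = np.sqrt(np.clip(eigvals, 0.0, None))

# u_i^t = e_i^t P D^{1/2}, so the rows of U are the u_i.
U = P * D_sqrt  # scales column k of P by sqrt(d_k)

# Check: inner products, hence pairwise distances, are preserved.
assert np.allclose(U @ U.T, C)
i, j = 1, 3
dist_g = np.sqrt(np.sum((g[i] - g[j]) ** 2) * dt)
dist_u = np.linalg.norm(U[i] - U[j])
assert np.isclose(dist_g, dist_u)
```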
Proof (of Theorem 1).
Let \(\theta _1 , \theta _2 \in \varTheta \), with \(\theta _1 \ne \theta _2\). Then, there exists \(t^* \in [0, \pi /4] \) so that \(K_{\theta _1}(0) - K_{\theta _1}(2 t^*) \ne K_{ \theta _2 }(0) - K_{ \theta _2 }( 2t^*)\).
For \(i \in {\mathbb {N}}\), let \(c_i:[0,1] \rightarrow {\mathbb {R}}\) be defined by \(c_i(t) = t^* \cos (2 \pi i t)\). Then, \(c_i \in T_1({\mathcal {H}})\). Let \(\tilde{e}_i = \exp _{1}( c_i )\). Then, for \(t \in [0,1]\), \( \tilde{e}_i(t) = \cos ( \Vert c_i \Vert ) + \sin ( \Vert c_i \Vert ) \, \frac{c_i(t)}{\Vert c_i \Vert } > 0, \) since \(t^* \in [0, \pi /4]\).
It follows that \(\tilde{e}_i \in {\mathcal {H}}\) and we can let \(\tilde{F}_i(t) = \int _{0}^t \tilde{e}_i(s)^2 ds\). Letting \(\bar{e}_i = \exp _{1}( -c_i )\), we obtain similarly that \(\bar{e}_i \in {\mathcal {H}}\) and we let \(\bar{F}_i(t) = \int _{0}^t \bar{e}_i(s)^2 ds\).
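For illustration, this construction of \(\tilde{F}_i\) and \(\bar{F}_i\) can be reproduced on a grid. The sketch below assumes the \(L^2\) sphere geometry for the exponential map at the constant function 1; the grid and the choice \(t^* = \pi/8\), \(i = 3\) are arbitrary.

```python
import numpy as np

# c_i(t) = t* cos(2 pi i t) is mapped onto the sphere by exp_1 and its
# square is integrated to give the distribution function F~_i.
t_star = np.pi / 8
grid = np.linspace(0.0, 1.0, 1001)
dt = grid[1] - grid[0]

def exp_at_one(c, dt):
    """Exponential map at the constant function 1 on the unit sphere of L^2."""
    norm = np.sqrt(np.sum(c ** 2) * dt)
    return np.cos(norm) + np.sin(norm) * c / norm

i = 3
c_i = t_star * np.cos(2.0 * np.pi * i * grid)
e_tilde = exp_at_one(c_i, dt)           # positive on [0, 1] for t* <= pi/4
F_tilde = np.cumsum(e_tilde ** 2) * dt  # F~_i(t) = int_0^t e~_i(s)^2 ds
e_bar = exp_at_one(-c_i, dt)
F_bar = np.cumsum(e_bar ** 2) * dt
print(F_tilde[-1], F_bar[-1])           # both close to 1: valid CDFs on [0, 1]
```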
Consider the 2n elements \((F_1,\dots ,F_{2n})\) composed of the pairs \(( \tilde{F}_i,\bar{F}_i)\) for \(i=1,\dots ,n\). Consider a Gaussian process Z on \( {{\mathcal {F}}}\) with mean function zero and covariance function \(K_{ \theta _1 }\). Then, the Gaussian vector \(W = (Z(F_i))_{i=1,\dots ,2n}\) has covariance matrix C given by \( C = \big [ K_{ \theta _1 }( \Vert \log _{1}( \phi _i ) - \log _{1}( \phi _j ) \Vert ) \big ]_{i,j=1,\dots ,2n}. \)
Hence, we have \(C = D + M\) where M is the matrix with all components equal to \(K_{ \theta _1 }( \sqrt{2} t^*)\) and where D is block diagonal, composed of n blocks of size \(2 \times 2\), with each block \(B_{2,2}\) equal to \( B_{2,2} = \begin{pmatrix} K_{ \theta _1 }(0) - K_{ \theta _1 }( \sqrt{2} t^*) & K_{ \theta _1 }(2 t^*) - K_{ \theta _1 }( \sqrt{2} t^*) \\ K_{ \theta _1 }(2 t^*) - K_{ \theta _1 }( \sqrt{2} t^*) & K_{ \theta _1 }(0) - K_{ \theta _1 }( \sqrt{2} t^*) \end{pmatrix}. \)
Hence, in distribution, \(W = M + E\), with M and E independent, \(M=(z,\dots ,z)\) where \(z \sim {\mathcal {N}}(0,K_{ \theta _1 }( \sqrt{2} t^*))\) and where the n pairs \((E_{2k+1},E_{2k+2})\), \(k=0,\dots ,n-1\), are independent, with distribution \({\mathcal {N}}(0,B_{2,2})\). Hence, with \(\bar{W}_1 = (1/n) \sum _{k=0}^{n-1} W_{2k+1}\), \(\bar{W}_2 = (1/n) \sum _{k=0}^{n-1} W_{2k+2}\) and \(\bar{E} = (1/n) \sum _{k=0}^{n-1} (E_{2k+1},E_{2k+2})^t\), we have, for the empirical covariance matrix \( \hat{B} = \frac{1}{n} \sum _{k=0}^{n-1} \big [ (W_{2k+1},W_{2k+2})^t - (\bar{W}_1,\bar{W}_2)^t \big ] \big [ (W_{2k+1},W_{2k+2})^t - (\bar{W}_1,\bar{W}_2)^t \big ]^t = \frac{1}{n} \sum _{k=0}^{n-1} \big [ (E_{2k+1},E_{2k+2})^t - \bar{E} \big ] \big [ (E_{2k+1},E_{2k+2})^t - \bar{E} \big ]^t , \) since the common component z cancels in the centering. By the law of large numbers, \(\hat{B} \rightarrow B_{2,2}\) in probability as \(n \rightarrow \infty \).
Hence, there exists a subsequence \(n' \rightarrow \infty \) along which, almost surely, \(\hat{B} \rightarrow B_{2,2}\). In particular, almost surely \(\hat{B}_{1,1} - \hat{B}_{1,2} \rightarrow K_{ \theta _1 }(0) - K_{ \theta _1 }(2 t^*)\) as \(n' \rightarrow \infty \), so the event \(\{ \hat{B}_{1,1} - \hat{B}_{1,2} \rightarrow _{n' \rightarrow \infty } K_{ \theta _1 }(0) - K_{ \theta _1 }(2 t^*) \}\) has probability one under \({\mathbb {P}}_{\theta _1}\). With the same arguments, the event \(\{ \hat{B}_{1,1} - \hat{B}_{1,2} \rightarrow _{n'' \rightarrow \infty } K_{ \theta _2 }(0) - K_{ \theta _2 }(2 t^*) \}\) has probability one under \({\mathbb {P}}_{\theta _2}\), where \(n''\) is a subsequence extracted from \(n'\). Since \(K_{ \theta _1 }(0) - K_{ \theta _1 }(2 t^*) \ne K_{ \theta _2 }(0) - K_{ \theta _2 }(2 t^*)\), these two events are disjoint, and it follows that \({\mathbb {P}}_{\theta _1}\) and \({\mathbb {P}}_{\theta _2}\) are orthogonal. Hence, \(\theta \) is microergodic.
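The mechanism behind this proof, that centering the pairs filters out the common component z so the empirical covariance recovers \(B_{2,2}\), can be illustrated by simulation. In the sketch below the numerical values are arbitrary stand-ins for \(K_\theta (0)\), \(K_\theta ( \sqrt{2} t^*)\) and \(K_\theta (2 t^*)\), not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # number of pairs; large n mimics the n -> infinity argument

# Illustrative stand-ins for K_theta(0), K_theta(sqrt(2) t*), K_theta(2 t*).
K0, Ksqrt2, K2 = 1.0, 0.6, 0.4
B = np.array([[K0 - Ksqrt2, K2 - Ksqrt2],
              [K2 - Ksqrt2, K0 - Ksqrt2]])   # the block B_{2,2}

z = rng.normal(scale=np.sqrt(Ksqrt2))        # common component M = (z, ..., z)
L = np.linalg.cholesky(B)
E = rng.normal(size=(n, 2)) @ L.T            # i.i.d. pairs with covariance B
W = z + E                                    # W = M + E in distribution

# Empirical covariance of the pairs: the shared z cancels after centering.
B_hat = np.cov(W, rowvar=False, bias=True)
print(B_hat[0, 0] - B_hat[0, 1], "vs", K0 - K2)  # recovers K(0) - K(2 t*)
```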