Abstract
Longitudinal data consist of repeated observations of individuals through time. They often exhibit rich statistical features, such as skew or multimodality, that are difficult to capture using traditional parametric methods. To address this, we build a non-parametric Markov transition model for longitudinal data. Our approach uses kernel mean embeddings to learn a transition model that can express complex statistical features. We also propose an approximate data subsampling technique, based on kernel herding and random Fourier features, that allows our method to scale to large longitudinal data sets. We demonstrate our approach on two real-world data sets.
Notes
- 1. KDE is a closely related method, but we only use positive definite kernels. Without this requirement, we lose all the theoretical benefits discussed in this paper.
- 2. A positive definite kernel (or just a kernel) \(k_\mathcal {X}\) defined on a measurable space \(\mathcal {X}\) satisfies \(\sum _{i=1}^n \sum _{j=1}^n c_i c_j k_\mathcal {X}(x_i, x_j) \ge 0\) for any \(n \in \mathbb {N}\), \(c_1, \dots , c_n \in \mathbb {R}\), and \(x_1, \dots , x_n \in \mathcal {X}\).
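The positive definiteness condition in note 2 can be checked numerically on a finite sample. The sketch below is illustrative only: it assumes a Gaussian kernel (a standard positive definite kernel, not necessarily the one used in the paper), synthetic points, and hypothetical helper names `gaussian_gram` and `quadratic_form`. It evaluates the double sum \(\sum _{i} \sum _{j} c_i c_j k_\mathcal {X}(x_i, x_j)\) for random coefficient vectors and also inspects the smallest eigenvalue of the Gram matrix.

```python
import numpy as np


def gaussian_gram(x, lengthscale=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * lengthscale^2)).

    The Gaussian kernel is an assumption made for this illustration.
    """
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))


def quadratic_form(gram, c):
    """Double sum over i, j of c_i * c_j * k(x_i, x_j), written as c^T K c."""
    return float(c @ gram @ c)


rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))  # synthetic sample points x_1, ..., x_n
gram = gaussian_gram(x)

# Evaluate the quadratic form for many random coefficient vectors c.
values = [quadratic_form(gram, rng.normal(size=50)) for _ in range(1000)]
min_value = min(values)

# Equivalent check: a symmetric Gram matrix is positive semi-definite
# exactly when all of its eigenvalues are non-negative.
min_eig = float(np.linalg.eigvalsh(gram).min())

print("smallest quadratic form:", min_value)
print("smallest eigenvalue:", min_eig)
```

In floating point the smallest eigenvalue can come out as a tiny negative number, so any such check should compare against a small tolerance instead of exactly zero.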
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Shen, D., Ramos, F. (2016). Kernel Embeddings of Longitudinal Data. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_42
DOI: https://doi.org/10.1007/978-3-319-50127-7_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science, Computer Science (R0)