Abstract
Latent Dirichlet allocation (LDA) is a popular topic model for extracting common patterns from discrete datasets. The pachinko allocation model (PAM) extends it with a hierarchical topic structure. This paper presents a combination meal allocation (CMA) model, a topic model that further extends PAM to have both hierarchical categories and hierarchical topics. We consider count datasets in multiway arrays, i.e., tensors, and introduce a set of topics to each mode of the tensors. The topics in each mode are interpreted as patterns in the topics and categories in the next mode. Although multilevel categories yield a vast number of combinations, our model provides simple and interpretable patterns by sharing the topics in each mode. Latent topics and their membership are estimated using Markov chain Monte Carlo (MCMC) methods. We apply the proposed model to step-count data recorded by activity monitors to extract common activity patterns exhibited by the users. Our model identifies four daily patterns of ambulatory activities (commuting, daytime, nighttime, and early-bird activities) as sub-topics, and six weekly activity patterns as super-topics. We also investigate how the amount of activity in each pattern dynamically affects body weight changes.
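The two-level structure described in the abstract can be illustrated with a small generative sketch. This is not the authors' CMA model or their MCMC estimator; it is a toy simulation under assumed structure, in which each step event first picks a weekly super-topic, the super-topic picks a (day, sub-topic) pair, and the sub-topic picks an hour. The function name `sample_counts`, the array layout, and all numeric values are hypothetical.

```python
import random


def sample_counts(n_steps, user_weights, super_topics, sub_topics, rng):
    """Simulate one user's step events; returns counts[day][hour].

    Assumed (hypothetical) layout:
      user_weights[s]        weight of weekly super-topic s for this user
      super_topics[s][d][k]  weight of (day d, sub-topic k) under super-topic s
      sub_topics[k][h]       weight of hour h under daily sub-topic k
    """
    n_days = len(super_topics[0])
    n_subs = len(sub_topics)
    n_hours = len(sub_topics[0])
    pairs = [(d, k) for d in range(n_days) for k in range(n_subs)]
    counts = [[0] * n_hours for _ in range(n_days)]
    for _ in range(n_steps):
        # 1. Pick a weekly pattern (super-topic) for this event.
        s = rng.choices(range(len(super_topics)), weights=user_weights)[0]
        # 2. The super-topic is a distribution over (day, sub-topic) pairs.
        pair_weights = [super_topics[s][d][k] for d, k in pairs]
        d, k = rng.choices(pairs, weights=pair_weights)[0]
        # 3. The daily pattern (sub-topic) is a distribution over hours.
        h = rng.choices(range(n_hours), weights=sub_topics[k])[0]
        counts[d][h] += 1
    return counts


# Toy setting: 2 days, 4 hour slots, 2 sub-topics, 2 super-topics.
sub_topics = [
    [0.5, 0.5, 0.0, 0.0],  # "morning" pattern
    [0.0, 0.0, 0.5, 0.5],  # "evening" pattern
]
super_topics = [
    [[0.5, 0.0], [0.5, 0.0]],  # mornings on both days
    [[0.0, 0.5], [0.0, 0.5]],  # evenings on both days
]
user_weights = [1.0, 0.0]  # this user follows only the first weekly pattern

counts = sample_counts(1000, user_weights, super_topics, sub_topics,
                       random.Random(0))
total = sum(sum(row) for row in counts)
evening = sum(row[2] + row[3] for row in counts)
```

Stacking such day-by-hour count matrices over users gives a three-way count tensor of the kind the abstract refers to. Because the same sub-topics are reused by every super-topic, the number of free patterns stays small even though the number of (day, hour) combinations grows.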
Acknowledgement
This study was conducted as part of the “Research and Development on Utilization of Fundamental Technologies for Social Big Data” (178A04) project of NICT (National Institute of Information and Communications Technology).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nomura, S., Watanabe, M., Oguma, Y. (2021). Hierarchical Topic Model for Tensor Data and Extraction of Weekly and Daily Patterns from Activity Monitor Records. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_3
DOI: https://doi.org/10.1007/978-3-030-75015-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75014-5
Online ISBN: 978-3-030-75015-2
eBook Packages: Computer Science, Computer Science (R0)