Abstract
Spherical data (i.e., \(L_2\)-normalized vectors) arise in a variety of real-life applications, such as gesture recognition and gene expression analysis, so modeling sequential spherical data has become an important research topic in recent years. Hidden Markov models (HMMs), as probabilistic graphical models, have proven effective for modeling sequential data in previous work. In this article, we propose a nonparametric hidden Markov model (NHMM) for modeling time series or sequences of spherical data vectors. In our model, the emission distribution of each hidden state is a mixture of von Mises (VM) distributions, which is better suited to spherical data than other popular distributions (e.g., the Gaussian distribution). Because we construct our NHMM by leveraging a Bayesian nonparametric model, namely the Dirichlet process, the number of hidden states and the number of mixture components for each state are adjusted automatically according to the observed data. In addition, to handle high-dimensional data sets that may contain irrelevant or noisy features, we adopt feature selection, i.e., the process of selecting the “best” feature subset for describing the given data set. Specifically, an unsupervised localized feature selection method is incorporated into the developed NHMM, resulting in a unified framework that simultaneously performs data modeling and feature selection. Our model is learned with a convergence-guaranteed algorithm developed through variational Bayes. The advantages of our model are demonstrated by experiments on both synthetic and real-world sequential data sets.
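To make two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows a truncated stick-breaking construction of Dirichlet process weights and the log-density of a mixture of von Mises distributions. For simplicity it assumes the circular (single-angle) von Mises density; the function names, the truncation level `K`, the concentration `alpha`, and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def stick_breaking_weights(v):
    """Map stick proportions v_k to weights pi_k = v_k * prod_{j<k} (1 - v_j)."""
    v = np.asarray(v, dtype=float)
    # Length of stick remaining before each break: 1, (1-v_1), (1-v_1)(1-v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def von_mises_logpdf(theta, mu, kappa):
    """Log-density of the circular von Mises distribution VM(mu, kappa) at angle theta.

    Uses np.i0 (modified Bessel function of order 0), which overflows for
    very large kappa; that case is not handled in this sketch.
    """
    theta, mu, kappa = (np.asarray(a, dtype=float) for a in (theta, mu, kappa))
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))

def vm_mixture_logpdf(theta, weights, mus, kappas):
    """Log-density of a finite von Mises mixture, via log-sum-exp for stability."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))[:, None]
    log_terms = np.log(np.asarray(weights, dtype=float)) + von_mises_logpdf(theta, mus, kappas)
    m = log_terms.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_terms - m).sum(axis=1, keepdims=True))).ravel()

# Hypothetical settings: concentration alpha and truncation level K are assumptions.
rng = np.random.default_rng(0)
alpha, K = 1.0, 10
v = rng.beta(1.0, alpha, size=K)
v[-1] = 1.0  # truncation: the last stick takes all remaining mass
weights = stick_breaking_weights(v)

# Hypothetical component parameters (mean directions and concentrations).
mus = rng.uniform(-np.pi, np.pi, size=K)
kappas = rng.uniform(0.5, 5.0, size=K)
log_density = vm_mixture_logpdf([0.0, 1.0], weights, mus, kappas)
```

Under truncation the weights sum to one, and components that receive negligible weight play no practical role in the mixture, which is the sense in which the number of components adapts to the data. In the model described above, a mixture of this kind would serve as the emission distribution of each hidden state.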
Data availability statement
The data sets analysed during the current study are available in the UCI Machine Learning Repository https://archive.ics.uci.edu.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61876068).
Cite this article
Fan, W., Hou, W. Unsupervised modeling and feature selection of sequential spherical data through nonparametric hidden Markov models. Int. J. Mach. Learn. & Cyber. 13, 3019–3029 (2022). https://doi.org/10.1007/s13042-022-01579-7
DOI: https://doi.org/10.1007/s13042-022-01579-7