Nonparametric hierarchical mixture models based on asymmetric Gaussian distribution
Introduction
With the rapid advancement of digital technologies, the need for effective visual data modeling techniques has become increasingly urgent. Among these techniques, finite mixture models have been widely used, as an efficient unsupervised learning approach, in domains such as image processing and pattern recognition [1] [2]. Such models can discover the structure of extracted visual features and classify them into distinct groups.
A challenging problem when applying finite mixture models is model selection (i.e., determining the model's complexity), since an inappropriate number of mixture components can result in poor generalization capability. Numerous studies have been devoted to automatically selecting the number of components that best describes the data, for example via maximum likelihood methods that incorporate a model selection criterion. Recently, nonparametric Bayesian methods, especially Dirichlet process (DP) mixture models, have been widely adopted to address the model selection problem [3] [4].
Mixture models that allow the number of clusters to grow to infinity as new data arrive can be viewed as nonparametric models [5]. In one of our earlier works [6], we constructed a DP mixture of asymmetric Gaussian distributions (AGD) that allows simultaneous feature selection for video background subtraction. The DP is a stochastic process, parameterized by a positive scaling factor and a base distribution, that defines a distribution over discrete distributions. A sound alternative to the DP is the Pitman-Yor process (PYP), which can be viewed as a generalization of the DP prior for nonparametric Bayesian modeling [7]. In this paper, we are interested in Bayesian nonparametric models based on the Dirichlet and Pitman-Yor processes.
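Both priors admit a stick-breaking construction for their mixing weights, which is also what the truncated variational approximations in later sections build on. As a minimal, hedged sketch (the function name and truncation level are illustrative, not taken from the paper), DP weights and their PYP generalization can be sampled as follows: the DP breaks each stick with a Beta(1, α) draw, while the PYP uses Beta(1 − d, α + kd) with a discount parameter d.

```python
import numpy as np

def stick_breaking_weights(alpha, discount=0.0, truncation=20, rng=None):
    """Truncated stick-breaking weights.

    discount=0.0 yields a Dirichlet process; 0 < discount < 1 yields
    a Pitman-Yor process (the DP's generalization)."""
    rng = np.random.default_rng() if rng is None else rng
    # k-th stick proportion: Beta(1 - d, alpha + k * d), k = 1, 2, ...
    betas = np.array([rng.beta(1.0 - discount, alpha + (k + 1) * discount)
                      for k in range(truncation)])
    # Length of stick remaining before each break
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

# DP weights (discount = 0) vs. heavier-tailed PYP weights (discount = 0.5)
w_dp = stick_breaking_weights(alpha=1.0)
w_pyp = stick_breaking_weights(alpha=1.0, discount=0.5)
```

Because the construction is truncated, the weights sum to slightly less than one; the residual mass corresponds to the unused tail of components.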
Hierarchical Bayesian models have been an attractive research topic and have been successfully applied in various fields such as language modeling and image segmentation [8] [9]. For modeling grouped data with shared clusters, hierarchical nonparametric Bayesian approaches, namely hierarchical DP or hierarchical PYP mixtures, are considered. Within each group, observations are drawn independently from a mixture model, and the number of observations may differ across groups. Under the hierarchical setting, parameters are shared among groups, and the randomness of these parameters induces dependencies between groups. Another crucial problem when dealing with vectors of visual descriptors, which we address within our nonparametric Bayesian framework, is parameter estimation. Inference for the resulting models is conducted in a Bayesian setting by means of a variational Bayes technique and a gradient ascent method, namely black box variational inference (BBVI).
Another challenging problem when considering mixture models within nonparametric Bayesian frameworks is the choice of the base distribution. The Gaussian distribution has enjoyed great popularity in many fields since it provides interpretable results and is easily generalized to new tasks [10]. Despite its success and usefulness in many application domains, it is not always an adequate choice and suffers from limitations with asymmetrically shaped data, which frequently appears in image processing tasks [11]. This is especially the case for natural images. To achieve more accurate approximation and better modeling performance, we consider the AGD, which is well suited to modeling asymmetric data: this distribution has separate left and right standard deviations to capture the asymmetry of the data [12].
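To make the asymmetry concrete, a univariate AGD can be written with a shared normalizing constant and a side-dependent standard deviation. The sketch below is our own illustrative implementation of this standard density form, not the paper's code; when the left and right deviations coincide, it reduces to the ordinary Gaussian.

```python
import numpy as np

def agd_pdf(x, mu, sigma_l, sigma_r):
    """Univariate asymmetric Gaussian density with separate left (sigma_l)
    and right (sigma_r) standard deviations around the mode mu."""
    x = np.asarray(x, dtype=float)
    norm = np.sqrt(2.0 / np.pi) / (sigma_l + sigma_r)  # shared normalizer
    sigma = np.where(x < mu, sigma_l, sigma_r)         # side-dependent spread
    return norm * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Symmetric case (sigma_l == sigma_r) recovers the standard Gaussian at its mode
assert np.isclose(agd_pdf(0.0, 0.0, 1.0, 1.0), 1.0 / np.sqrt(2.0 * np.pi))
```

The normalizer is shared by both halves, so the density stays continuous at the mode and integrates to one for any positive pair (sigma_l, sigma_r).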
The major contributions of this work can be summarized as follows: Firstly, we propose two efficient nonparametric hierarchical models based on DP and PYP mixtures with AGD. Secondly, we develop efficient learning algorithms to estimate both models' parameters through an inference framework that integrates coordinate ascent variational inference (CAVI) and the BBVI method. The proposed nonparametric hierarchical Bayesian models and learning algorithms are validated on a real-life application, namely dynamic texture clustering. It is worth noting that the complexity of the proposed method remains lower than that of MCMC. Furthermore, model selection is performed simultaneously with parameter estimation in the proposed algorithm, whereas MCMC methods usually handle model selection as a separate preprocessing or postprocessing step. Overall, this also leads to faster convergence of the proposed method compared to the alternatives.
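BBVI avoids model-specific gradient derivations by estimating the ELBO gradient with Monte Carlo samples from the variational distribution, using the score-function (REINFORCE) estimator. The following is a generic, hedged sketch of that estimator; all function names are illustrative and not tied to the paper's specific update equations.

```python
import numpy as np

def bbvi_gradient(sample_q, log_q, grad_log_q, log_joint, lam,
                  n_samples=100, rng=None):
    """Score-function Monte Carlo estimate of the ELBO gradient:
       grad_lam ELBO ~= (1/S) * sum_s grad_lam log q(theta_s | lam)
                        * (log p(x, theta_s) - log q(theta_s | lam))."""
    rng = np.random.default_rng() if rng is None else rng
    grads = []
    for _ in range(n_samples):
        theta = sample_q(lam, rng)              # theta_s ~ q(. | lam)
        score = grad_log_q(theta, lam)          # gradient of log q w.r.t. lam
        grads.append(score * (log_joint(theta) - log_q(theta, lam)))
    return np.mean(grads, axis=0)
```

In practice this noisy estimate is plugged into a stochastic gradient ascent loop; variance-reduction tricks (Rao-Blackwellization, control variates) are commonly needed, which the sketch omits.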
The remainder of this paper is organized as follows: Section 2 describes the background of the hierarchical DP mixture and the hierarchical PYP mixture, and defines them via the stick-breaking construction. Section 3 develops the variational inference framework to optimize the evidence lower bound and estimate the parameters of the resulting models. Section 4 presents the complete learning algorithms for our approaches. Section 5 applies the proposed approaches to the challenging task of dynamic texture clustering. Finally, Section 6 concludes the paper.
Section snippets
Hierarchical infinite asymmetric Gaussian mixture
In this section, we briefly introduce our hierarchical DP mixture model of AGD, which may also be referred to as the hierarchical infinite asymmetric Gaussian mixture model.
Variational approximation
Variational inference is a well-established method for approximating probability densities through optimization [23] [24]. The idea behind variational inference is to approximate the true posterior distribution with a suitable approximating distribution chosen from a restricted family, defined over the set of latent variables in the HDPAGM model.
The objective of variational inference is to discover the closest parameters in the constrained variational
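For reference, the evidence lower bound (ELBO) that this optimization maximizes can be written in generic notation (with Θ denoting the latent variables and q the variational distribution):

```latex
\log p(\mathcal{X})
  = \underbrace{\mathbb{E}_{q(\Theta)}\big[\log p(\mathcal{X},\Theta)\big]
    - \mathbb{E}_{q(\Theta)}\big[\log q(\Theta)\big]}_{\mathcal{L}(q)\;\text{(ELBO)}}
  \;+\; \mathrm{KL}\big(q(\Theta)\,\|\,p(\Theta\mid\mathcal{X})\big)
```

Since the KL divergence is nonnegative and log p(X) is fixed, maximizing L(q) over the variational family is equivalent to minimizing the KL divergence between q and the true posterior.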
Learning algorithm
An important aspect when applying variational inference is convergence assessment. In our work, we trace convergence systematically by monitoring the ELBO. Convergence is reached when the change in the ELBO between consecutive epochs falls below a small threshold, or when the number of iterations exceeds 300. The Bayesian inference framework of the HDPAGM is summarized in Algorithm 1.
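The convergence criterion above can be sketched as a generic monitoring loop. This is an illustrative skeleton only: in the actual algorithm, `update_step` would perform one full round of the CAVI/BBVI updates and `elbo` would evaluate the current evidence lower bound.

```python
def run_inference(update_step, elbo, tol=1e-6, max_iters=300):
    """Iterate variational updates until the ELBO improvement drops
    below `tol` or `max_iters` is reached. Returns the number of
    iterations performed and the final ELBO value."""
    prev = float("-inf")
    for it in range(max_iters):
        update_step()                 # one round of variational updates
        current = elbo()              # evidence lower bound after the round
        if abs(current - prev) < tol:
            return it + 1, current    # converged
        prev = current
    return max_iters, prev            # iteration budget exhausted
```

Because CAVI updates never decrease the ELBO, a shrinking improvement is a reliable stopping signal; the iteration cap guards against slow plateaus.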
The detailed learning equations of HPYPAGM are presented in the Appendix A. The complete learning algorithm is summarized in Algorithm 2.
Experimental results
We evaluate the effectiveness of the proposed HDP and HPYP mixture models with AGD on the challenging application of dynamic texture clustering. In our experiments, we initialize the global truncation level K and the group-level truncation level T to 120 and 60, respectively. For the HDP mixture, the hyperparameters of the stick lengths ω and α are initialized to 0.25; the corresponding parameters of the HPYP mixture are likewise set to 0.25.
The hyperparameters of the asymmetric Gaussian base distribution are
Conclusion
In this paper, we have presented a statistical clustering framework based on AGD. This framework is developed from nonparametric Bayesian priors. We have proposed and implemented an effective variational inference framework to estimate the latent variables of hierarchical infinite mixtures. For parameter estimation, we adopt a tenable fully factorized assumption over the family of variables to optimize the lower bound of the likelihood of the models. The effectiveness of these models is
CRediT authorship contribution statement
Ziyang Song: Methodology, Software, Writing - original draft. Samr Ali: Writing - review & editing. Nizar Bouguila: Supervision. Wentao Fan: Conceptualization.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The completion of this research was possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC) grant number 6656-2017 and the National Natural Science Foundation of China (61876068).
References (36)
Model-based classification using latent Gaussian mixture models, J. Stat. Plan. Inference (2010)
Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images, Pattern Recognit. (2008)
Improving object detection with boosted histograms, Image Vis. Comput. (Apr. 2009)
Bayesian learning of finite generalized Gaussian mixture models on images, Signal Process. (2011)
Simultaneous high-dimensional clustering and feature selection using asymmetric Gaussian mixture models, Image Vis. Comput. (Feb. 2015)
DynTex: a comprehensive database of dynamic textures, Pattern Recognit. Lett. (2010)
A nonparametric Bayesian learning model using accelerated variational inference and feature selection, Pattern Anal. Appl. (Feb. 2019)
Axially symmetric data clustering through Dirichlet process mixture models of Watson distributions, IEEE Trans. Neural Netw. Learn. Syst. (June 2019)
Bayesian nonparametric modelling with the Dirichlet process regression smoother, Stat. Sin. (2010)
Bayesian learning of infinite asymmetric Gaussian mixture models for background subtraction
A hierarchical Bayesian language model based on Pitman-Yor processes
Hierarchical Bayesian Nonparametric Models with Applications
Online learning of hierarchical Pitman-Yor process mixture of generalized Dirichlet distributions with feature selection, IEEE Trans. Neural Netw. Learn. Syst.
Gaussian assumption: the least favorable but the most useful [lecture notes], IEEE Signal Process. Mag.
A Bayesian analysis of some nonparametric problems, Ann. Stat.
Random Number Generation and Monte Carlo Methods
Independent Random Sampling Methods
Hierarchical Dirichlet processes, J. Am. Stat. Assoc.
Cited by (8)
Decoding of auditory surprise in adult magnetoencephalography data using Bayesian models, Digital Signal Processing: A Review Journal (2024)
Expectation propagation learning of finite and infinite Gamma mixture models and its applications, Multimedia Tools and Applications (2023)
Enhancing Human Action Recognition with Asymmetric Generalized Gaussian Mixture Model-Based Hidden Markov Models and Bounded Support, Proc. IEEE Int. Conf. on Systems, Man and Cybernetics (2023)
Refining Nonparametric Mixture Models with Explainability for Smart Building Applications, Proc. IEEE Int. Conf. on Systems, Man and Cybernetics (2023)
Data Mining Approach Based on Hierarchical Gaussian Mixture Representation Model, Intelligent Automation and Soft Computing (2023)
Effective Frameworks Based on Infinite Mixture Model for Real-World Applications, Computers, Materials and Continua (2022)
Ziyang Song is currently a M.Sc. student at the Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada. His research interests include machine learning, probabilistic graphical model and Bayesian inference.
Samr Ali received her M.Sc. degree in Electrical and Computer Engineering from Abu Dhabi University in 2017. She is currently a Ph.D. candidate in the Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada. She is also the recipient of the prestigious FRQNT award in 2019. Her current research interests include machine learning, pattern recognition, data mining, and computer vision.
Nizar Bouguila received the degree in engineering from the University of Tunis, in 2000, and the M.Sc. and Ph.D. degrees from Sherbrooke University, in 2002 and 2006, respectively, all in computer science. He is currently a Professor with the Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada. His research interests include image processing, machine learning, 3D graphics, computer vision, and pattern recognition.
Wentao Fan received the M.Sc. and Ph.D. degrees in electrical and computer engineering from Concordia University, Montreal, QC, Canada, in 2009 and 2014, respectively. He is currently an Associate Professor with the Department of Computer Science and Technology, Huaqiao University, Xiamen, China. His research interests include machine learning, computer vision, and pattern recognition.