Bayesian nonnegative matrix factorization in an incremental manner for data representation

Abstract

Nonnegative matrix factorization (NMF) is a widely used paradigm for feature representation and dimensionality reduction. However, the performance of the NMF model is limited by two critical and challenging problems. One is that the original NMF does not consider the distribution information of the data and parameters, resulting in inaccurate representations. The other is the high computational complexity of online processing. Bayesian approaches have been proposed to address the former problem. However, most existing Bayesian NMF models utilize an exponential prior, which only guarantees the nonnegativity of the parameters without fully exploiting their prior information. Thus, a new Bayesian NMF model is constructed based on a Gaussian likelihood and a truncated Gaussian prior, called the truncated Gaussian-based NMF (TG-NMF) model, in which the truncated Gaussian prior prevents overfitting while ensuring nonnegativity. Furthermore, Bayesian inference-based incremental learning is introduced to reduce the high computational complexity of TG-NMF; the resulting model is called TG-INMF. We adopt variational Bayesian inference to estimate all parameters of TG-NMF and TG-INMF. Experiments on genetic data-based tumor recognition demonstrate that our models are competitive with existing methods on classification problems.



Acknowledgment

The authors would like to thank the journal editor and anonymous reviewers for their constructive comments. This work was supported by the National Natural Science Foundation of China (grant numbers 11701144, 1182002), the Open Fund of the Key Laboratory of Intelligence Perception and Image Understanding of the Ministry of Education, and the Program for Science and Technology Development of Henan Province (grant number 212102310305).

Author information

Corresponding author

Correspondence to Xiaohui Yang.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

First, the optimization problem of the TG-NMF model is solved by the variational Bayesian inference algorithm. The variational posterior distributions are derived from (15), and the update rules are then obtained from those distributions. The derivations for \(W_{ir}\), \(H_{rj}\) and \(\tau\) are given below. For a random variable x and a function f of x, we write \(\widetilde {f(x)}\) as shorthand for \(E_q[f(x)]\).

Following (15), the variational posterior distribution of \(W_{ir}\) is a truncated Gaussian distribution, i.e., \({W_{ir}} \sim TG\left({{W_{ir}}}|\mu _{ir}^{W},(\tau _{ir}^{W})^{-1},0,+\infty \right)\).

$$ \begin{array}{@{}rcl@{}} {q^{*}}\left( {{W_{ir}}} \right) &\propto &\exp \left\{ {{E_{q({{\theta_{-\!{W_{ir}}}}})}}\left[ {\log p\left( {{{V_{ij}}}|{W_{ir}},{H_{rj}}} \right) + \log p\left( {{{W_{ir}}}|\mu_{ir},(\tau_{ir})^{-1},0, + \infty } \right)} \right]} \right\} \times I\left( x \right) \\ &\propto &\exp \left\{ {{E_{q({{\theta_{-\!{W_{ir}}}}})}}\left[ { - \frac{\tau }{2}\sum\limits_{j} {{{\left( {{V_{ij}} - {{\left( {WH} \right)}_{ij}}} \right)}^{2}} - \frac{{{\tau_{ir}}}}{2}{{\left( {{W_{ir}} - \mu_{ir}^{}} \right)}^{2}}} } \right]} \right\} \times I\left( x \right) \\ &\propto &\exp \left\{ {{E_{q({{\theta_{-\!{W_{ir}}}}})}}\left[ {\left( {-\frac{\tau }{2}\sum\limits_{j} {H_{rj}^{2}-\frac{{\tau_{ir}^{}}}{2}}}\right)W_{ir}^{2}+A(W_{ir},H_{rj},\tau){W_{ir}}} \right]} \right\} \times I\left( x \right) \\ &\propto &\exp \left\{{-\frac{{\tau_{ir}^{W}}}{2}{{\left( {{W_{ir}}-\mu_{ir}^{W}}\right)}^{2}}}\right\} \\ &\propto &TG\left( {\left. {{W_{ir}}} \right|\mu_{ir}^{W},(\tau_{ir}^{W})^{-1},0,\infty } \right), \end{array} $$
(24)

In the above formula, the term \(A(W_{ir},H_{rj},\tau)\) is given in (25).

$$ A(W_{ir},H_{rj},\tau)={\tau \sum\limits_{j} {\left( {{V_{ij}}-\sum\limits_{r^{\prime}\ne r} {{W_{ir^{\prime}}}{H_{r^{\prime}j}}} } \right)}{H_{rj}} + {\tau_{ir}}\mu_{ir}}. $$
(25)

The parameters \(\tau _{ir}^{W}\) and \(\mu _{ir}^{W}\) of the variational posterior distribution of \(W_{ir}\) are given in (26) and (27).

$$ \tau_{ir}^{W} = \tilde \tau \sum\limits_{j} {\tilde H_{rj}^{2}}+{\tilde \tau_{ir}}, $$
(26)
$$ \mu_{ir}^{W} = \frac{{\tilde \tau \sum\limits_{j} {\left( {{V_{ij}} - \sum\limits_{r^{\prime} \ne r} {{{\tilde W}_{ir^{\prime}}}{{\tilde H}_{r^{\prime}j}}} } \right){{\tilde H}_{rj}} + {{\tilde \tau }_{ir}}{{\tilde \mu }_{ir}}} }}{{\tilde \tau \sum\limits_{j} {\tilde H_{rj}^{2}} { +}{{\tilde \tau }_{ir}}}}. $$
(27)
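
To make the coordinate updates concrete, below is a minimal NumPy sketch of one sweep of (26) and (27). All names (update_W_posterior, Wt, Ht, Ht2, tau_t, tau0, mu0) are hypothetical, and for brevity the sweep evaluates each column from the current expectations rather than updating in place as strict coordinate ascent would.

```python
import numpy as np

def update_W_posterior(V, Wt, Ht, Ht2, tau_t, tau0, mu0):
    """One sweep of the variational updates (26)-(27) for W.

    V     : (n, m) data matrix
    Wt    : (n, R) expectations E_q[W_ir]
    Ht    : (R, m) expectations E_q[H_rj]
    Ht2   : (R, m) second moments E_q[H_rj^2]
    tau_t : scalar expectation E_q[tau]
    tau0  : (n, R) prior precisions tau_ir
    mu0   : (n, R) prior means mu_ir
    Returns the posterior precisions tau_W and means mu_W.
    """
    n, R = Wt.shape
    tau_W = tau_t * Ht2.sum(axis=1)[None, :] + tau0    # eq. (26)
    mu_W = np.empty_like(Wt)
    for r in range(R):
        # residual with component r removed: V - sum_{r' != r} W_ir' H_r'j
        resid = V - Wt @ Ht + np.outer(Wt[:, r], Ht[r, :])
        num = tau_t * resid @ Ht[r, :] + tau0[:, r] * mu0[:, r]
        mu_W[:, r] = num / tau_W[:, r]                 # eq. (27)
    return tau_W, mu_W
```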

In the same way, the variational posterior distribution of \(H_{rj}\) is also a truncated Gaussian distribution, i.e., \({H_{rj}} \sim TG\left({{H_{rj}}}|\mu _{rj}^{H},(\tau _{rj}^{H})^{-1},0,+\infty \right)\).

$$ \begin{array}{@{}rcl@{}} {q^{*}}\left( {{H_{rj}}} \right) &\propto& \exp \left\{ {{E_{q({{\theta_{-\!{H_{rj}}}}})}}\left[ {\log p\left( {\left. {{V_{ij}}} \right|{W_{ir}},{H_{rj}}} \right) + \log p\left( {\left. {{H_{rj}}} \right|{\mu_{rj}},\tau_{rj}^{- 1},0, + \infty } \right)} \right]} \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ {{E_{q({{\theta_{-\!{H_{rj}}}}})}}\left[ { - \frac{\tau }{2}\sum\limits_{i} {{{\left( {{V_{ij}} - {{\left( {WH} \right)}_{ij}}} \right)}^{2}} - \frac{{{\tau_{rj}}}}{2}{{\left( {{H_{rj}} - \mu_{rj}^{}} \right)}^{2}}} } \right]} \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ {{E_{q({{\theta_{-\!{H_{rj}}}}})}}\left[ {\left( {-\frac{\tau}{2}\sum\limits_{i} {W_{ir}^{2}-\frac{{\tau_{rj}}}2}} \right)H_{rj}^{2}+B(W_{ir},H_{rj},\tau){H_{rj}}} \right]} \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ { - \frac{{\tau_{rj}^{H}}}{2}{{\left( {{H_{rj}} - \mu_{rj}^{H}} \right)}^{2}}} \right\} \\ &\propto& TG\left( {\left. {{H_{rj}}} \right|\mu_{rj}^{H},(\tau_{rj}^{H})^{-1},0, + \infty } \right), \end{array} $$
(28)

The term \(B(W_{ir},H_{rj},\tau)\) appearing in (28) is given in (29).

$$ B(W_{ir},H_{rj},\tau)={\tau \sum\limits_{i} {\left( {{V_{ij}} - \sum\limits_{r^{\prime}\ne r} {{W_{ir^{\prime}}}{H_{r^{\prime}j}}} } \right)} {W_{ir}} + {\tau_{rj}}\mu_{rj}}.\\ $$
(29)

The parameters \(\tau _{rj}^{H}\) and \(\mu _{rj}^{H}\) of the variational posterior distribution of \(H_{rj}\) are given in (30) and (31).

$$ \tau_{rj}^{H} = \tilde \tau \sum\limits_{i} {\tilde W_{ir}^{2}} + {\tilde \tau_{rj}}, $$
(30)
$$ \mu_{rj}^{H} = \frac{{\tilde \tau \sum\limits_{i} {\left( {{V_{ij}} - \sum\limits_{r^{\prime} \ne r} {{{\tilde W}_{ir^{\prime}}}{{\tilde H}_{r^{\prime}j}}} } \right){{\tilde W}_{ir}}} + {{\tilde \tau }_{rj}}{{\tilde \mu }_{rj}}}}{{\tilde \tau \sum\limits_{i} {\tilde W_{ir}^{2}} + {{\tilde \tau }_{rj}}}}. $$
(31)
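
Because (30) and (31) mirror (26) and (27) with the roles of rows and columns exchanged, the earlier sketch can be reused on the transposed problem. A short usage note under the same hypothetical names, with Wt2, tau0_H and mu0_H denoting the second moments of W and the prior precisions and means of H:

```python
# H update via the symmetry of (30)-(31) with (26)-(27):
# transpose V and swap the roles of W and H.
tau_H_T, mu_H_T = update_W_posterior(V.T, Ht.T, Wt.T, Wt2.T,
                                     tau_t, tau0_H.T, mu0_H.T)
tau_H, mu_H = tau_H_T.T, mu_H_T.T   # back to (R, m) orientation
```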

For the parameter τ, the variational posterior distribution takes the same form as the prior distribution, i.e., \(\tau \sim Gamma(\alpha _{\tau }^{*},\beta _{\tau }^{*})\),

$$ \begin{array}{@{}rcl@{}} {q^{*}}\left( \tau \right) &\propto& \exp \left\{ {{E_{q\left( {{\theta_{-\!\tau }}} \right)}}\left[ {\log p\left( {\left. {{V_{ij}}}\right|{W_{ir}},{H_{rj}},{\tau^{-1}}}\right)+ \log p\left( {\left. \tau \right|{\alpha_{\tau} },{\beta_{\tau} }} \right)} \right]} \right\} \\ &\propto& \exp \left\{ {{E_{q\left( {{\theta_{-\!\tau }}}\right)}}\left[ {\frac{{nm}}{2}\log \tau - \frac{\tau }{2}\sum\limits_{i,j} {{{\left( {{V_{ij}} - {{\left( {WH} \right)}_{ij}}} \right)}^{2}} + \left( {{\alpha_{\tau} } - 1} \right)\log \tau - {\beta_{\tau} }\tau } } \right]} \right\} \\ &\propto& \exp \left\{ {\left( {{\alpha_{\tau} } - 1 +\!\frac{{nm}}{2}} \right)\log \tau - \left[ {{\beta_{\tau} } + \frac{1}{2}\sum\limits_{i,j} {{E_{q}}\left[ {{{\left( {{V_{i,j}} - {{\left( {WH} \right)}_{i,j}}} \right)}^{2}}} \right]} } \right]\tau } \right\} \\ &\propto& Gamma\left( \tau|{\alpha_{\tau}^{*},\beta_{\tau}^{*}} \right), \end{array} $$
(32)

where the parameters \(\alpha _{\tau }^{*},\beta _{\tau }^{*}\) corresponding to the variational posterior distribution of τ are shown as follows.

$$ \begin{array}{@{}rcl@{}} &&\alpha_{\tau}^{*} = {\alpha_{\tau} } + \frac{{nm}}{2}, \end{array} $$
(33)
$$ \begin{array}{@{}rcl@{}} &&\beta_{\tau}^{*} = {\beta_{\tau} } + \frac{1}{2}\sum\limits_{i,j} {{E_{q}}\left[ {{{\left( {{V_{ij}} - {{\left( {WH} \right)}_{ij}}} \right)}^{2}}} \right]}. \end{array} $$
(34)
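
For the noise precision, (33) and (34) reduce to two scalar operations. A minimal sketch, assuming the expected squared residual \(E_q[(V_{ij}-(WH)_{ij})^2]\) is approximated by the squared residual at the posterior means (a plug-in simplification that drops the variance terms):

```python
import numpy as np

def update_tau_posterior(V, Wt, Ht, alpha0, beta0):
    """Gamma update (33)-(34) for the noise precision tau.
    Plug-in approximation: the expected squared residual is
    evaluated at the posterior means Wt, Ht.
    """
    n, m = V.shape
    resid = V - Wt @ Ht
    alpha = alpha0 + n * m / 2.0             # eq. (33)
    beta = beta0 + 0.5 * np.sum(resid**2)    # eq. (34), plug-in
    return alpha, beta
```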

The above completes the optimization of the TG-NMF model. In the TG-INMF model, the prior distribution for the current samples equals the posterior distribution of the previous samples, which is obtained from the TG-NMF model. Therefore, based on the optimal estimates of the TG-NMF model, we now give the optimization process of the TG-INMF model.

The parameters to be optimized in the TG-INMF model can be summarized as \(\theta ^{\prime }=\{W_{ir}^{k+1},h_{r}^{k+1},\tau ^{k+1}\}\). According to (15), the variational posterior distribution of \(W_{ir}^{k+1}\) is derived first; \(W_{ir}^{k+1}\) obeys a truncated Gaussian distribution with parameters \(\tau _{ir}^{W^{k+1}}\) and \(\mu _{ir}^{W^{k+1}}\).

$$ \begin{array}{@{}rcl@{}} {q^{*}}\left( {W_{ir}^{k + 1}} \right) &\propto& \exp \left\{ {E_{q({\theta_{ - W_{ir}^{k + 1}}})}}\left[ \log {p_{k + 1}}\left( {{v^{k + 1}}}|\theta^{\prime} \right) + \log p\left( {W_{ir}^{k + 1}}|\mu_{ir}^{W},(\tau_{ir}^{W})^{ - 1},0,+\infty \right) \right] \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ {E_{q({\theta_{ - W_{ir}^{k + 1}}})}}\left[ -\frac{{\tau^{k+1}}}{2}{{\left( {v_{i}^{k + 1} - {{\left( {{W^{k + 1}}{h^{k + 1}}} \right)}_{i}}} \right)}^{2}} - \frac{{\tau_{ir}^{W}}}{2}{{\left( {W_{ir}^{k + 1} - \mu_{ir}^{W}}\right)}^{2}} \right] \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ - \frac{{\tau_{ir}^{{W^{k + 1}}}}}{2}{{\left( {W_{ir}^{k + 1} - \mu_{ir}^{{W^{k+1}}}}\right)}^{2}} \right\} \times I\left( x \right) \\ &\propto& TG\left( {W_{ir}^{k + 1}}|\mu_{ir}^{{W^{k+1}}},(\tau_{ir}^{{W^{k+1}}})^{ - 1},0, + \infty \right), \end{array} $$
(35)

The parameters \(\tau _{ir}^{W^{k+1}}\) and \(\mu _{ir}^{W^{k+1}}\) of the variational posterior distribution of \(W_{ir}^{k+1}\) are given in (36) and (37).

$$ \tau_{ir}^{{W^{k+1}}} = \tilde \tau {\left( {\tilde h_{r}^{k + 1}} \right)^{2}} + \tilde \tau_{ir}^{W}, $$
(36)
$$ \mu_{ir}^{{W^{k+1}}} = \frac{{\tilde \tau \left( \! {v_{i}^{k + 1}\! -\! \sum\limits_{r^{\prime} \ne r} {\tilde W_{ir^{\prime}}^{k + 1}\tilde h_{r^{\prime}}^{k + 1}} } \right)\tilde h_{r}^{k + 1} + \tilde \tau_{ir}^{W}\tilde \mu_{ir}^{W}}}{{\tilde \tau {{\left( {\tilde h_{r}^{k + 1}} \right)}^{2}} + \tilde \tau_{ir}^{W}}}. $$
(37)
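
In the incremental step the previous posterior plays the role of the prior, so (36) and (37) touch only one new sample. A hedged NumPy sketch with hypothetical names, where v_new is the (n,) vector \(v^{k+1}\) and h_new its expected coefficient vector:

```python
import numpy as np

def incremental_update_W(v_new, Wt, h_new, tau_t, tau_W_prev, mu_W_prev):
    """Incremental updates (36)-(37); (tau_W_prev, mu_W_prev) is the
    posterior from the previous k samples, reused here as the prior.
    Second moments are approximated by squared means for brevity.
    """
    R = Wt.shape[1]
    tau_W = tau_t * (h_new**2)[None, :] + tau_W_prev      # eq. (36)
    mu_W = np.empty_like(Wt)
    for r in range(R):
        # residual of the new sample with component r removed
        resid = v_new - Wt @ h_new + Wt[:, r] * h_new[r]
        num = tau_t * resid * h_new[r] + tau_W_prev[:, r] * mu_W_prev[:, r]
        mu_W[:, r] = num / tau_W[:, r]                    # eq. (37)
    return tau_W, mu_W
```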

Second, (38) gives the derivation of the variational posterior distribution of \(h_{r}^{k+1}\),

$$ \begin{array}{@{}rcl@{}} {q^{*}}\left( {h_{r}^{k + 1}} \right) &\propto& \exp \left\{ {E_{q({\theta_{ - h_{r}^{k + 1}}})}}\left[ \log p\left( {v_{i}^{k + 1}}|\theta^{\prime} \right) + \log p\left( {h_{r}^{k + 1}}|\mu_{rj}^{H},(\tau_{rj}^{H})^{ - 1},0, + \infty \right) \right] \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ {E_{q({\theta_{ - h_{r}^{k + 1}}})}}\left[ \sum\limits_{i = 1}^{n} { - \frac{{\tau^{k + 1}}}{2}{{\left( {v_{i}^{k + 1} - {{\left( {{W^{k + 1}}{h^{k + 1}}} \right)}_{i}}} \right)}^{2}} - \frac{{\tau_{rj}^{H}}}{2}{{\left( {h_{r}^{k + 1} - \mu_{rj}^{H}} \right)}^{2}}} \right] \right\} \times I\left( x \right) \\ &\propto& \exp \left\{ - \frac{{\tau_{r}^{{h^{k+1}}}}}{2}{{\left( {h_{r}^{k + 1} - \mu_{r}^{{h^{k + 1}}}} \right)}^{2}} \right\} \\ &\propto& TG\left( {h_{r}^{k + 1}}|\mu_{r}^{{h^{k + 1}}},(\tau_{r}^{{h^{k + 1}}})^{ - 1},0, + \infty \right), \end{array} $$
(38)

where the parameters \(\tau _{r}^{h^{k+1}}\), \(\mu _{r}^{h^{k+1}}\) in the variational posterior distribution of \(h_{r}^{k+1}\) are shown below.

$$ \tau_{r}^{{h^{k + 1}}} = \tilde \tau \sum\limits_{i = 1}^{n} {\left( \tilde W_{ir}^{k + 1} \right)^{2}} + \tilde \tau_{rj}^{H}, $$
(39)
$$ \mu_{r}^{{h^{k + 1}}} = \frac{{\tilde \tau \sum\limits_{i = 1}^{n} {\left( {v_{i}^{k + 1} - \sum\limits_{r^{\prime} \ne r} {\tilde W_{ir^{\prime}}^{k + 1}\tilde h_{r^{\prime}}^{k + 1}} } \right)\tilde W_{ir}^{k + 1}} + \tilde \tau_{rj}^{H}\tilde \mu_{rj}^{H}}}{{\tilde \tau \sum\limits_{i = 1}^{n} {{{\left( {\tilde W_{ir}^{k + 1}} \right)}^{2}}} + \tilde \tau_{rj}^{H}}}. $$
(40)
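
The coefficient update (39)-(40) follows the same pattern along the other dimension. A short sketch under the same hypothetical names and the same plug-in approximation of second moments:

```python
import numpy as np

def incremental_update_h(v_new, Wt, h_new, tau_t, tau_h_prev, mu_h_prev):
    """Incremental updates (39)-(40) for the new sample's coefficients."""
    R = Wt.shape[1]
    tau_h = tau_t * np.sum(Wt**2, axis=0) + tau_h_prev    # eq. (39)
    mu_h = np.empty(R)
    for r in range(R):
        resid = v_new - Wt @ h_new + Wt[:, r] * h_new[r]
        num = tau_t * resid @ Wt[:, r] + tau_h_prev[r] * mu_h_prev[r]
        mu_h[r] = num / tau_h[r]                          # eq. (40)
    return tau_h, mu_h
```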

Finally, the variational posterior distribution of \(\tau^{k+1}\) is derived in (41).

$$ \begin{array}{@{}rcl@{}} {q^{*}}\!\left( {{\tau^{k + 1}}} \right) &\propto& \exp \left\{ {{E_{q({{\theta\!_{ - {\tau^{k + 1}}}}})}}\left[ {\log p\left( \left. {v_{i}^{k + 1}} \right|\theta^{\prime}\right) + \log p\left( {\left. {{\tau^{k + 1}}} \right|{\alpha_{\tau}^{*}},{\beta_{\tau}^{*}}}\right)}\right]}\right\} \\ &\propto&\!\exp\!\left\{ {{E_{q({{\theta_{ - {\tau^{k + 1}}}}})}}\left\{{\left( {{\alpha_{\tau}^{*}} - 1 + \frac{n}{2}} \right)\log {\tau^{k + 1}} - \left[ {{\beta_{\tau}^{*}} + \frac{1}{2}\sum\limits_{i = 1}^{n} {{{\left( {v_{i}^{k + 1} - {{\left( {{W^{k + 1}}{h^{k + 1}}} \right)}_{i}}} \right)}^{2}}} } \right]{\tau^{k + 1}}} \right\}} \right\} \\ &\propto& Gamma(\tau^{k+1}|\alpha^{k+1},\beta^{k+1}), \end{array} $$
(41)

where the parameters \(\alpha^{k+1}\) and \(\beta^{k+1}\) of the new variational posterior distribution are given in (42) and (43).

$$ \begin{array}{@{}rcl@{}} &&{\alpha^{k+1}}= {\alpha_{\tau}^{*}} + \frac{n}{2}, \end{array} $$
(42)
$$ \begin{array}{@{}rcl@{}} &&{\beta^{k+1}}= {\beta_{\tau}^{*}}+\frac{1}{2}\sum\limits_{i = 1}^{n} {{{\left( {v_{i}^{k + 1} - {{\left( {{W^{k + 1}}{h^{k + 1}}} \right)}_{i}}} \right)}^{2}}}. \end{array} $$
(43)
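
Equations (42) and (43) update the Gamma posterior of the precision with the single new residual. A minimal sketch, again evaluating the expected squared residual at the posterior means:

```python
import numpy as np

def incremental_update_tau(v_new, Wt, h_new, alpha_prev, beta_prev):
    """Incremental Gamma update (42)-(43) for tau^{k+1}, with the
    previous posterior (alpha_prev, beta_prev) serving as the prior."""
    resid = v_new - Wt @ h_new
    alpha = alpha_prev + len(v_new) / 2.0       # eq. (42)
    beta = beta_prev + 0.5 * np.sum(resid**2)   # eq. (43), plug-in
    return alpha, beta
```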

After obtaining the variational posterior distributions of the parameters, we can derive the optimal updates of \(W_{ir}^{k+1}\), \(h_{r}^{k+1}\) and \(\tau^{k+1}\).

$$ W^{k+1}_{ir} \leftarrow \mu_{ir}^{{W^{k + 1}}} + \frac{1}{{\sqrt {\tau_{ir}^{{W^{k+1}}}} }}\lambda \left( { - \mu_{ir}^{{W^{k + 1}}}\sqrt {\tau_{ir}^{{W^{k + 1}}}} } \right), $$
(44)
$$ h^{k+1}_{r} \leftarrow \mu_{r}^{{h^{k + 1}}} + \frac{1}{{\sqrt {\tau_{r}^{{h^{k+1}}}} }}\lambda \left( { - \mu_{r}^{{h^{k+1}}}\sqrt {\tau_{r}^{{h^{k + 1}}}} } \right), $$
(45)
$$ \tau^{k+1} \leftarrow \frac{\alpha^{k+1}}{\beta^{k+1}}. $$
(46)
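
The point estimates (44) and (45) are the means of the truncated Gaussian posteriors. A sketch, assuming \(\lambda(x)=\phi(x)/(1-{\Phi}(x))\) denotes the standard normal hazard function (the inverse Mills ratio), which reproduces the truncated-Gaussian mean formula above:

```python
import numpy as np
from scipy.stats import norm

def trunc_gauss_mean(mu, tau):
    """Mean of TG(mu, tau^{-1}, 0, +inf), i.e., the updates (44)-(45)."""
    sigma = 1.0 / np.sqrt(tau)
    z = -mu * np.sqrt(tau)
    lam = norm.pdf(z) / norm.sf(z)   # norm.sf(z) = 1 - Phi(z)
    return mu + sigma * lam

# One full incremental step, chaining the hypothetical helpers above:
# tau_W, mu_W = incremental_update_W(...); W_new = trunc_gauss_mean(mu_W, tau_W)
# tau_h, mu_h = incremental_update_h(...); h_new = trunc_gauss_mean(mu_h, tau_h)
# alpha, beta = incremental_update_tau(...); tau_new = alpha / beta  # eq. (46)
```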

Cite this article

Yang, L., Yan, L., Yang, X. et al. Bayesian nonnegative matrix factorization in an incremental manner for data representation. Appl Intell 53, 9580–9597 (2023). https://doi.org/10.1007/s10489-022-03522-3
