
GAME: GAussian Mixture Error-based meta-learning architecture

  • Review
  • Published in: Neural Computing and Applications

Abstract

In supervised learning, the gap between the ground-truth label and the model output is typically measured by an error function, and a fixed error function implicitly corresponds to a specific noise distribution that drives model optimization. In practice, however, the actual noise usually has a much more complex structure. To fit it better, in this paper we propose a robust noise model that embeds a mixture-of-Gaussians (MoG) noise modeling strategy into a baseline classification model, chosen here as the Gaussian mixture model (GMM). Further, to enable automatic selection of the number of mixture components, we apply a penalized likelihood method. We then use an alternating strategy to update the parameters of the noise model and of the underlying GMM classifier. From a meta-learning perspective, the proposed model offers a novel way to define hyperparameters from the error representation. Finally, we compare the proposed approach with three conventional and related classification methods on a synthetic dataset, two benchmark handwriting recognition datasets, and the Yale Face dataset. In addition, we embed the noise modeling strategy into a semantic segmentation task. The numerical results show that our approach achieves the best performance and confirm the effectiveness of MoG noise modeling.



Notes

  1. The MNIST and USPS datasets are available at https://cs.nyu.edu/~roweis/data.html.

  2. The Yale Face dataset is available at http://cvc.yale.edu/projects/yalefaces/yalefaces.html.

  3. The GTA5 dataset is available at https://download.visinf.tu-darmstadt.de/data/from_games/.

  4. The Cityscapes dataset is available at https://www.cityscapes-dataset.com/.


Author information


Corresponding author

Correspondence to Shihui Ying.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (No. 11971296) and the National Key R & D Program of China (No. 2021YFA1003004).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dong, J., Shi, J., Gao, Y. et al. GAME: GAussian Mixture Error-based meta-learning architecture. Neural Comput & Applic 35, 20445–20461 (2023). https://doi.org/10.1007/s00521-023-08843-z

