Abstract
Building statistical models to explain the association between responses (outputs) and predictors (inputs) is critical in many real applications. In practice, responses may not be independent of one another. A promising direction is to predict related responses together (e.g. Multi-task LASSO). However, not all responses are related to the same degree. The sparse Gaussian conditional random field (SGCRF) was developed to learn the degree of relatedness automatically from the samples, without any prior knowledge. In real cases, features (both predictors and responses) are not arbitrary but are dominated by a smaller set of related latent factors, e.g. clusters. SGCRF does not capture these latent relations. Modeling them could yield more accurate associations between responses and predictors. In this paper, we propose a novel mixed-membership hierarchical Bayesian model, M\(^2\)GCRF, that captures these latent relations in terms of clusters. We develop a variational Expectation-Maximization algorithm to infer the latent relations and the association matrices. We show that M\(^2\)GCRF clearly outperforms existing methods on both synthetic and real datasets, and that the association matrices it identifies are more accurate.
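For orientation, the Gaussian conditional random field underlying SGCRF can be written as follows. This is a sketch in one common parametrization from the SGCRF literature; the symbols \(x\), \(y\), \(\Lambda\), \(\Theta\), \(p\), \(q\), \(\lambda_\Lambda\) and \(\lambda_\Theta\) are assumptions introduced here and may differ from the notation and scaling used in the chapter itself. With predictors \(x \in \mathbb{R}^p\), responses \(y \in \mathbb{R}^q\), a positive definite matrix \(\Lambda \in \mathbb{R}^{q \times q}\) and a matrix \(\Theta \in \mathbb{R}^{p \times q}\):

\[
p(y \mid x; \Lambda, \Theta) = \frac{1}{Z(x)} \exp\left( -\tfrac{1}{2}\, y^\top \Lambda y - x^\top \Theta y \right),
\qquad
y \mid x \sim \mathcal{N}\left( -\Lambda^{-1} \Theta^\top x,\; \Lambda^{-1} \right).
\]

Under this form, the nonzero off-diagonal entries of \(\Lambda\) describe which responses are related to each other given the predictors, and \(\Theta\) plays the role of the association matrix between predictors and responses. A sparse estimate is typically obtained by minimizing the negative conditional log-likelihood plus \(\ell_1\) penalties:

\[
\min_{\Lambda \succ 0,\, \Theta} \; -\sum_{i=1}^{n} \log p\left(y^{(i)} \mid x^{(i)}; \Lambda, \Theta\right) + \lambda_\Lambda \|\Lambda\|_1 + \lambda_\Theta \|\Theta\|_1 .
\]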
Notes
1. A task refers to the prediction task for a response based on the set of given predictors.
Acknowledgement
This work was supported by Hong Kong RGC Ref No. UGC/IDS14/16 and RGC Project No. CityU C1008-16G, AoE/M-403/16.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yang, J., Leung, H.C.M., Yiu, S.M., Chin, F.Y.L. (2017). Mixed Membership Sparse Gaussian Conditional Random Fields. In: Cong, G., Peng, W.-C., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_20
DOI: https://doi.org/10.1007/978-3-319-69179-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69178-7
Online ISBN: 978-3-319-69179-4
eBook Packages: Computer Science, Computer Science (R0)