ABSTRACT
Traditional data mining techniques are designed to model a single type of heterogeneity, such as multi-task learning for task heterogeneity or multi-view learning for view heterogeneity. Recently, a variety of real applications have emerged that exhibit dual heterogeneity, i.e., both task heterogeneity and view heterogeneity; examples include insider threat detection across multiple organizations and web image classification in different domains. Existing methods for such problems typically assume that all tasks are equally related and all views are equally consistent, which limits their applicability in complex settings with varying task relatedness and view consistency. In this paper, we advance the state of the art by adaptively modeling task relatedness and view consistency via a nonparametric Bayes model: we model task relatedness using a normal penalty with sparse covariances, and view consistency using a matrix Dirichlet process. Based on this model, we propose the NOBLE algorithm, built on an efficient Gibbs sampler. Experimental results on multiple real data sets demonstrate the effectiveness of the proposed algorithm.
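The matrix Dirichlet process and the Gibbs sampler behind NOBLE are not detailed in this abstract. As a rough, self-contained sketch of the standard Dirichlet process machinery such nonparametric Bayes models build on, the snippet below shows a truncated stick-breaking construction of DP mixture weights and a Chinese restaurant process draw of a random partition (e.g., of tasks into related groups). All function names and parameters here are illustrative, not taken from the paper.

```python
import numpy as np

def stick_breaking(alpha, num_sticks, rng):
    """Truncated stick-breaking construction of Dirichlet process weights.

    Draws beta_k ~ Beta(1, alpha) and sets w_k = beta_k * prod_{j<k}(1 - beta_j).
    The truncated weights sum to slightly less than 1.
    """
    betas = rng.beta(1.0, alpha, size=num_sticks)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

def crp_partition(alpha, n, rng):
    """Sample a partition of n items via the Chinese restaurant process.

    Item i joins existing cluster k with probability proportional to its
    size, or starts a new cluster with probability proportional to alpha.
    """
    assignments = [0]
    counts = [1]          # counts[k] = size of cluster k
    for _ in range(1, n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):      # open a new cluster
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

rng = np.random.default_rng(0)
weights = stick_breaking(alpha=1.0, num_sticks=50, rng=rng)
partition = crp_partition(alpha=1.0, n=20, rng=rng)
```

In a full DP mixture sampler, a draw like `partition` would be interleaved with updates of per-cluster parameters inside each Gibbs sweep; the paper's model additionally structures such partitions over the task-by-view matrix, which this sketch does not attempt.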
Learning with Dual Heterogeneity: A Nonparametric Bayes Model