ABSTRACT
We propose a novel statistical method to predict large-scale dyadic response variables in the presence of covariate information. Our approach simultaneously incorporates the effect of covariates and estimates local structure that is induced by interactions among the dyads through a discrete latent factor model. The discovered latent factors provide a predictive model that is both accurate and interpretable. We illustrate our method by working in a framework of generalized linear models, which include commonly used regression techniques like linear regression, logistic regression, and Poisson regression as special cases. We also provide scalable generalized EM-based algorithms for model fitting using both "hard" and "soft" cluster assignments. We demonstrate the generality and efficacy of our approach through large-scale simulation studies and analysis of datasets obtained from certain real-world movie recommendation and internet advertising applications.
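To make the idea concrete, the following is a minimal sketch of a "hard" assignment fit for the simplest special case, a Gaussian response with identity link (linear regression). It is an illustration under stated assumptions, not the algorithm from the paper: the function name `fit_hard_pdlf`, the array layout (`Y` as an m x n response matrix, `X` as an m x n x p covariate array), the cluster counts `k` and `l`, and the alternating update order are all hypothetical choices made for this example. Each dyad (i, j) is modeled as a covariate effect plus an offset shared by its row-cluster/column-cluster block, and the steps alternate between refitting the regression coefficients, the block offsets, and the cluster assignments.

```python
import numpy as np


def fit_hard_pdlf(Y, X, k, l, n_iter=20, seed=0):
    """Hypothetical hard-assignment fit: Y[i, j] ~ X[i, j] @ beta + delta[rows[i], cols[j]].

    Y: (m, n) responses; X: (m, n, p) dyad covariates;
    k, l: assumed numbers of row and column clusters.
    """
    rng = np.random.default_rng(seed)
    m, n, p = X.shape
    rows = rng.integers(k, size=m)   # random initial row clusters
    cols = rng.integers(l, size=n)   # random initial column clusters
    beta = np.zeros(p)
    delta = np.zeros((k, l))         # one offset per (row cluster, column cluster) block
    X_flat = X.reshape(m * n, p)

    for _ in range(n_iter):
        # 1. Refit regression coefficients with the block offsets held fixed.
        offset = delta[rows][:, cols]                      # (m, n)
        beta, *_ = np.linalg.lstsq(X_flat, (Y - offset).ravel(), rcond=None)
        resid = Y - X @ beta                               # (m, n)

        # 2. Each block offset becomes the mean residual inside its block.
        for a in range(k):
            for b in range(l):
                block = resid[np.ix_(rows == a, cols == b)]
                delta[a, b] = block.mean() if block.size else 0.0

        # 3. Move each row to the row cluster with the smallest squared error.
        row_cost = ((resid[:, None, :] - delta[:, cols][None, :, :]) ** 2).sum(axis=2)
        rows = row_cost.argmin(axis=1)

        # 4. Move each column to the column cluster with the smallest squared error.
        col_cost = ((resid.T[:, None, :] - delta[rows].T[None, :, :]) ** 2).sum(axis=2)
        cols = col_cost.argmin(axis=1)

    return beta, delta, rows, cols
```

Because every step minimizes the same squared-error objective over one group of parameters while holding the others fixed, the fit cannot be worse than a covariate-only least-squares regression, which is the starting point when all block offsets are zero. A "soft" variant would replace the argmin steps with posterior cluster probabilities, and other members of the generalized linear model family would replace the squared error with the corresponding log-likelihood.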
Index Terms
- Predictive discrete latent factor models for large scale dyadic data