ABSTRACT
Recently, learning deep models from dense data has received a lot of attention in tasks such as object recognition and signal processing. However, when dealing with non-sensory data about real-world entities, data is often sparse; for example people interaction with products in e-Commerce, people interacting with each other in social networks or word sequences in natural language. In this talk, I will share lessons learned over the past 10 years when learning predictive models based on sparse data: 1) how to scale the inference algorithms to distributed data setting, 2) how to automate the learning process by reducing the amount of hyper-parameters to zero, 3) how to deal with Zipf distributions when learning resource-constrained models, and 4) how to combine dense and sparse-learning algorithms. The talk will be drawing from many real-world experiences I gathered over the past decade in applications of the techniques in gaming, search, advertising and recommendations of systems developed at Microsoft, Facebook and Amazon.
Index Terms
- Learning Sparse Models at Scale
Recommendations
Efficient gradient descent algorithm for sparse models with application in learning-to-rank
Recently, learning-to-rank has attracted considerable attention. Although significant research efforts have been focused on learning-to-rank, it is not the case for the problem of learning sparse models for ranking. In this paper, we consider the sparse ...
Sparse Learning-to-Rank via an Efficient Primal-Dual Algorithm
Learning-to-rank for information retrieval has gained increasing interest in recent years. Inspired by the success of sparse models, we consider the problem of sparse learning-to-rank, where the learned ranking models are constrained to be with only a ...
Multi-task Sparse Structure Learning
CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge ManagementMulti-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously. While sometimes the underlying task relationship structure is known, often the structure needs to be estimated from data at hand. In ...
Comments