Abstract
In multi-task learning, several different but related tasks are solved simultaneously, and extracting and exploiting the relationships among these tasks can yield predictors with strong generalization ability. Unfortunately, the optimization objectives of multi-task learning are commonly non-convex, and traditional gradient-based methods are of limited use on such problems. Previous studies mainly focused on relaxing the objective function into a convex surrogate, but such relaxations distort the original problem. This paper instead tackles the original objective with derivative-free methods, which can handle complex non-convex problems but usually suffer from slow convergence. We investigate combining derivative-free and gradient-based optimization to inherit the advantages of both, and apply this mixed method to solve multi-task learning problems with a low-rank constraint directly. Experimental results show that the mixed method achieves better optimization performance than either the derivative-free or the gradient method alone.
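To illustrate the general idea of mixing the two families of optimizers (this is a minimal sketch on a one-dimensional toy problem, not the authors' actual algorithm for the low-rank multi-task objective): a derivative-free sampling phase explores broadly to escape local minima, and a gradient phase refines the incumbent locally. The function `mixed_optimize` and all parameter names below are hypothetical illustrations.

```python
import math
import random

# Toy non-convex objective with many local minima.
def f(x):
    return x * x + 3.0 * math.sin(5.0 * x)

def grad_f(x):
    return 2.0 * x + 15.0 * math.cos(5.0 * x)

def mixed_optimize(x0, rounds=20, grad_steps=50, lr=0.01, sigma=0.5,
                   samples=20, seed=0):
    """Alternate a derivative-free exploration phase (Gaussian sampling
    around the incumbent, keeping the best candidate) with a
    gradient-descent refinement phase."""
    rng = random.Random(seed)
    x = x0
    for _ in range(rounds):
        # Derivative-free phase: sample candidates, keep the best seen.
        candidates = [x + sigma * rng.gauss(0.0, 1.0) for _ in range(samples)]
        x = min(candidates + [x], key=f)
        # Gradient phase: refine within the current basin.
        for _ in range(grad_steps):
            x -= lr * grad_f(x)
    return x

x_star = mixed_optimize(x0=3.0)
```

Gradient descent alone from `x0 = 3.0` would settle in a nearby local minimum; the sampling phase lets the search jump between basins, while the gradient phase converges quickly inside each one — the trade-off the paper exploits.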
Acknowledgment
This research was supported by the NSFC (61375061), JiangsuSF (BK20160066), Foundation for the Author of National Excellent Doctoral Dissertation of China (201451), and 2015 Microsoft Research Asia Collaborative Research Program.
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
Cite this paper
Hu, Y., Yu, Y. (2016). A Multi-task Learning Approach by Combining Derivative-Free and Gradient Methods. In: Gong, M., Pan, L., Song, T., Zhang, G. (eds) Bio-inspired Computing – Theories and Applications. BIC-TA 2016. Communications in Computer and Information Science, vol 681. Springer, Singapore. https://doi.org/10.1007/978-981-10-3611-8_41
Print ISBN: 978-981-10-3610-1
Online ISBN: 978-981-10-3611-8