Abstract
We consider the problem of estimating high-dimensional spatial graphical models with a total cardinality constraint (i.e., the \(\ell _0\)-constraint). Though this problem is highly nonconvex, we show that its primal-dual gap diminishes linearly with the dimensionality and provide a convex geometry justification of this “blessing of massive scale” phenomenon. Motivated by this result, we propose an efficient algorithm to solve the dual problem (which is concave) and prove that the solution achieves optimal statistical properties. Extensive numerical results are also provided.
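As a rough illustration of the primal-dual idea in the abstract, the sketch below builds a toy separable problem with one shared cardinality budget and compares its brute-force primal optimum with the maximum of its Lagrangian dual. This is not the paper's estimator or algorithm: the cost tables `costs`, the budget `K`, and the function names are all hypothetical, and each block simply picks a support size `s` with cost `costs[j][s]`.

```python
import itertools


def primal_value(costs, K):
    """Brute-force min of sum_j costs[j][s_j] subject to sum_j s_j <= K."""
    best = float("inf")
    for choice in itertools.product(*(range(len(c)) for c in costs)):
        if sum(choice) <= K:
            best = min(best, sum(c[s] for c, s in zip(costs, choice)))
    return best


def dual_function(costs, K, lam):
    """q(lam) = sum_j min_s (costs[j][s] + lam * s) - lam * K, concave in lam."""
    return sum(min(c[s] + lam * s for s in range(len(c))) for c in costs) - lam * K


def dual_value(costs, K):
    """Maximise q over lam >= 0.

    q is concave and piecewise linear, so its maximum over lam >= 0 is
    attained at lam = 0 or at a positive breakpoint, i.e. a value of lam
    where two candidate support sizes of one block have equal penalised cost.
    """
    candidates = {0.0}
    for c in costs:
        for s, t in itertools.combinations(range(len(c)), 2):
            lam = (c[s] - c[t]) / (t - s)
            if lam > 0:
                candidates.add(lam)
    return max(dual_function(costs, K, lam) for lam in candidates)


if __name__ == "__main__":
    # Hypothetical cost tables: costs[j][s] is block j's loss at support size s.
    toy_costs = [[3.0, 1.0, 0.5], [2.0, 1.5, 1.4], [4.0, 0.0, 0.0]]
    for budget in (1, 2, 3, 6):
        p = primal_value(toy_costs, budget)
        d = dual_value(toy_costs, budget)
        print(f"K={budget}: primal={p:.3f} dual={d:.3f} gap={p - d:.3f}")
```

Weak duality guarantees that the dual value never exceeds the primal value. The dual is a one-dimensional concave maximisation, whereas the brute-force primal search grows exponentially with the number of blocks, which is the kind of trade-off that makes a dual approach attractive when the gap is small.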
Cite this article
Fang, E.X., Liu, H. & Wang, M. Blessing of massive scale: spatial graphical model estimation with a total cardinality constraint approach. Math. Program. 176, 175–205 (2019). https://doi.org/10.1007/s10107-018-1331-z
DOI: https://doi.org/10.1007/s10107-018-1331-z