Abstract
In this paper, we investigate robust optimization for the minimum sum-of-squares clustering (MSSC) problem. Each data point is assumed to belong to a box-type uncertainty set. Following the robust optimization paradigm, we obtain a robust formulation that can be interpreted as a combination of the MSSC and k-median clustering criteria. A DCA-based algorithm is developed to solve the resulting robust problem. Preliminary numerical results on real datasets show that the proposed robust optimization approach is superior to the MSSC and k-median clustering approaches.
Acknowledgements
This research is funded by the Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT), website: http://fostect.tdt.edu.vn, under Grant FOSTECT.2015.BR.15.
A Appendix
Given \(a, a_i \in \mathbb {R}\), \(b_i \in \mathbb {R}_{++}\) (\(i=1,\dots ,m\)), consider the problem
$$ \min \left\{ f(x) := \frac{1}{2}(x-a)^2 + \sum _{i=1}^m b_i |x - a_i| : x \in \mathbb {R}\right\} . \qquad (13)$$
Assume that \(\{a_i\}_{i=1}^m\) is in ascending order \(a_1 \le a_2 \le \dots \le a_m\). Denote by \(f^-(x)\) (resp. \(f^+(x)\)) the left (resp. right) derivative of f at x. We have
$$ f^-(x) = (x-a) + \sum _{i=1}^m b_i \delta _i, \qquad f^+(x) = (x-a) + \sum _{i=1}^m b_i \sigma _i, $$
where \(\delta _i = -1\) if \(x \le a_i\) and 1 otherwise, and \(\sigma _i = -1\) if \(x < a_i\) and 1 otherwise (\(\forall i = 1,\dots ,m\)). For convenience, let \(a_0 = -\infty \) and \(a_{m+1} = +\infty \). Note that (13) is strongly convex, so the solution \(x^*\) is unique. We can locate \(x^*\) by using the following property.
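As a small illustrative check (this instance is ours, not from the paper), take \(m=1\), \(a=0\), \(a_1=1\), \(b_1=1\), so that \(f(x) = \frac{1}{2}x^2 + |x-1|\). Then
$$ f^-(1) = (1-0) + 1\cdot (-1) = 0, \qquad f^+(1) = (1-0) + 1\cdot 1 = 2, $$
so \(f^-(1) \le 0 \le f^+(1)\) and hence \(x^* = 1\).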
Proposition 1
Let \(\bar{a} = {{\mathrm{arg\,min}}}\left\{ \sum _{i=1}^m b_i |x - a_i| : x \in \mathbb {R}\right\} \), \(l_b = \min (a,\bar{a})\), and \(u_b = \max (a,\bar{a})\). We have the following assertions:
- (i) \(x^* \in [l_b,u_b]\).
- (ii) If \(f^-(a_i) > 0\), then \(x^* \in (a_0,a_i)\). If \(f^+(a_i) < 0\), then \(x^* \in (a_i,a_{m+1})\). If \(f^-(a_i) \le 0\) and \(f^+(a_i) \ge 0\), then \(x^* = a_i\).
Proof
Let g be any finite convex function on \(\mathbb {R}\). Then \(\partial g(x) = [g^-(x),g^+(x)]\) for any \(x \in \mathbb {R}\), and \(\tilde{x} \in \mathop {{{\mathrm{arg\,min}}}}_{x \in \mathbb {R}} g(x) \Leftrightarrow 0 \in \partial g(\tilde{x}) \Leftrightarrow g^-(\tilde{x}) \le 0 \le g^+(\tilde{x})\). Moreover, if \(\tilde{x}\) is the unique minimizer of g on \(\mathbb {R}\), then \(g(x) > g(\tilde{x})\) for any \(x \ne \tilde{x}\). Therefore, \(g^+(x) < 0\) if \(x < \tilde{x}\), and \(g^-(x) > 0\) if \(x > \tilde{x}\). Applying this to f proves (ii).
Let \(f_1(x) = \frac{1}{2}(x-a)^2\) and \(f_2(x) = \sum _{i=1}^m b_i |x - a_i|\). Both \(f_1\) and \(f_2\) are finite convex functions on \(\mathbb {R}\), and a (resp. \(\bar{a}\)) is a minimizer of \(f_1\) (resp. \(f_2\)) on \(\mathbb {R}\). Without loss of generality, assume \(a \le \bar{a}\). Then \(f^+(a) = f_2^+(a) \le 0\) (since \(f_1^+(a) = 0\)) and \(f^+(\bar{a}) = f_1^+(\bar{a}) + f_2^+(\bar{a}) \ge 0\) (since both terms are nonnegative). This implies that \(a \le x^* \le \bar{a}\). Thus, (i) is proved.
Once the interval containing \(x^*\) has been located and f is differentiable on it, \(x^*\) is easily determined by solving the equation \(f'(x) = 0\). The specific procedure for finding the solution \(x^*\) of problem (13) is given in Algorithm 1. It is clear that Algorithm 1 terminates after at most m steps.
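The breakpoint-scanning procedure can be sketched as follows. This is our own illustrative implementation of the strategy described above (the function name and signature are ours, not the paper's): scan the sorted breakpoints, test the one-sided derivatives at each, and otherwise solve the linear stationarity condition on the interval where the sign change occurs.

```python
def solve_problem13(a, points):
    """Minimize f(x) = 0.5*(x - a)**2 + sum_i b_i*|x - a_i| over the reals.

    `points` is a list of (a_i, b_i) pairs with b_i > 0 and distinct a_i.
    A sketch of the breakpoint scan described in the appendix (Algorithm 1).
    """
    pts = sorted(points)                 # ascending a_1 <= ... <= a_m
    total = sum(b for _, b in pts)
    below = 0.0                          # sum of b_i over breakpoints a_i < a_k
    for a_k, b_k in pts:
        # one-sided derivatives of f at the breakpoint a_k
        f_minus = (a_k - a) + below - (total - below)   # delta_k = -1 at x = a_k
        f_plus = f_minus + 2.0 * b_k                    # sigma_k = +1 at x = a_k
        if f_minus > 0:
            # x* lies in the preceding interval, where f'(x) = (x - a) + 2*below - total
            return a - (2.0 * below - total)
        if f_plus >= 0:
            return a_k                   # f^-(a_k) <= 0 <= f^+(a_k), so x* = a_k
        below += b_k
    # x* lies to the right of all breakpoints, where f'(x) = (x - a) + total
    return a - total
```

For example, with \(a = 0\), \(a_1 = 1\), \(b_1 = 0.5\), the scan stops at the first breakpoint with \(f^-(1) = 0.5 > 0\) and returns \(x^* = 0.5\), the root of \(f'(x) = x - 0.5\) on \((-\infty, 1)\).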
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
Cite this paper
Vo, X.T., Le Thi, H.A., Pham Dinh, T. (2016). Robust Optimization for Clustering. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science(), vol 9622. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49390-8_65
Print ISBN: 978-3-662-49389-2
Online ISBN: 978-3-662-49390-8