
Robust Optimization for Clustering

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9622)


Abstract

In this paper, we investigate robust optimization for the minimum sum-of-squares clustering (MSSC) problem. Each data point is assumed to belong to a box-type uncertainty set. Following the robust optimization paradigm, we obtain a robust formulation that can be interpreted as a combination of the MSSC and k-median clustering criteria. A DCA-based algorithm is developed to solve the resulting robust problem. Preliminary numerical results on real datasets show that the proposed robust optimization approach is superior to the MSSC and k-median clustering approaches.
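For orientation, the two clustering criteria that the robust formulation combines can be written, in standard form (our notation; the paper derives the exact combined objective in the body), as

$$\begin{aligned} \text {MSSC:} \;\; \min _{c_1,\dots ,c_k} \sum _{i=1}^n \min _{1 \le j \le k} \Vert x_i - c_j\Vert ^2, \qquad \text {k-median:} \;\; \min _{c_1,\dots ,c_k} \sum _{i=1}^n \min _{1 \le j \le k} \Vert x_i - c_j\Vert _1. \end{aligned}$$

Intuitively, taking the worst case of a squared distance over a box \(\Vert \delta \Vert _\infty \le r\) produces the squared term plus weighted absolute-value terms, which is exactly the mixed form of the one-dimensional subproblem (13) treated in the appendix.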


Notes

  1.

    http://archive.ics.uci.edu/ml/.


Acknowledgements

This research is funded by the Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT), website: http://fostect.tdt.edu.vn, under Grant FOSTECT.2015.BR.15.

Author information

Correspondence to Xuan Thanh Vo.


A Appendix

Given \(a, a_i \in \mathbb {R}\), \(b_i \in \mathbb {R}_{++}\) (\(i=1,\dots ,m\)), consider the problem

$$\begin{aligned} \min \left\{ f(x) := \frac{1}{2} (x - a)^2 + \sum _{i=1}^m b_i |x - a_i| : x \in \mathbb {R}\right\} . \end{aligned}$$
(13)

Assume that \(\{a_i\}_{i=1}^m\) is in ascending order \(a_1 \le a_2 \le \dots \le a_m\). Denote by \(f^-(x)\) (resp. \(f^+(x)\)) the left (resp. right) derivative of f at x. We have

$$\begin{aligned} f^-(x) = x - a + \sum _{i=1}^m b_i \delta _i, \quad f^+(x) = x - a + \sum _{i=1}^m b_i \sigma _i, \end{aligned}$$
(14)

where \(\delta _i = -1\) if \(x \le a_i\) and 1 otherwise, and \(\sigma _i = -1\) if \(x < a_i\) and 1 otherwise (\(i = 1,\dots ,m\)). For convenience, let \(a_0 = -\infty \) and \(a_{m+1} = +\infty \). Since the objective of (13) is strongly convex, its solution \(x^*\) is unique. We can locate \(x^*\) using the following property.

Proposition 1

Let \(\bar{a} = {{\mathrm{arg\,min}}}\left\{ \sum _{i=1}^m b_i |x - a_i| : x \in \mathbb {R}\right\} \), \(l_b = \min (a,\bar{a})\), and \(u_b = \max (a,\bar{a})\). We have the following assertions:

  (i) \(x^* \in [l_b,u_b]\).

  (ii) If \(f^-(a_i) > 0\), then \(x^* \in (a_0,a_i)\). If \(f^+(a_i) < 0\), then \(x^* \in (a_i,a_{m+1})\). If \(f^-(a_i) \le 0\) and \(f^+(a_i) \ge 0\), then \(x^* = a_i\).

Proof

Let g be any finite convex function on \(\mathbb {R}\), and denote by \(g^-\) and \(g^+\) its left and right derivatives. Then \(\partial g(x) = [g^-(x),g^+(x)]\) for any \(x \in \mathbb {R}\), and \(\tilde{x} \in \mathop {{{\mathrm{arg\,min}}}}_{x \in \mathbb {R}} g(x) \Leftrightarrow 0 \in \partial g(\tilde{x}) \Leftrightarrow g^-(\tilde{x}) \le 0 \le g^+(\tilde{x})\). Moreover, if \(\tilde{x}\) is the unique minimizer of g on \(\mathbb {R}\), then for any \(x \ne \tilde{x}\),

$$\begin{aligned} 0 > g(\tilde{x}) - g(x) \ge y (\tilde{x}-x), \quad \forall y \in \partial g(x). \end{aligned}$$

Therefore, \(g^+(x) < 0\) if \(x < \tilde{x}\), and \(g^-(x) > 0\) if \(x > \tilde{x}\). Applying this with \(g = f\) and \(\tilde{x} = x^*\) proves assertion (ii).

Let \(f_1(x) = \frac{1}{2}(x-a)^2\) and \(f_2(x) = \sum _{i=1}^m b_i |x - a_i|\), so that \(f = f_1 + f_2\). Both \(f_1\) and \(f_2\) are finite convex functions on \(\mathbb {R}\), and a (resp. \(\bar{a}\)) is a minimizer of \(f_1\) (resp. \(f_2\)) on \(\mathbb {R}\). Without loss of generality, assume that \(a \le \bar{a}\). Since \(f_2\) attains its minimum at \(\bar{a}\), we have \(f_2^+(a) \le 0\) when \(a < \bar{a}\), and \(f_2^+(\bar{a}) \ge 0\). Hence \(f^+(a) = f_2^+(a) \le 0\) and \(f^+(\bar{a}) = (\bar{a} - a) + f_2^+(\bar{a}) \ge 0\), so by the characterization of one-sided derivatives above, \(a \le x^* \le \bar{a}\) (when \(a = \bar{a}\), also \(f^-(a) = f_2^-(a) \le 0\), so \(x^* = a\)). Thus assertion (i) is proved.
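As a quick illustration of Proposition 1 (a toy instance of ours, not from the paper): take \(m = 1\), \(a = 0\), \(a_1 = 1\), \(b_1 = 2\). By (14), \(f^-(1) = 1 - 0 - 2 = -1 \le 0\) and \(f^+(1) = 1 - 0 + 2 = 3 \ge 0\), so assertion (ii) yields \(x^* = a_1 = 1\). This is consistent with assertion (i): here \(\bar{a} = 1\), so \(x^* \in [\min (0,1), \max (0,1)] = [0,1]\).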

Once an interval \((a_i, a_{i+1})\) containing \(x^*\) has been located, f is differentiable there, and \(x^*\) is determined by solving the equation \(f'(x) = 0\). The full procedure for finding the solution \(x^*\) of problem (13) is given in Algorithm 1, which clearly terminates after at most m steps.

[Algorithm 1 (the procedure for solving problem (13)) appears as an image in the original.]
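Since Algorithm 1 is only available as an image in this version, the following is a minimal Python sketch of the scan it describes, reconstructed from the derivatives (14) and Proposition 1. The function name, the use of NumPy, and the assumption of distinct breakpoints are ours, not the paper's.

import numpy as np

def solve_problem_13(a, a_pts, b_wts):
    """Minimize f(x) = 0.5*(x - a)**2 + sum_i b_i * |x - a_i| over the reals.

    Scans the breakpoints a_1 <= ... <= a_m using the one-sided derivatives (14):
    with P_i = b_1 + ... + b_i and B = P_m, one has
    f^-(a_i) = a_i - a + 2*P_{i-1} - B  and  f^+(a_i) = f^-(a_i) + 2*b_i.
    """
    order = np.argsort(a_pts)
    a_srt = np.asarray(a_pts, dtype=float)[order]
    b_srt = np.asarray(b_wts, dtype=float)[order]
    B = b_srt.sum()
    prefix = 0.0                           # P_{i-1}: total weight of breakpoints already passed
    for a_i, b_i in zip(a_srt, b_srt):
        f_minus = a_i - a + 2.0 * prefix - B
        if f_minus > 0:                    # x* lies in (a_{i-1}, a_i), where f is smooth:
            return a - (2.0 * prefix - B)  # root of f'(x) = x - a + 2*P_{i-1} - B
        if f_minus + 2.0 * b_i >= 0:       # f^-(a_i) <= 0 <= f^+(a_i), so x* = a_i
            return a_i
        prefix += b_i
    return a - B                           # x* lies in (a_m, +inf): root of f'(x) = x - a + B

On the toy instance above, solve_problem_13(0.0, [1.0], [2.0]) returns 1.0, matching the hand computation. The loop visits each breakpoint once, so the scan terminates after at most m steps, as stated for Algorithm 1.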


Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vo, X.T., Le Thi, H.A., Pham Dinh, T. (2016). Robust Optimization for Clustering. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.P. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9622. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49390-8_65


  • DOI: https://doi.org/10.1007/978-3-662-49390-8_65

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49389-2

  • Online ISBN: 978-3-662-49390-8

  • eBook Packages: Computer Science, Computer Science (R0)
