Margin maximization in spherical separation


Abstract

We face the problem of strictly separating two sets of points by means of a sphere, considering the two cases where the center of the sphere is fixed or free, respectively. In particular, for the former we present a fast and simple solution algorithm, whereas for the latter one we use the DC-Algorithm based on a DC decomposition of the error function. Numerical results for both the cases are presented on several classical binary datasets drawn from the literature.
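To make the fixed-center case concrete: once the center is given, a strictly separating squared radius can be placed between the farthest point of \(\mathcal {A}\) and the nearest point of \(\mathcal {B}\). The following is a minimal illustrative sketch, not the paper's algorithm; the function name and the midpoint placement rule are our own assumptions.

```python
import numpy as np

def fixed_center_sphere(A, B, x0):
    """Given points A (to enclose), B (to exclude) and a fixed center x0,
    return a squared radius z placed midway (in squared distance) between
    the two sets, and whether strict separation is possible."""
    dA = np.max(np.sum((A - x0) ** 2, axis=1))  # farthest point of A
    dB = np.min(np.sum((B - x0) ** 2, axis=1))  # nearest point of B
    z = 0.5 * (dA + dB)          # boundary midway between the two sets
    separable = dA < dB          # strict separation possible iff dA < dB
    return z, separable

# Toy example: A near the origin, B far away, center fixed at the origin.
A = np.array([[0.1, 0.2], [-0.3, 0.1], [0.2, -0.4]])
B = np.array([[2.0, 0.0], [0.0, -1.8], [1.5, 1.5]])
z, ok = fixed_center_sphere(A, B, np.zeros(2))
```

When `ok` is true, every point of A has squared distance below `z` and every point of B above it, i.e. the sphere of squared radius `z` strictly separates the two sets.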




Author information

Correspondence to Manlio Gaudioso.

Appendix

In this section we report the proofs of Theorems 2.1, 2.2 and 2.3.

Theorem 2.1

Let \(S\subseteq \mathbb {R}^{n}\) be a nonempty compact set. If \(x_{0}\in S\) and \(C\geq 1/(2k)\), then there exists an optimal solution to problem (5).

Proof

If \(x_{0}\in S\), then

$$ \|b_l-x_0\|\leq \max_{x\in S} \|b_l-x\|\stackrel {\triangle }{=}D_l\geq0, \quad l=1, \ldots, k. $$
(18)

Taking into account the definition of h, the constraint \(z\geq q\) and inequality (18), we obtain

(19)

where \(D\stackrel {\triangle }{=}\sum_{l=1}^{k} D_{l}\).

From the above inequality, if \(C\geq\frac{1}{2k}\), the thesis follows by taking into account that h is coercive on the set \(\varOmega \stackrel {\triangle }{=}\{ (x_{0},z,q)~|~x_{0}\in S \mbox{ and } 0\leq q\leq z \}\). □

Theorem 2.2

There exists an optimal solution for problem (6) with z>0.

Proof

Let \((x_{0}^{*},z^{*},q^{*},\xi^{*},\mu^{*})\) be any optimal solution for problem (6) with \(z^{*}=0\). As a consequence \(q^{*}=0\), \(\xi^{*}_{i}=\|a_{i}-x_{0}^{*}\|^{2}\) for i=1,…,m and \(\mu_{l}^{*}=0\) for l=1,…,k.

We define the sets

$$I\stackrel {\triangle }{=}\bigl\{i\mid 1\leq i\leq m,\ a_i\neq x_0^* \bigr \} \quad \mbox{and}\quad L\stackrel {\triangle }{=}\bigl\{l\mid 1\leq l\leq k,\ b_l\neq x_0^* \bigr\}, $$

which cannot be simultaneously empty, by the assumption that the sets \(\mathcal {A}\) and \(\mathcal {B}\) are disjoint.

Now we consider the solution

$$ \bigl(\bar{x}_0 =x_0^*, \bar{z}, \bar{q}=q^*, \bar{\xi}, \bar{\mu}\bigr), $$
(20)

with

$$\bar{z}= \min_{\scriptsize i\in I,\ l\in L} \bigl\{\|a_i-x_0^* \|^2,\|b_l-x_0^*\|^2 \bigr\}>0, $$

and the components of the vectors \(\bar{\xi}\) and \(\bar{\mu}\) defined as follows:

$$\mbox{(i)} \begin{cases} \bar{\xi}_i=\xi_i^*-\bar{z} &i\in I \\ \bar{\mu}_l=\mu_l^*=0& l\in L, \end{cases} $$

or

$$\mbox{(ii)} \begin{cases} \bar{\xi}_i=\xi_i^*-\bar{z} &i\in I \\ \bar{\xi}_j=\xi_j^*=0& \\ \bar{\mu}_l=\mu_l^*=0& l\in L \end{cases} $$

or

$$\mbox{(iii)} \begin{cases}\bar{\xi}_i=\xi_i^*-\bar{z} &i\in I \\ \bar{\mu}_l=\mu_l^*=0& l\in L \\ \bar{\mu}_j = \bar{z}, \end{cases} $$

according to the three possible cases:

  1. (i) \(x_{0}^{*}\neq a_{i}\) for i=1,…,m and \(x_{0}^{*}\neq b_{l}\) for l=1,…,k;

  2. (ii) \(x_{0}^{*}= a_{j}\) for some j∈{1,…,m};

  3. (iii) \(x_{0}^{*}= b_{j}\) for some j∈{1,…,k}.

It is easy to show that the above solution (20), characterized by \(\bar{z}>0\), is feasible for problem (6), with an objective function value not worse than the one corresponding to the solution \((x_{0}^{*},z^{*},q^{*},\xi^{*},\mu^{*})\). □
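The construction above can be checked numerically on a toy instance. The sketch below is our own illustration, not code from the paper: the feasibility conditions and the objective \(C(\sum_i\xi_i+\sum_l\mu_l)-q\) are reconstructed from (22)–(23) and from the objective expressions appearing in the proof of Theorem 2.3, and should be treated as an assumption about the form of problem (6).

```python
import numpy as np

# Assumed constraint form of problem (6), inferred from (22)-(23):
#   xi_i >= q - z + ||a_i - x0||^2,   mu_l >= q + z - ||b_l - x0||^2,
#   xi, mu, q >= 0.
def feasible(x0, z, q, xi, mu, A, B, tol=1e-12):
    okA = np.all(xi + tol >= q - z + np.sum((A - x0) ** 2, axis=1))
    okB = np.all(mu + tol >= q + z - np.sum((B - x0) ** 2, axis=1))
    return okA and okB and np.all(xi >= -tol) and np.all(mu >= -tol) and q >= -tol

def objective(C, q, xi, mu):
    # Assumed objective of problem (6): C * (sum xi + sum mu) - q.
    return C * (xi.sum() + mu.sum()) - q

A = np.array([[0.5, 0.0], [0.0, 0.5]])   # points to enclose
B = np.array([[2.0, 0.0], [0.0, 2.0]])   # points to exclude
x0, q = np.zeros(2), 0.0

# Degenerate solution with z* = 0 (case (i): x0 coincides with no point).
xi_star = np.sum((A - x0) ** 2, axis=1)
mu_star = np.zeros(len(B))

# Shifted solution of the proof: z_bar > 0, xi_bar = xi* - z_bar.
z_bar = min(np.min(np.sum((A - x0) ** 2, axis=1)),
            np.min(np.sum((B - x0) ** 2, axis=1)))
xi_bar = xi_star - z_bar
```

Both solutions are feasible under the assumed constraints, and the shifted one has an objective value that is no worse, matching the claim of the proof.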

Theorem 2.3

Let \((x_{0}^{*}, z^{*}, q^{*}, \xi^{*}, \mu^{*})\) be any optimal solution of problem (6), with C>1. Then the sphere \(S(x_{0}^{*},R^{*})\), with \(R^{*}=\sqrt{z^{*}}\), strictly separates the sets \(\mathcal {A}\) and \(\mathcal {B}\) if and only if \(q^{*}>0\).

Proof

(⇒) Assume \(S(x_{0}^{*},R^{*})\), with \(R^{*}=\sqrt{z^{*}}\), strictly separates the sets \(\mathcal {A}\) and \(\mathcal {B}\), and let \(q^{*}=0\). Thus we have

$$ \begin{cases} -z^*+\|a_i-x_0^* \|^2<0& \forall i=1,\ldots,m \\ z^*-\|b_l-x_0^*\|^2<0& \forall l=1, \ldots,k \end{cases} $$
(21)

and, consequently,

$$0<\epsilon \stackrel {\triangle }{=}\min_{1\leq i\leq m,~1\leq l\leq k } \bigl\{ z^*- \bigl\| a_i-x_0^* \bigr\|^2,\bigl\|b_l-x_0^*\bigr\|^2-z^* \bigr \} \leq z^*. $$

Note that the optimality of \((x_{0}^{*}, z^{*}, q^{*}, \xi^{*}, \mu^{*})\), together with (21), implies \(\xi^{*}=0\) and \(\mu^{*}=0\), with the optimal objective function value equal to zero.

We now define a feasible solution \((\bar{x}_{0}, \bar{z}, \bar{q}, \bar{\xi}, \bar{\mu})\) to problem (6) as follows:

$$\bar{x}_0=x_0^*,\qquad \bar{z}=z^*,\qquad \bar{q}=\epsilon,\qquad \bar{\xi}=\xi^*=0, \qquad \bar{\mu}=\mu^*=0. $$

It is characterized by an objective function value equal to −ϵ<0, which contradicts the optimality of \((x_{0}^{*}, z^{*}, q^{*}, \xi^{*}, \mu^{*})\).

(⇐) Now suppose \(q^{*}>0\), and assume by contradiction that the sphere \(S(x_{0}^{*},R^{*})\), with \(R^{*}=\sqrt{z^{*}}\), does not strictly separate \(\mathcal {A}\) and \(\mathcal {B}\). Then at least one of the following two cases occurs:

  1. there exists \(a_{j}\in \mathcal {A}\) such that \(\|a_{j}-x_{0}^{*}\|^{2}\geq z^{*}\), which implies

     $$ \xi_j^*\geq q^*-z^* + \bigl\|a_j-x_0^* \bigr\|^2 \geq q^*>0; $$
     (22)

  2. there exists \(b_{j}\in \mathcal {B}\) such that \(\|b_{j}-x_{0}^{*}\|^{2}\leq z^{*}\), which implies

     $$ \mu_j^*\geq q^*+z^* - \bigl\|b_j-x_0^* \bigr\|^2 \geq q^*>0. $$
     (23)

Case 1. The objective function value at the optimal solution is

$$C \Biggl(\sum_{i=1,i\neq j}^m \xi_i^*+\xi_j^*+\sum_{l=1}^k \mu^*_l \Biggr) - q^*. $$

Now we consider the solution \((\bar{x}_{0}, \bar{z}, \bar{q}, \bar{\xi}, \bar{\mu})\) defined as follows:

$$ \begin{cases} \bar{x}_0 = x_0^* \\ \bar{z} = z^* \\ \bar{q}=0 \\ \bar{\xi}_i=\xi_i^*\quad i=1,\ldots,m, \ i\neq j \\ \bar{\xi}_j= \xi_j^*-q^* \\ \bar{\mu}= \mu^*, \end{cases} $$
(24)

which can easily be shown to be feasible for problem (6), taking into account (22). Moreover, the corresponding objective function value is

$$C \Biggl(\sum_{i=1,i\neq j}^m \xi_i^*+\xi_j^*-q^*+\sum _{l=1}^k \mu^*_l \Biggr)=C \Biggl( \sum_{i=1,i\neq j}^m \xi_i^*+ \xi_j^*+\sum_{l=1}^k \mu^*_l \Biggr) - Cq^*. $$

If C>1 then

$$C \Biggl(\sum_{i=1,i\neq j}^m \xi_i^*+\xi_j^*+\sum_{l=1}^k \mu^*_l \Biggr) - Cq^*<C \Biggl(\sum _{i=1,i\neq j}^m \xi_i^*+\xi_j^*+ \sum_{l=1}^k \mu^*_l \Biggr) - q^*, $$

which contradicts the optimality of \((x_{0}^{*}, z^{*}, q^{*}, \xi^{*}, \mu^{*})\).

Case 2. Analogous considerations hold. □
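Theorem 2.3 can also be observed numerically. For a fixed center \(x_0\), the problem in \((z,q,\xi,\mu)\) is a linear program; the sketch below is our own reconstruction (constraints inferred from (22)–(23) plus \(q\leq z\) from the set Ω in Theorem 2.1, objective \(C(\sum\xi+\sum\mu)-q\)), not the paper's algorithm, and the function name is our own.

```python
import numpy as np
from scipy.optimize import linprog

def spherical_lp(A, B, x0, C=2.0):
    """Solve the (assumed) fixed-center version of problem (6) as an LP
    in the variables (z, q, xi_1..m, mu_1..k), with C > 1."""
    dA = np.sum((A - x0) ** 2, axis=1)   # squared distances of A-points
    dB = np.sum((B - x0) ** 2, axis=1)   # squared distances of B-points
    m, k = len(dA), len(dB)
    n = 2 + m + k                        # variable order: z, q, xi, mu
    c = np.concatenate(([0.0, -1.0], C * np.ones(m + k)))  # C*sum(xi,mu) - q
    A_ub, b_ub = [], []
    for i in range(m):                   # q - z - xi_i <= -dA_i
        row = np.zeros(n); row[0], row[1], row[2 + i] = -1.0, 1.0, -1.0
        A_ub.append(row); b_ub.append(-dA[i])
    for l in range(k):                   # q + z - mu_l <= dB_l
        row = np.zeros(n); row[0], row[1], row[2 + m + l] = 1.0, 1.0, -1.0
        A_ub.append(row); b_ub.append(dB[l])
    row = np.zeros(n); row[0], row[1] = -1.0, 1.0     # q <= z
    A_ub.append(row); b_ub.append(0.0)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * n)
    return res.x[0], res.x[1]            # optimal z*, q*

# Strictly separable toy data, center fixed at the origin.
A = np.array([[0.1, 0.2], [-0.3, 0.1]])
B = np.array([[2.0, 0.0], [0.0, -1.8]])
z, q = spherical_lp(A, B, np.zeros(2))
```

On this separable instance the LP returns \(q^*>0\), and the resulting squared radius \(z^*\) lies strictly between the two sets of squared distances, as Theorem 2.3 predicts.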

Cite this article

Astorino, A., Fuduli, A. & Gaudioso, M. Margin maximization in spherical separation. Comput Optim Appl 53, 301–322 (2012). https://doi.org/10.1007/s10589-012-9486-7