Learning Algorithms for Enclosing Points in Bregmanian Spheres

  • Conference paper
Learning Theory and Kernel Machines

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

We discuss the problem of finding a generalized sphere that encloses points originating from a single source. The points contained in such a sphere lie within a maximal divergence of a center point. The divergences we study are the Bregman divergences, which include as special cases both the Euclidean distance and the relative entropy. We cast the learning task as an optimization problem and show that it yields a simple dual form with interesting algebraic properties. We then describe a general algorithmic framework for solving the optimization problem. Our training algorithm employs an auxiliary function that bounds the dual objective and can be used with a broad class of Bregman functions. As a specific application of the algorithm, we give a detailed derivation for the relative entropy. We analyze the generalization ability of the algorithm by adapting margin-style proof techniques. We also describe and analyze two online algorithmic schemes for the case in which the radius of the sphere is fixed in advance.
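The abstract names the two canonical members of the Bregman family explicitly. As a quick illustration of the objects involved, here is a minimal Python sketch (not taken from the paper; all function names are ours) that computes a generic Bregman divergence from a convex function F and its gradient, instantiates the Euclidean and relative-entropy cases, and checks whether a candidate sphere of a given radius encloses a point set.

```python
import numpy as np

def bregman_divergence(F, grad_F, x, y):
    """B_F(x, y) = F(x) - F(y) - <grad F(y), x - y> for a convex F."""
    return F(x) - F(y) - np.dot(grad_F(y), x - y)

# F(x) = 0.5 * ||x||^2 recovers (half) the squared Euclidean distance.
sq_norm = lambda x: 0.5 * np.dot(x, x)
sq_norm_grad = lambda x: x

# F(x) = sum_i x_i log x_i (negative entropy, x on the probability simplex)
# recovers the relative entropy (KL divergence) between x and y.
neg_entropy = lambda x: float(np.sum(x * np.log(x)))
neg_entropy_grad = lambda x: np.log(x) + 1.0

def encloses(center, points, radius, F, grad_F):
    """True iff every point lies within Bregman divergence `radius` of `center`."""
    return all(bregman_divergence(F, grad_F, p, center) <= radius
               for p in points)

if __name__ == "__main__":
    pts = np.random.dirichlet(np.ones(4), size=20)   # sample points on the simplex
    c = pts.mean(axis=0)                             # a candidate center
    r = max(bregman_divergence(neg_entropy, neg_entropy_grad, p, c)
            for p in pts)                            # smallest radius for this center
    print(encloses(c, pts, r, neg_entropy, neg_entropy_grad))  # True by construction
```

The enclosing test above is only the feasibility check; the paper's contribution is choosing the center (and the smallest radius) by solving the dual optimization problem, which is not reproduced in this sketch.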


Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crammer, K., Singer, Y. (2003). Learning Algorithms for Enclosing Points in Bregmanian Spheres. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_29

  • DOI: https://doi.org/10.1007/978-3-540-45167-9_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40720-1

  • Online ISBN: 978-3-540-45167-9

  • eBook Packages: Springer Book Archive
