Abstract
In this work we present the first efficient algorithm for unsupervised training of multi-class regularized least-squares classifiers. The approach is closely related to maximum margin clustering, the unsupervised extension of the support vector machine classifier, which has recently received considerable attention, though mostly for the binary classification case. We present a combinatorial search scheme that combines steepest descent strategies with powerful meta-heuristics for avoiding bad local optima. The regularized least-squares formulation of the problem allows us to use matrix-algebraic optimization techniques that enable constant-time checks of the intermediate candidate solutions during the search. Our experimental evaluation indicates the potential of the novel method and demonstrates its superior clustering performance over a variety of competing methods on real-world datasets. Both time complexity analysis and experimental comparisons show that the method scales well to practical-sized problems.
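To make the idea of constant-time candidate checks concrete, the following is a minimal sketch and not the implementation from the paper. It assumes a one-vs-all label matrix Y with entries +1/-1, a kernel matrix K, and the standard identity that, for fixed labels, the optimal regularized least-squares risk equals lam * tr(Y^T G Y) with G = (K + lam*I)^{-1}. With R = G Y cached, moving point i from class a to class b changes the trace by 4 * (R[i, b] - R[i, a] + 2 * G[i, i]), so each candidate move is checked in constant time. The function names, the Gaussian kernel, and the `min_size` constraint that keeps clusters non-empty are illustrative assumptions; the meta-heuristics for escaping local optima mentioned in the abstract are omitted.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gaussian kernel matrix; the kernel and gamma are illustrative choices.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def one_vs_all(labels, k):
    # n x k matrix with +1 for the assigned class and -1 elsewhere.
    Y = -np.ones((len(labels), k))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def objective(G, Y, lam):
    # Optimal regularized least-squares risk for fixed labels Y.
    return lam * np.trace(Y.T @ G @ Y)

def move_deltas(R, diag_G, labels):
    # delta[i, b]: change of tr(Y^T G Y) if point i moves to class b.
    n = len(labels)
    cur = R[np.arange(n), labels]
    delta = 4.0 * (R - cur[:, None] + 2.0 * diag_G[:, None])
    delta[np.arange(n), labels] = np.inf  # staying put is not a move
    return delta

def local_search(K, labels, k, lam=1.0, min_size=1, max_iter=1000):
    # Steepest descent over single-point label changes (hypothetical helper).
    labels = np.array(labels, dtype=int)
    n = K.shape[0]
    G = np.linalg.inv(K + lam * np.eye(n))
    R = G @ one_vs_all(labels, k)          # cache, updated in O(n) per move
    diag_G = np.diag(G).copy()
    counts = np.bincount(labels, minlength=k)
    for _ in range(max_iter):
        delta = move_deltas(R, diag_G, labels)
        # Assumed constraint: never shrink a cluster below min_size.
        delta[counts[labels] <= min_size, :] = np.inf
        i, b = np.unravel_index(np.argmin(delta), delta.shape)
        if delta[i, b] >= -1e-12:
            break                           # no improving move: local optimum
        a = labels[i]
        R[:, a] -= 2.0 * G[:, i]
        R[:, b] += 2.0 * G[:, i]
        counts[a] -= 1
        counts[b] += 1
        labels[i] = b
    return labels
```

A call such as `local_search(rbf_kernel(X), initial_labels, k=3, lam=0.5)` returns a label vector that is a local optimum with respect to single-point moves; in practice such a search would be restarted or combined with a meta-heuristic, since the result depends on the initial labels.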
Acknowledgments
We would like to thank the anonymous reviewers for their comments.
Additional information
Tapio Pahikkala is supported by the Academy of Finland under Grant No. 134020 and Fabian Gieseke by the German Academic Exchange Service (DAAD).
A preliminary version of the paper was published in the Proceedings of ICDM 2012.
Cite this article
Pahikkala, T., Airola, A., Gieseke, F. et al. On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers. J. Comput. Sci. Technol. 29, 90–104 (2014). https://doi.org/10.1007/s11390-014-1414-0
DOI: https://doi.org/10.1007/s11390-014-1414-0