DOI: 10.1145/1150402.1150494

Efficient kernel feature extraction for massive data sets

Published: 20 August 2006

Abstract

Maximum margin discriminant analysis (MMDA) is a recently proposed method that uses the large-margin idea for feature extraction. It often outperforms traditional methods such as kernel principal component analysis (KPCA) and kernel Fisher discriminant analysis (KFD). However, as with other kernel methods, its time complexity is cubic in the number of training points m, making it computationally impractical on massive data sets. In this paper, we propose a (1+ε)²-approximation algorithm for obtaining the MMDA features by extending core vector machines. The resulting time complexity is only linear in m, while the space complexity is independent of m. Extensive comparisons with the original MMDA, KPCA, and KFD on a number of large data sets show that the proposed feature extractor can improve classification accuracy, and is faster than these kernel-based methods by more than an order of magnitude.
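The scaling behaviour described above comes from the core-set idea behind core vector machines: the kernel problem is recast as a minimum enclosing ball (MEB) problem and solved approximately over a small core set whose size depends only on ε, not on m. The following NumPy sketch of a Bǎdoiu-Clarkson style (1+ε)-approximate MEB illustrates the mechanism only; it is not the authors' MMDA algorithm, it works in plain Euclidean space rather than the kernel-induced feature space used by CVM/MMDA, and the function name and parameters are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): (1+eps)-approximate minimum
# enclosing ball via a Badoiu-Clarkson style update, the building block that
# core vector machines rely on. Plain Euclidean space for simplicity.
import numpy as np

def approx_meb(X, eps=0.1):
    """Return an approximate MEB center, its radius, and the core-set indices.

    X   : (m, d) array of points
    eps : approximation parameter; the iteration count (and hence the
          core-set size) depends only on eps, not on the number of points m.
    """
    X = np.asarray(X, dtype=float)
    center = X[0].copy()
    core_set = [0]
    # O(1/eps^2) iterations suffice for a (1+eps)-approximation.
    n_iter = int(np.ceil(1.0 / eps ** 2))
    for i in range(1, n_iter + 1):
        # Farthest point from the current center: one linear scan over m points.
        dists = np.linalg.norm(X - center, axis=1)
        far = int(np.argmax(dists))
        core_set.append(far)
        # Move the center a step of size 1/(i+1) toward the farthest point.
        center += (X[far] - center) / (i + 1)
    radius = np.linalg.norm(X - center, axis=1).max()
    return center, radius, core_set
```

Each iteration costs one pass over the m points, and the number of iterations depends only on ε; this is the source of the linear-in-m time and m-independent core-set size. In the kernel setting, the distances are computed from kernel evaluations rather than explicit coordinates.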


Cited By

  • (2022) EEG signal classification via pinball universum twin support vector machine. Annals of Operations Research, 328(1):451-492. DOI: 10.1007/s10479-022-04922-x
  • (2022) Boundary-based Fuzzy-SVDD for one-class classification. International Journal of Intelligent Systems, 37(3):2266-2292. DOI: 10.1002/int.22773
  • (2019) Large-Margin Multiple Kernel Learning for Discriminative Features Selection and Representation Learning. 2019 International Joint Conference on Neural Networks (IJCNN), pages 1-8. DOI: 10.1109/IJCNN.2019.8851982
  • (2019) Schizophrenia Classification Using fMRI Data Based on a Multiple Feature Image Capsule Network Ensemble. IEEE Access, 7:109956-109968. DOI: 10.1109/ACCESS.2019.2933550
  • (2019) Health Evaluation of MVB Based on SVDD and Sample Reduction. IEEE Access. DOI: 10.1109/ACCESS.2019.2904600
  • (2019) Facial expression recognition using iterative universum twin support vector machine. Applied Soft Computing, 76:53-67. DOI: 10.1016/j.asoc.2018.11.046
  • (2018) Improved universum twin support vector machine. 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pages 2045-2052. DOI: 10.1109/SSCI.2018.8628671
  • (2018) Information entropy based sample reduction for support vector data description. Applied Soft Computing, 71:1153-1160. DOI: 10.1016/j.asoc.2018.02.053
  • (2018) A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Computing and Applications. DOI: 10.1007/s00521-018-3551-9
  • (2018) A Fuzzy Universum Support Vector Machine Based on Information Entropy. Machine Intelligence and Signal Analysis, pages 569-582. DOI: 10.1007/978-981-13-0923-6_49


    Published In

    KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2006
    986 pages
    ISBN: 1595933395
    DOI: 10.1145/1150402


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. SVM
    2. kernel feature extraction
    3. scalability

    Qualifiers

    • Article

    Conference

    KDD06

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
