Sample Reduction for Support Vector Data Description (SVDD) by Farthest Boundary Point Estimation (FBPE) using Gradients of Data Density

Published: 24 October 2022

ABSTRACT

Classification is a quintessential application of machine learning for which support vector machines (SVMs) have been used ubiquitously because of their optimal margins and ease of use. However, SVMs are rarely used on large datasets due to the cubic time complexity of their training process. This has inspired several papers that attempt to reduce either the number of features or the number of training samples in order to lessen SVM training time. This paper proposes a novel approach to reducing the number of training samples for support vector data description (SVDD) while preserving as much knowledge of the target class as possible, by selecting the most promising candidates for support vectors: the farthest boundary points of the data clusters. The proposed algorithm uses the gradient of the density across the data distribution to detect boundary points uniformly; these points are sampled as potential support vectors so that the SVM can be trained in less time without significant loss in accuracy. The proposed algorithm is verified through tests on Human Activity Recognition, Breast Cancer Detection, and Heart Disease Detection datasets.
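The full FBPE method is not reproduced on this page, so the following is only a minimal sketch of the general idea the abstract describes: estimate the density of the target-class data, rank points by the magnitude of the density gradient (cluster boundaries lie where density changes fastest), keep the top-ranked fraction as candidate support vectors, and train an SVDD-style one-class model on that reduced set. The Gaussian KDE, the finite-difference gradient estimate, and parameters such as `keep_frac` are illustrative assumptions, not the authors' algorithm; scikit-learn's `OneClassSVM`, which with an RBF kernel is closely related to SVDD, stands in for an SVDD implementation.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.svm import OneClassSVM

def density_gradient_subset(X, bandwidth=1.0, keep_frac=0.2, eps=1e-3):
    """Keep the fraction of samples with the largest estimated
    density-gradient magnitude as boundary-point candidates.

    Illustrative sketch only (not the paper's FBPE algorithm):
    the gradient of the log-density from a Gaussian KDE is
    approximated by central finite differences along each axis.
    """
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(X)
    n, d = X.shape
    grad = np.zeros((n, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        # central difference of the log-density along axis j
        hi = kde.score_samples(X + step)
        lo = kde.score_samples(X - step)
        grad[:, j] = (hi - lo) / (2 * eps)
    # points where density changes fastest are boundary candidates
    scores = np.linalg.norm(grad, axis=1)
    k = max(1, int(keep_frac * n))
    return X[np.argsort(scores)[-k:]]

# Usage: train the one-class model on the reduced sample set.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(2000, 3))        # stand-in target-class data
X_sub = density_gradient_subset(X_train, bandwidth=0.5, keep_frac=0.15)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_sub)
print(X_sub.shape, model.support_vectors_.shape)
```

Because the one-class model is trained on only `keep_frac` of the data, the roughly cubic training cost shrinks accordingly; the quality of the result hinges on the boundary candidates actually covering the support of the target class, which is what the paper's farthest-boundary-point estimation is designed to ensure.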


Published in

IC3-2022: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing
August 2022, 710 pages
ISBN: 9781450396752
DOI: 10.1145/3549206
Copyright © 2022 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

