Abstract
Humans can detect outliers using only observations of normal samples. Similarly, one-class classification (OCC) trains a classification model on normal samples alone, which can then be used for outlier detection. This paper proposes a multi-layer architecture for OCC that stacks graph-embedded kernel ridge regression (KRR)-based autoencoders in a hierarchical fashion. We formulate the autoencoders under the graph-embedding framework to exploit local and global variance criteria. The stacked autoencoder layers project the input features into a new feature space, on which we apply a graph-embedded regression-based one-class classifier. We build the proposed hierarchical OCC architecture progressively and optimize the parameters of each successive layer via closed-form solutions. The proposed method is evaluated on 21 balanced and 20 imbalanced datasets, and the experimental results demonstrate its effectiveness over 11 existing state-of-the-art kernel-based one-class classifiers. The Friedman test is also performed to verify the statistical significance of the obtained results. Using two types of graph embedding, four variants of graph-embedded multi-layer KRR-based one-class classification methods are presented in this paper. All four variants perform better than the existing one-class classifiers in terms of the various performance metrics; hence, they can be a viable alternative for a wide range of one-class classification tasks. As a future extension, other autoencoder variants can be applied within the proposed architecture to increase efficiency and performance.
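For concreteness, the core pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes an RBF kernel, uses the plain closed-form ridge solution for each stacked autoencoder layer, and omits the graph-embedding (Laplacian) regularizer that distinguishes the proposed variants; the names (KRRAutoencoderLayer, fit_one_class, outlier_score) and the hyperparameters C, gamma, and r are illustrative.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        # Pairwise RBF kernel between the rows of A and the rows of B.
        sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * sq)

    class KRRAutoencoderLayer:
        # One KRR autoencoder: the regression targets are the inputs
        # themselves, so the layer is trained in closed form.
        def __init__(self, C=10.0, gamma=1.0):
            self.C, self.gamma = C, gamma

        def fit(self, X):
            self.X = X
            K = rbf_kernel(X, X, self.gamma)
            # Closed-form ridge solution: alpha = (K + I/C)^{-1} X.
            self.alpha = np.linalg.solve(K + np.eye(len(X)) / self.C, X)
            return self

        def transform(self, Z):
            return rbf_kernel(Z, self.X, self.gamma) @ self.alpha

    def fit_one_class(X, n_layers=2, C=10.0, gamma=1.0, r=1.0):
        # Stack autoencoder layers progressively, then fit a KRR one-class
        # head that regresses every (normal) training sample onto the
        # constant target r.
        layers, H = [], X
        for _ in range(n_layers):
            layer = KRRAutoencoderLayer(C, gamma).fit(H)
            H = layer.transform(H)
            layers.append(layer)
        K = rbf_kernel(H, H, gamma)
        beta = np.linalg.solve(K + np.eye(len(H)) / C, np.full(len(H), r))
        return layers, H, beta

    def outlier_score(Z, layers, H, beta, gamma=1.0, r=1.0):
        # Deviation of the prediction from the target r; larger = more outlying.
        for layer in layers:
            Z = layer.transform(Z)
        return np.abs(rbf_kernel(Z, H, gamma) @ beta - r)

In practice, a rejection threshold would be chosen from the training scores themselves, e.g., flagging test samples whose score exceeds a high percentile of outlier_score on the training data; the graph-embedded variants in the paper further add a Laplacian penalty to each layer's closed-form solution.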




Notes
One-class classifiers are also known as data descriptors due to their capability to describe the distribution of data and the boundaries of the class of interest.
Some researchers [21, 22] follow the name kernel extreme learning machine (KELM) [24], while others follow the name KRR [16, 19]. We do not wish to enter the debate over naming conventions. Since the final solutions of KELM and KRR are identical, we follow the traditional name KRR instead of KELM.
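To make this equivalence concrete, here is a sketch in common notation, with kernel matrix $\mathbf{K}$, targets $\mathbf{y}$, and regularization parameter $C$. The KRR dual solution [31] is $\boldsymbol{\alpha} = (\mathbf{K} + \lambda \mathbf{I})^{-1}\mathbf{y}$, and the KELM output weights [24, 38] are $\boldsymbol{\beta} = (\mathbf{I}/C + \mathbf{K})^{-1}\mathbf{y}$. Setting $\lambda = 1/C$ makes the two expressions identical, and both predict $f(\mathbf{x}) = \mathbf{k}(\mathbf{x})^{\top}(\mathbf{K} + \mathbf{I}/C)^{-1}\mathbf{y}$, where $k_i(\mathbf{x}) = \kappa(\mathbf{x}, \mathbf{x}_i)$.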
Here, “/” denotes or. GMKOC uses GKAE and LMKOC uses LKAE.
Here, OCSVM and SVDD yield their best results on the same dataset, i.e., the Iono(1) dataset.
References
Moya M M, Koch M W, Hostetler L D. One-class classifier networks for target recognition applications. Technical report, Sandia National Labs, Albuquerque; 1993.
Khan S S, Madden M G. A survey of recent trends in one class classification. Irish conference on Artificial Intelligence and Cognitive Science. Springer; 2009. p. 188–197.
Pimentel M A, Clifton D A, Clifton L, Tarassenko L. A review of novelty detection. Signal Process 2014;99:215–249.
Xu Y, Liu C. A rough margin-based one class support vector machine. Neural Comput Appl 2013;22(6):1077–1084.
Hamidzadeh J, Moradi M. Improved one-class classification using filled function. Appl Intell. 2018:1–17.
Xiao Y, Liu B, Cao L, Wu X, Zhang C, Hao Z, Yang F, Cao J. Multi-sphere support vector data description for outliers detection on multi-distribution data. IEEE International Conference on Data Mining Workshops, 2009 (ICDMW’09). IEEE; 2009. p. 82–87.
Tax D M J. One-class classification; concept-learning in the absence of counter-examples. ASCI dissertation series. 2001;65.
Liu B, Xiao Y, Cao L, Hao Z, Deng F. Svdd-based outlier detection on uncertain data. Knowl Inf Syst 2013;34(3):597–618.
Hu W, Wang S, Chung F-L, Liu Y, Ying W. Privacy preserving and fast decision for novelty detection using support vector data description. Soft Comput 2015;19(5):1171–1186.
O’Reilly C, Gluhak A, Imran M A, Rajasegarar S. Anomaly detection in wireless sensor networks in a non-stationary environment. IEEE Commun Surv Tutorials 2014;16(3):1413–1432.
Tax D M J, Duin R P W. Support vector data description. Mach Learn 2004;54(1):45–66.
Schölkopf B, Williamson R C, Smola A J, Shawe-Taylor J, Platt J C. Support vector method for novelty detection. Advances in Neural Information Processing Systems; 1999. p. 582–588.
Hoffmann H. Kernel PCA for novelty detection. Pattern Recogn 2007;40(3):863–874. Software available at http://www.heikohoffmann.de/kpca.html.
Kriegel H-P, Zimek A, et al. Angle-based outlier detection in high-dimensional data. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2008. p. 444–452.
Japkowicz N. Concept-learning in the absence of counter-examples: An autoassociation-based approach to classification. Ph.D. Thesis. Rutgers, The State University of New Jersey; 1999.
Gautam C, Tiwari A, Tanveer M. AEKOC+: Kernel ridge regression-based auto-encoder for one-class classification using privileged information. Cogn Comput. 2020:1–14.
Saunders C, Gammerman A, Vovk V. Ridge regression learning algorithm in dual variables. Proceedings of the Fifteenth International Conference on Machine Learning, ICML ’98. San Francisco: Morgan Kaufmann Publishers Inc.; 1998. p. 515–521.
Wornyo D K, Shen X-J, Dong Y, Wang L, Huang S-C. Co-regularized kernel ensemble regression. World Wide Web. 2018;1–18.
Zhang L, Suganthan P N. Benchmarking ensemble classifiers with novel co-trained kernel ridge regression and random vector functional link ensembles [research frontier]. IEEE Comput Intell Mag 2017;12(4): 61–72.
He J, Ding L, Jiang L, Ma L. Kernel ridge regression classification. Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE; 2014. p. 2263–2267.
Leng Q, Qi H, Miao J, Zhu W, Su G. One-class classification with extreme learning machine. Math Probl Eng. 2014;1–11.
Gautam C, Tiwari A, Leng Q. On the construction of extreme learning machine for online and offline one-class classification-an expanded toolbox. Neurocomputing 2017;261:126–143. Software available at https://github.com/Chandan-IITI/One-Class-Kernel-ELM.
Gautam C, Tiwari A, Suresh S, Ahuja K. Adaptive online learning with regularized kernel for one-class classification. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2019;1–16.
Huang G-B, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 2011;42(2):513–529.
Iosifidis A, Mygdalis V, Tefas A, Pitas I. One-class classification based on extreme learning and geometric class information. Neural Process Lett. 2016;1–16.
Mygdalis V, Iosifidis A, Tefas A, Pitas I. Exploiting subclass information in one-class support vector machine for video summarization. IEEE International Conference on Acoustics, Speech and Signal Processing. 2015.
Mygdalis V, Iosifidis A, Tefas A, Pitas I. One class classification applied in facial image analysis. IEEE International Conference on Image Processing (ICIP). IEEE; 2016. p. 1644–1648.
Kasun L L C, Zhou H, Huang G-B, Vong C M. Representational learning with extreme learning machine for big data. IEEE Intell Syst 2013;28(6):31–34.
Wong C M, Vong C M, Wong P K, Cao J. Kernel-based multilayer extreme learning machines for representation learning. IEEE Trans Neural Netw Learn Syst 2018;29(3):757–762.
Jose C, Goyal P, Aggrwal P, Varma M. Local deep kernel learning for efficient non-linear svm prediction. International Conference on Machine Learning; 2013. p. 486–494.
Wilson A G, Hu Z, Salakhutdinov R, Xing E P. Deep kernel learning. Artificial Intelligence and Statistics; 2016. p. 370–378.
Yan S, Xu D, Zhang B, Zhang H-J, Yang Q, Lin S. Graph embedding and extensions: A general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 2007;29(1).
Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 2014;15(1):3133–3181.
Belkin M, Niyogi P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 2003;15(6):1373–1396.
Saul L K, Roweis S T. Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res 2003;4:119–155.
Boyer C, Chambolle A, Castro Y D, Duval V, De Gournay F, Weiss P. On representer theorems and convex regularization. SIAM J Optim 2019;29(2):1260–1281.
Duda R O, Hart P E, Stork D G. Pattern classification. Vol. 2. New York: Wiley; 1973.
Lichman M. UCI machine learning repository; 2013.
Tax D M J, Duin R P W. Support vector domain description. Pattern Recogn Lett 1999;20 (11):1191–1199.
Chang C-C, Lin C-J. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2011;2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Tax D M J. DDtools, the data description toolbox for MATLAB, version 2.1.2; 2015.
Iman R L, Davenport J M. Approximations of the critical region of the Friedman statistic. Commun Stat-Theory Methods 1980;9(6):571–595.
Funding
This research was supported by the Department of Electronics and Information Technology (DeitY, Govt. of India) under the Visvesvaraya PhD Scheme for Electronics & IT.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Gautam, C., Tiwari, A., Mishra, P.K. et al. Graph-Embedded Multi-Layer Kernel Ridge Regression for One-Class Classification. Cogn Comput 13, 552–569 (2021). https://doi.org/10.1007/s12559-020-09804-7