Abstract
Extreme learning machine (ELM) has been an important research topic over the last decade due to its high efficiency, ease of implementation, unified treatment of classification and regression, and unified treatment of binary and multi-class learning tasks. Despite these advantages, existing ELM algorithms cannot directly handle samples in which some features are missing or unobserved, a situation that is common in practical applications. This paper fills that gap by proposing an absent ELM (A-ELM) algorithm. Observing that some structural characteristics of a subset of packed malware instances hold unreasonable values, we cast packed executable identification as an absence learning problem, which the proposed A-ELM algorithm addresses efficiently. Extensive experiments on six UCI data sets and a packed executable data set evaluate the performance of the proposed algorithm. The results indicate that A-ELM outperforms the imputation algorithms and the existing state-of-the-art methods it is compared against.
Notes
In our experiments, such a matrix is generated by the Matlab function rand(n, m).
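For readers working outside Matlab, the following is a minimal NumPy sketch of an equivalent call. Matlab's rand(n, m) returns an n-by-m matrix of uniform draws from (0, 1); NumPy's generator draws from [0, 1). The note does not say how the matrix is used, so the helper names random_matrix and absence_mask, the thresholding step, and the absent_ratio value are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def random_matrix(n, m, seed=None):
    """Rough counterpart of Matlab's rand(n, m): uniform draws in [0, 1)."""
    rng = np.random.default_rng(seed)
    return rng.random((n, m))


def absence_mask(R, absent_ratio=0.2):
    """Hypothetical use of the matrix (an assumption, not from the paper):
    mark an entry as absent when its random draw falls below absent_ratio."""
    return R < absent_ratio


# Example: hide roughly 20% of the entries of a toy 5-by-3 data matrix.
R = random_matrix(5, 3, seed=0)
mask = absence_mask(R, absent_ratio=0.2)
X = np.arange(15, dtype=float).reshape(5, 3)
X_absent = np.where(mask, np.nan, X)  # NaN stands in for an absent feature
```

Fixing the seed makes the set of absent entries reproducible across runs, which matters when several algorithms are compared on the same corrupted data.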
Acknowledgments
The authors would like to thank Prof. Guang-Bin Huang for the valuable comments. This work was supported by the National Natural Science Foundation of China (Project Nos. 61105050, 61170287, 61303264 and 61271252).
Cite this article
Xie, P., Liu, X., Yin, J. et al. Absent extreme learning machine algorithm with application to packed executable identification. Neural Comput & Applic 27, 93–100 (2016). https://doi.org/10.1007/s00521-014-1558-4
DOI: https://doi.org/10.1007/s00521-014-1558-4