ABSTRACT
This paper proposes Begonia, a malware detection system based on Pareto ensemble pruning. We cast malware detection as a bi-objective Pareto optimization problem whose two objectives are classification accuracy and the size of the classifier ensemble. We automatically generate several groups of base classifiers using SVM and derive candidate solutions for each group through bi-objective Pareto optimization. We then select the highest-accuracy ensemble from each group to form the set of final solutions, and from this set we choose the optimal solution: the one that minimizes a combined loss function reflecting the trade-off between accuracy and time cost. Users can supply a trade-off level that matches their requirements in order to select the best solution. Experimental results show that Begonia achieves higher accuracy at relatively lower overhead than the ensemble containing all the classifiers, and that it offers a good trade-off under different requirements.
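The final selection step can be illustrated with a minimal Python sketch. It is not Begonia's implementation: the abstract does not give the form of the combined loss, so the weighted sum below, the `alpha` trade-off parameter, the use of ensemble size as a stand-in for time cost, and all candidate names and numbers are assumptions made purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A pruned ensemble summarized by its two objectives (hypothetical values)."""
    name: str
    error: float  # validation error rate in [0, 1]; lower is better
    size: int     # number of base classifiers; used here as a proxy for time cost


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.error <= b.error and a.size <= b.size
    strictly_better = a.error < b.error or a.size < b.size
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def combined_loss(c: Candidate, alpha: float, max_size: int) -> float:
    """Assumed loss: weighted sum of error and normalized size.

    alpha close to 1 favors accuracy; alpha close to 0 favors low cost.
    """
    return alpha * c.error + (1.0 - alpha) * (c.size / max_size)


def select(cands: List[Candidate], alpha: float) -> Candidate:
    """Pick the non-dominated candidate with minimal combined loss."""
    max_size = max(c.size for c in cands)
    return min(pareto_front(cands), key=lambda c: combined_loss(c, alpha, max_size))


# Hypothetical candidates; "full" stands for the ensemble of all classifiers.
candidates = [
    Candidate("full", 0.060, 20),
    Candidate("A", 0.045, 8),
    Candidate("B", 0.050, 4),
    Candidate("C", 0.070, 2),
    Candidate("D", 0.055, 9),
]

front_names = [c.name for c in pareto_front(candidates)]
accuracy_first = select(candidates, alpha=0.99).name
balanced = select(candidates, alpha=0.9).name
cost_first = select(candidates, alpha=0.5).name
```

With these made-up numbers, the full ensemble is dominated by a smaller and more accurate pruned ensemble, and changing `alpha` moves the chosen solution along the front, which mirrors the user-supplied trade-off level described above.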
Index Terms
- POSTER: Accuracy vs. Time Cost: Detecting Android Malware through Pareto Ensemble Pruning