Abstract
Malware is proliferating rapidly, causing severe losses to many institutions. Existing techniques such as static and dynamic analysis fail to mitigate newly generated malware, and signature-, behavior-, and anomaly-based defense mechanisms are susceptible to obfuscation and polymorphism attacks. With machine learning in practice, several authors have proposed classification and visualization techniques for malware detection. Images have proved useful for analyzing malware behavior, since deep neural networks can extract rich information from them without expert domain knowledge. However, the scarcity of diverse malware data available to individual clients, together with their privacy concerns about sharing data with a centralized curator, makes it challenging to build a reliable model. This paper proposes a lightweight Convolutional Neural Network (CNN) based model that extracts relevant features using call graph, n-gram, and image transformations. Further, an Auxiliary Classifier Generative Adversarial Network (AC-GAN) is used to generate unseen data for training. The model is extended to a federated setup to build an effective malware detection system. We use the Microsoft malware dataset for training and evaluation. The results show that the federated approach achieves accuracy close to that of centralized training while preserving the data privacy of each individual organization.
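The federated setup mentioned above can be illustrated with a minimal sketch. The abstract does not state which aggregation rule is used, so the example below assumes sample-weighted federated averaging, a common choice; the function name `federated_average` and the flat parameter lists are hypothetical and chosen only for illustration. Each client trains locally and shares only its model parameters, never its malware samples.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-client parameter vectors into one global vector.

    Assumption: each client's model is flattened to a list of floats,
    and clients are weighted by their number of local training samples.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    n_params = len(client_params[0])
    if any(len(p) != n_params for p in client_params):
        raise ValueError("all clients must share the same model shape")

    total = float(sum(client_sizes))
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Weighted sum: a client with more local data contributes more.
    global_params = [0.0] * n_params
    for params, size in zip(client_params, client_sizes):
        weight = size / total
        for i, value in enumerate(params):
            global_params[i] += weight * value
    return global_params


# Two hypothetical organizations with 100 and 300 local samples.
org_a = [1.0, 2.0]
org_b = [3.0, 4.0]
global_model = federated_average([org_a, org_b], [100, 300])
print(global_model)
```

In a full training loop, the server would send `global_model` back to the clients, each client would continue training on its private data, and the averaging step would be repeated over several rounds.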
Acknowledgement
We acknowledge the Ministry of Human Resource Development, Government of India, for providing a fellowship to complete this work.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Singh, N., Kasyap, H., Tripathy, S. (2020). Collaborative Learning Based Effective Malware Detection System. In: Koprinska, I., et al. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_13
DOI: https://doi.org/10.1007/978-3-030-65965-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65964-6
Online ISBN: 978-3-030-65965-3
eBook Packages: Computer Science (R0)