Supervised Detection of Infected Machines Using Anti-virus Induced Labels

Cohen, Tomer; Hendler, Danny; Potashnik, Dennis

doi:10.1007/978-3-319-60080-2_3

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10332)

Included in the following conference series:

International Conference on Cyber Security Cryptography and Machine Learning

Abstract

Traditional antivirus software relies on signatures to uniquely identify malicious files. Malware writers, on the other hand, have responded by developing obfuscation techniques with the goal of evading content-based detection. A consequence of this arms race is that numerous new malware instances are generated every day, thus limiting the effectiveness of static detection approaches. For effective and timely malware detection, signature-based mechanisms must be augmented with detection approaches that are harder to evade.

We introduce a novel detector that uses the information gathered by IBM’s QRadar SIEM (Security Information and Event Management) system and leverages anti-virus reports for automatically generating a labelled training set for identifying malware. Using this training set, our detector is able to automatically detect complex and dynamic patterns of suspicious machine behavior and issue high-quality security alerts. We believe that our approach can be used for providing a detection scheme that complements signature-based detection and is harder to circumvent.
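The labelling idea described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the abstract does not name the features or the classifier, so the two per-machine features, the host names, and the nearest-centroid model below are hypothetical stand-ins, not the authors' method. The sketch only shows how anti-virus reports can serve as labels for otherwise unlabelled per-machine records.

```python
from collections import defaultdict


def build_training_set(machine_features, av_reported):
    """Label each machine 1 if an anti-virus report names it, else 0."""
    return [(features, 1 if machine in av_reported else 0)
            for machine, features in sorted(machine_features.items())]


def train_centroids(training_set):
    """Average the feature vectors of each class (a nearest-centroid model)."""
    sums, counts = {}, defaultdict(int)
    for features, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical per-machine features derived from event logs:
# (failed logins, distinct destinations contacted).
machine_features = {
    "host-a": (1.0, 3.0),
    "host-b": (2.0, 4.0),
    "host-c": (40.0, 90.0),
    "host-d": (35.0, 80.0),
}
# Hypothetical set of machines named in anti-virus reports.
av_reported = {"host-c", "host-d"}

training_set = build_training_set(machine_features, av_reported)
centroids = train_centroids(training_set)
prediction = predict(centroids, (38.0, 85.0))
print(prediction)
```

A real deployment would replace the toy features with ones extracted from SIEM data and would likely use a stronger classifier; the point here is only that the labels come from anti-virus reports rather than from manual annotation.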

Notes

1. Newer AV versions have the capability of stopping the process and then deleting the file.
2. More details on the test set are provided in Sect. 5.

Acknowledgments

This research was supported by IBM’s Cyber Center of Excellence in Beer Sheva and by the Cyber Security Research Center and the Lynne and William Frankel Center for Computing Science at Ben-Gurion University. We thank Yaron Wolfshtal from IBM for allowing Tomer to use IBM’s facilities, for providing us with the data on which this research is based, and for many helpful discussions.

Author information

Authors and Affiliations

Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel
Tomer Cohen & Danny Hendler
IBM Cyber Center of Excellence, Beer Sheva, Israel
Dennis Potashnik

Corresponding author

Correspondence to Danny Hendler.

Editor information

Editors and Affiliations

Ben-Gurion University of the Negev, Beer-Sheva, Israel
Shlomi Dolev
Tata Consultancy Services (India), Chennai, Tamil Nadu, India
Sachin Lodha

About this paper

Cite this paper

Cohen, T., Hendler, D., Potashnik, D. (2017). Supervised Detection of Infected Machines Using Anti-virus Induced Labels. In: Dolev, S., Lodha, S. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2017. Lecture Notes in Computer Science, vol 10332. Springer, Cham. https://doi.org/10.1007/978-3-319-60080-2_3

DOI: https://doi.org/10.1007/978-3-319-60080-2_3
Published: 02 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60079-6
Online ISBN: 978-3-319-60080-2
eBook Packages: Computer Science, Computer Science (R0)
