ABSTRACT
Most intrusion detection systems rely on supervised machine learning algorithms, which restricts them to detecting attack types that have already been recorded. This paper takes a fundamentally different approach, applying Isolation Forests, an unsupervised machine learning algorithm, in a new context. One of the algorithm's most important advantages is that it can identify and record novel types of intrusion. We conduct experiments on HTTP log data to explore the algorithm's accuracy under various conditions. We empirically determine the optimal values for the algorithm's parameters and show that the standard parameters originally suggested for Isolation Forest do not always produce optimal results. Furthermore, we run a genetic algorithm to explore which HTTP features best differentiate malicious from normal data. After applying these results, we achieve an increase in accuracy of approximately 300% and reduce the time the algorithm requires by nearly 50%.
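The isolation idea behind the algorithm can be illustrated with a minimal from-scratch sketch. This is not the authors' implementation: the two numeric features, the synthetic data, and all function names are hypothetical, chosen only for illustration. The defaults of 100 trees and a subsample size of 256 are the standard values suggested for Isolation Forest, which the abstract reports are not always optimal.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    return {
        "feature": feature,
        "split": split,
        "left": build_tree([r for r in rows if r[feature] < split], depth + 1, max_depth, rng),
        "right": build_tree([r for r in rows if r[feature] >= split], depth + 1, max_depth, rng),
    }


def path_length(tree, row, depth=0):
    """Depth at which row reaches a leaf, adjusted for unsplit points in that leaf."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if row[tree["feature"]] < tree["split"] else tree["right"]
    return path_length(child, row, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, sample_size


def anomaly_score(forest, row):
    """Score in (0, 1); values closer to 1 mean the row is easier to isolate."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, row) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Hypothetical data: two numeric features per HTTP request (for example a
# request length and a response size); the values are synthetic.
data_rng = random.Random(42)
normal = [(data_rng.gauss(50, 5), data_rng.gauss(200, 20)) for _ in range(500)]
attacks = [(data_rng.gauss(500, 10), data_rng.gauss(5000, 100)) for _ in range(5)]

# Unsupervised: the forest sees unlabeled, contaminated data.
forest = fit_forest(normal + attacks)
outlier_score = anomaly_score(forest, attacks[0])
normal_score = anomaly_score(forest, (50.0, 200.0))
print("outlier:", round(outlier_score, 3), "typical:", round(normal_score, 3))
```

No labels are used at any point: a row is flagged only because random splits separate it from the rest of the data in fewer steps, which is why previously unseen attack types can still receive a high score.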