Abstract
Proper classification of the available data into viable malware classes is essential for analyzing the causes, vulnerabilities, and intents behind malware attacks and for building systems that are secure against them. In this paper, we describe an approach that classifies free-text documents with good precision and performance. We assign each document to the malware class whose characteristics best match those of the document, based on the ontology model. We have tested our integrated approach on a large number of documents and found that it classifies documents about malware more precisely than other analysis techniques.
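The assignment rule described in the abstract, picking the class whose characteristics best match the document, can be illustrated with a minimal sketch. This is not the authors' implementation: the class vocabularies in `CLASS_TERMS`, the `tokenize` helper, and the simple count-based score are all assumptions made for illustration, whereas the paper derives class characteristics from an ontology model.

```python
import re
from collections import Counter

# Hypothetical characteristic terms per malware class (assumption for
# illustration only; the paper obtains characteristics from an ontology).
CLASS_TERMS = {
    "ransomware": {"encrypt", "ransom", "bitcoin", "decrypt"},
    "trojan": {"backdoor", "disguised", "remote", "payload"},
    "worm": {"propagate", "network", "replicate", "scan"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, class_terms=CLASS_TERMS):
    """Return (best_class, scores); best_class is None if nothing matches.

    Each class is scored by how often its characteristic terms occur in
    the document, and the highest-scoring class wins. Ties go to the
    class listed first in class_terms.
    """
    counts = Counter(tokenize(text))
    scores = {
        label: sum(counts[term] for term in terms)
        for label, terms in class_terms.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, scores
    return best, scores
```

For example, `classify("The malware will encrypt files and demand a ransom in bitcoin.")` scores three matches for the hypothetical `ransomware` vocabulary and none for the others. A weighted score could replace the raw counts without changing the overall structure.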
Notes
1. Here, tf represents term-frequency and idf stands for inverse-domain-frequency.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rath, M., Agarwal, S., Shyamasundar, R.K. (2017). Semi Supervised NLP Based Classification of Malware Documents. In: Shyamasundar, R., Singh, V., Vaidya, J. (eds) Information Systems Security. ICISS 2017. Lecture Notes in Computer Science, vol 10717. Springer, Cham. https://doi.org/10.1007/978-3-319-72598-7_21
DOI: https://doi.org/10.1007/978-3-319-72598-7_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72597-0
Online ISBN: 978-3-319-72598-7
eBook Packages: Computer Science (R0)