Abstract
Cyber threat intelligence (CTI) systems help prioritize alerts so that security teams can focus on critical threats and use their resources more efficiently. One challenge in these systems is accurately classifying the data they ingest and process, which is critical to producing meaningful output. To address this problem, we study text-based cybersecurity data classification using a multi-layer keyword filtering method and unsupervised learning with doc2vec. We also examine how ensemble learning can improve the accuracy and efficiency of CTI systems. This research supports alert prioritization in CTI systems, which in turn lets security teams allocate their resources more efficiently.
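As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows a two-layer keyword filter and a simple majority vote over several classifier outputs. It is not the authors' implementation: the keyword lists `LAYER_1` and `LAYER_2`, the rule that a text must match every layer, and the tie-breaking behaviour of `majority_vote` are all assumptions made for the example. The doc2vec model itself is not included here.

```python
import re
from collections import Counter

# Hypothetical keyword layers (assumption): the paper's actual lists are not given here.
LAYER_1 = {"malware", "ransomware", "exploit", "vulnerability", "phishing", "breach"}
LAYER_2 = {"cve", "patch", "payload", "attack", "zero-day", "botnet"}


def tokenize(text):
    """Lowercase the text and split it into word-like tokens, keeping hyphenated terms."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def keyword_filter(text, layers=(LAYER_1, LAYER_2)):
    """Return True only if the text contains at least one keyword from every layer."""
    tokens = set(tokenize(text))
    return all(tokens & layer for layer in layers)


def majority_vote(labels):
    """Combine boolean classifier outputs by simple majority; ties resolve to False."""
    counts = Counter(labels)
    return counts[True] > counts[False]


if __name__ == "__main__":
    samples = [
        "New ransomware exploits zero-day in VPN appliance",
        "Great pizza place downtown",
    ]
    for sample in samples:
        print(sample, "->", keyword_filter(sample))
```

In a fuller pipeline, the output of `keyword_filter` could be one vote passed to `majority_vote`, alongside votes from other models such as a doc2vec-based similarity classifier. Requiring a match in every layer trades recall for precision, and a different combination rule may suit other data.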
Acknowledgements
This research was supported by JSPS KAKENHI Grant Number JP16K00480.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Rodriguez, A., Okamura, K. (2020). Cybersecurity Text Data Classification and Optimization for CTI Systems. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds) Web, Artificial Intelligence and Network Applications. WAINA 2020. Advances in Intelligent Systems and Computing, vol 1150. Springer, Cham. https://doi.org/10.1007/978-3-030-44038-1_37
DOI: https://doi.org/10.1007/978-3-030-44038-1_37
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44037-4
Online ISBN: 978-3-030-44038-1
eBook Packages: Intelligent Technologies and Robotics (R0)