CySecAlert: An Alert Generation System for Cyber Security Events Using Open Source Intelligence Data

Riebe, Thea; Wirth, Tristan; Bayer, Markus; Kühn, Philipp; Kaufhold, Marc-André; Knauthe, Volker; Guthe, Stefan; Reuter, Christian

doi:10.1007/978-3-030-86890-1_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12918)

Included in the following conference series:

International Conference on Information and Communications Security

Abstract

Receiving relevant information on possible cyber threats, attacks, and data breaches in a timely manner is crucial for early response. The social media platform Twitter hosts an active cyber security community. Their activities are often monitored manually by security experts, such as Computer Emergency Response Teams (CERTs). We thus propose a Twitter-based alert generation system that issues alerts to a system operator as soon as new relevant cyber security related topics emerge. Thereby, our system allows us to monitor user accounts with significantly less workload. Our system applies a supervised classifier, based on active learning, that detects tweets containing relevant information. The results indicate that uncertainty sampling can reduce the amount of manual relevance classification effort and enhance the classifier performance substantially compared to random sampling. Our approach reduces the number of accounts and tweets that are needed for the classifier training, thus making the tool easily and rapidly adaptable to the specific context while also supporting data minimization for Open Source Intelligence (OSINT). Relevant tweets are clustered by a greedy stream clustering algorithm in order to identify significant events. The proposed system is able to work near real-time within the required 15-min time frame and detects up to 93.8% of relevant events with a false alert rate of 14.81%.

Notes

1. https://github.com/PEASEC/CySecAlert.
2. Twitter4J Version 4.0.7 (twitter4j.org/en/index.html on 14.08.2020).
3. Weka v3.8.4 (https://www.cs.waikato.ac.nz/ml/weka/ on 14.08.2020).

References

1. Reuter, C., Kaufhold, M.A.: Fifteen years of social media in emergencies: a retrospective review and future directions for crisis informatics. J. Contingencies Crisis Manage. 26(1), 41–57 (2018)
2. Husák, M., Jirsík, T., Yang, S.J.: SoK: contemporary issues and challenges to enable cyber situational awareness for network security. In: Proceedings of the 15th International Conference on Availability, Reliability and Security. ARES 2020. Association for Computing Machinery, New York, NY, USA (2020)
3. Yang, W., Lam, K.Y.: Automated cyber threat intelligence reports classification for early warning of cyber attacks in next generation SOC. In: International Conference on Information and Communication Systems (ICICS), pp. 145–164 (2020)
4. Mittal, S., Das, P.K., Mulwad, V., Joshi, A., Finin, T.: CyberTwitter: using Twitter to generate alerts for cybersecurity threats and vulnerabilities. In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 860–867. IEEE (2016)
5. Behzadan, V., Aguirre, C., Bose, A., Hsu, W.: Corpus and deep learning classifier for collection of cyber threat indicators in Twitter stream. In: 2018 IEEE International Conference on Big Data (Big Data), pp. 5002–5007. IEEE (2018)
6. Tundis, A., Ruppert, S., Mühlhäuser, M.: On the automated assessment of open-source cyber threat intelligence sources. In: Krzhizhanovskaya, V.V., et al. (eds.) ICCS 2020. LNCS, vol. 12138, pp. 453–467. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50417-5_34
7. Alves, F., Andongabo, A., Gashi, I., Ferreira, P.M., Bessani, A.: Follow the blue bird: a study on threat data published on Twitter. In: Chen, L., Li, N., Liang, K., Schneider, S. (eds.) ESORICS 2020. LNCS, vol. 12308, pp. 217–236. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58951-6_11
8. Koops, B.J., Hoepman, J.H., Leenes, R.: Open-source intelligence and privacy by design. Comput. Law Secur. Rev. 29(6), 676–688 (2013)
9. Sabottke, C., Suciu, O., Dumitras, T.: Vulnerability disclosure in the age of social media: exploiting Twitter for predicting real-world exploits. In: 24th USENIX Security Symposium USENIX Security 15, pp. 1041–1056 (2015)
10. Atefeh, F., Khreich, W.: A survey of techniques for event detection in Twitter. Comput. Intell. 31(1), 132–164 (2015)
11. Alves, F., Bettini, A., Ferreira, P.M., Bessani, A.: Processing tweets for cybersecurity threat awareness. arXiv preprint arXiv:1904.02072 (2019)
12. Trabelsi, S., et al.: Mining social networks for software vulnerabilities monitoring. In: 2015 7th International Conference on New Technologies, Mobility and Security (NTMS), pp. 1–7. IEEE (2015)
13. Hasan, M., Orgun, M.A., Schwitter, R.: A survey on real-time event detection from the Twitter data stream. J. Inf. Sci. 44(4), 443–463 (2018)
14. Kaufhold, M.A., Bayer, M., Reuter, C.: Rapid relevance classification of social media posts in disasters and emergencies: A system and evaluation featuring active, incremental and online learning. Inf. Process. Manage. 57(1), 102132 (2020)
15. Habdank, M., Rodehutskors, N., Koch, R.: Relevancy assessment of tweets using supervised learning techniques: mining emergency related tweets for automated relevancy classification. In: 2017 4th International Conference on Information and Communication Technologies for Disaster Management (ICT-DM), pp. 1–8. IEEE (2017)
16. Settles, B.: Active learning literature survey. University of Wisconsin (2010)
17. Imran, M., Mitra, P., Srivastava, J.: Enabling rapid classification of social media communications during crises. Int. J. Inf. Syst. Crisis Response Manage. (IJISCRAM) 8(3), 1–17 (2016)
18. Lewis, D.D., Catlett, J.: Heterogeneous uncertainty sampling for supervised learning. In: Machine Learning Proceedings 1994, pp. 148–156. Elsevier (1994)
19. Allan, J., Lavrenko, V., Jin, H.: First story detection in TDT is hard. In: Proceedings of the Ninth International Conference on Information and Knowledge Management, pp. 374–381 (2000)
20. Ritter, A., Wright, E., Casey, W., Mitchell, T.: Weakly supervised extraction of computer security events from Twitter. In: Proceedings of the 24th International Conference on World Wide Web, pp. 896–905 (2015)
21. Concone, F., De Paola, A., Re, G.L., Morana, M.: Twitter analysis for real-time malware discovery. In: 2017 AEIT International Annual Conference, pp. 1–6. IEEE (2017)
22. Dionisio, N., Alves, F., Ferreira, P.M., Bessani, A.: Cyberthreat detection from twitter using deep neural networks. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2019)
23. Bose, A., Behzadan, V., Aguirre, C., Hsu, W.H.: A novel approach for detection and ranking of trendy and emerging cyber threat events in Twitter streams. In: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 871–878 (2019)
24. Mayring, P.: Qualitative content analysis. Companion Qual. Res. 1(2004), 159–176 (2004)
25. Sapienza, A., Ernala, S.K., Bessi, A., Lerman, K., Ferrara, E.: Discover: mining online chatter for emerging cyber threats. In: Companion Proceedings of the The Web Conference 2018, pp. 983–990 (2018)
26. Le Sceller, Q., Karbab, E.B., Debbabi, M., Iqbal, F.: Sonar: automatic detection of cyber security events over the Twitter stream. In: Proceedings of the 12th International Conference on Availability, Reliability and Security (ARES), pp. 1–11 (2017)
27. Lee, K.C., Hsieh, C.H., Wei, L.J., Mao, C.H., Dai, J.H., Kuang, Y.T.: Sec-buzzer: cyber security emerging topic mining with open threat intelligence retrieval and timeline event annotation. Soft. Comput. 21(11), 2883–2896 (2017)
28. Dionísio, N., Alves, F., Ferreira, P.M., Bessani, A.: Towards end-to-end cyberthreat detection from twitter using multi-task learning. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)
29. Fang, Y., Gao, J., Liu, Z., Huang, C.: Detecting cyber threat event from twitter using IDCNN and BiLSTM. Appl. Sci. 10(17), 5922 (2020)
30. Ji, T., Zhang, X., Self, N., Fu, K., Lu, C.T., Ramakrishnan, N.: Feature driven learning framework for cybersecurity event detection. In: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 196–203 (2019)
31. Khandpur, R.P., Ji, T., Jan, S., Wang, G., Lu, C.T., Ramakrishnan, N.: Crowdsourcing cybersecurity: Cyber attack detection using social media. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1049–1057 (2017)
32. Mittal, S., Joshi, A., Finin, T.: Cyber-all-intel: an AI for security related threat intelligence. arXiv preprint arXiv:1905.02895 (2019)
33. Simran, K., Balakrishna, P., Vinayakumar, R., Soman, K.P.: Deep learning approach for enhanced cyber threat indicators in Twitter stream. In: Thampi, S.M., Martinez Perez, G., Ko, R., Rawat, D.B. (eds.) SSCC 2019. CCIS, vol. 1208, pp. 135–145. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-4825-3_11
34. Bernard, J., Zeppelzauer, M., Lehmann, M., Müller, M., Sedlmair, M.: Towards user-centered active learning algorithms. In: Computer Graphics Forum, vol. 37, pp. 121–132. Wiley Online Library (2018)
35. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

Acknowledgements

This work was supported by the German Federal Ministry for Education and Research (BMBF) in the projects CYWARN (13N15407) and KontiKat (13N14351), as well as by the BMBF and the Hessian Ministry of Higher Education, Research, Science and the Arts within their joint support of the National Research Center for Applied Cybersecurity ATHENE. We would like to thank the anonymous reviewers for their valuable and constructive comments.

Author information

Authors and Affiliations

Science and Technology for Peace and Security (PEASEC), Department of Computer Science, Technical University of Darmstadt, Darmstadt, Germany
Thea Riebe, Markus Bayer, Philipp Kühn, Marc-André Kaufhold & Christian Reuter
Interactive Graphics Systems Group, Technical University of Darmstadt, Darmstadt, Germany
Tristan Wirth, Volker Knauthe & Stefan Guthe

Corresponding author

Correspondence to Thea Riebe.

Editor information

Editors and Affiliations

Singapore Management University, Singapore, Singapore
Debin Gao
Tsinghua University, Beijing, China
Qi Li
Xi'an Jiaotong University, Xi'an, China
Xiaohong Guan
Chongqing University, Chongqing, China
Xiaofeng Liao

Appendices

Appendix A Dataset

Table 4 provides the websites and blogs we used to retrieve 170 accounts of the leading cyber security experts on Twitter, from which we gathered the dataset of 350,061 English tweets (see Sect. 3.1).

Table 4. Sources for cyber security experts on Twitter

Appendix B Codebook

Table 5 presents the codebook [24] that was applied when annotating the tweets of the dataset (see Sect. 3.1) and gives an overview of the codes’ definitions.

Table 5. Codebook for tweet relevance classification.

Appendix C Classifier Comparison

Figure 3 depicts the results of the active classifier comparison. Experiment details are discussed in Sect. 3.2.
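The abstract contrasts uncertainty sampling with random sampling for choosing which tweets an annotator labels next. The following is a minimal sketch of the least-confidence variant of uncertainty sampling, not the authors' implementation (which, per the notes, builds on Weka). The function name `least_confident`, the toy tweets, and the probability scores are all hypothetical and serve only to illustrate the selection rule.

```python
from typing import Callable, List, Sequence


def least_confident(
    pool: Sequence[str],
    predict_relevant_proba: Callable[[str], float],
    batch_size: int = 1,
) -> List[str]:
    """Pick the unlabeled items the classifier is least sure about.

    For a binary relevant/irrelevant classifier, the least certain items
    are those whose predicted probability lies closest to 0.5.
    """
    ranked = sorted(pool, key=lambda text: abs(predict_relevant_proba(text) - 0.5))
    return list(ranked[:batch_size])


# Hypothetical scores standing in for a trained relevance classifier.
toy_scores = {
    "new RCE in popular VPN appliance": 0.97,
    "great coffee at the conference": 0.03,
    "patch tuesday notes, maybe important": 0.55,
    "thread on phishing kit seen today": 0.62,
}

# The two tweets nearest the decision boundary are sent to the annotator.
queries = least_confident(list(toy_scores), toy_scores.__getitem__, batch_size=2)
```

Random sampling would instead draw from the pool uniformly, so it would often spend annotation effort on tweets the classifier already scores confidently.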

Appendix D Alert Generation by Similarity Threshold

Table 6 shows how recall and alert generation are affected by the similarity threshold of the greedy clustering (see Sect. 3.3).

Table 6. Performance measures of greedy clustering-based generated alerts for different similarity thresholds and for alert count thresholds 3 and 5 for the datasets S1 and S2, respectively.
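The interplay of the two thresholds named in the caption of Table 6 can be illustrated with a small sketch of greedy stream clustering. This is an illustration under stated assumptions rather than the system's actual code: representing tweets as raw token counts, comparing them by cosine similarity, and the class name `GreedyStreamClusterer` are all choices made here for brevity. The example tweets are invented.

```python
import math
from collections import Counter
from typing import List, Optional


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class GreedyStreamClusterer:
    """One-pass clustering: each tweet joins its most similar cluster or opens a new one."""

    def __init__(self, similarity_threshold: float, alert_count_threshold: int):
        self.similarity_threshold = similarity_threshold
        self.alert_count_threshold = alert_count_threshold
        self.centroids: List[Counter] = []   # summed token counts per cluster
        self.members: List[List[str]] = []   # tweets assigned to each cluster

    def add(self, tweet: str) -> Optional[int]:
        """Assign a tweet; return the cluster index if it triggers an alert, else None."""
        vec = Counter(tweet.lower().split())
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.similarity_threshold:
            # Nothing similar enough: start a new cluster with this tweet.
            self.centroids.append(vec)
            self.members.append([tweet])
            best = len(self.centroids) - 1
        else:
            # Summing counts is enough because cosine ignores vector length.
            self.centroids[best].update(vec)
            self.members[best].append(tweet)
        # Alert exactly once, when the cluster first reaches the count threshold.
        if len(self.members[best]) == self.alert_count_threshold:
            return best
        return None


clusterer = GreedyStreamClusterer(similarity_threshold=0.3, alert_count_threshold=3)
stream = [
    "critical rce in acme vpn",
    "lunch was great today",
    "acme vpn rce exploited in the wild",
    "patch acme vpn rce now",
]
alerts = [clusterer.add(tweet) for tweet in stream]
```

In this sketch a higher similarity threshold splits the stream into more, smaller clusters, so fewer of them reach the alert count threshold; a lower one merges loosely related tweets and raises alerts more readily.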

About this paper

Cite this paper

Riebe, T. et al. (2021). CySecAlert: An Alert Generation System for Cyber Security Events Using Open Source Intelligence Data. In: Gao, D., Li, Q., Guan, X., Liao, X. (eds) Information and Communications Security. ICICS 2021. Lecture Notes in Computer Science, vol 12918. Springer, Cham. https://doi.org/10.1007/978-3-030-86890-1_24

DOI: https://doi.org/10.1007/978-3-030-86890-1_24
Published: 17 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86889-5
Online ISBN: 978-3-030-86890-1
eBook Packages: Computer Science, Computer Science (R0)
