ABSTRACT
Cybersecurity vulnerability information is often sourced from multiple channels, such as government vulnerability repositories, individually maintained vulnerability-gathering platforms, and vulnerability-disclosure mailing lists and forums. Integrating vulnerability information from these channels enables comprehensive threat assessment and rapid deployment to various security mechanisms. However, automatically integrating vulnerability information, especially records that lack a decisive identifier (e.g., a CVE-ID), is hindered by the limitations of today's entity alignment techniques.
In this study, we annotate and release the first cybersecurity-domain vulnerability alignment dataset and highlight the unique characteristics of security entities, including the inconsistent artifacts (e.g., impact and affected version) that different vulnerability repositories record for the same vulnerability. Based on these characteristics, we propose an entity alignment model, CEAM, for integrating vulnerability information from multiple sources. CEAM equips graph neural network-based entity alignment techniques with two application-driven mechanisms: asymmetric masked aggregation and partitioned attention. Together, these mechanisms use an asymmetric mask to selectively aggregate vulnerability artifacts when learning semantic embeddings for vulnerabilities, while ensuring that the artifacts critical to vulnerability identification always receive greater weight. Experimental results on vulnerability alignment datasets demonstrate that CEAM significantly outperforms state-of-the-art entity alignment methods.
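The abstract does not give the formulation of the two mechanisms, so the following is only an illustrative sketch of the general idea and not CEAM's implementation. It assumes that "asymmetric" means each source applies its own visibility mask over artifact types, and that "partitioned" means attention is normalized separately within a critical and a non-critical partition, with a fixed share of the total weight reserved for the critical one. The function names, the artifact type names, the dot-product scoring, and the `critical_share` value of 0.7 are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(artifacts, query, visible_types, critical_types, critical_share=0.7):
    """Combine artifact vectors into one vulnerability embedding.

    artifacts      : list of (type_name, vector) pairs
    query          : vector used to score each artifact by dot product (assumed)
    visible_types  : artifact types this source's mask lets through (assumed)
    critical_types : artifact types placed in the critical partition (assumed)
    critical_share : attention mass reserved for the critical partition (assumed)
    """
    # Masking: drop artifact types that are hidden for this source.
    kept = [(t, v) for t, v in artifacts if t in visible_types]
    critical = [a for a in kept if a[0] in critical_types]
    other = [a for a in kept if a[0] not in critical_types]

    # Partitioning: split the attention mass between the two partitions.
    # If one partition is empty, the other receives all of the mass.
    if critical and other:
        shares = [(critical, critical_share), (other, 1.0 - critical_share)]
    else:
        shares = [(part, 1.0) for part in (critical, other) if part]

    weights = []
    embedding = [0.0] * len(query)
    for part, share in shares:
        scores = [sum(q * x for q, x in zip(query, vec)) for _, vec in part]
        # Softmax is taken inside each partition, then scaled by its share.
        for (name, vec), w in zip(part, softmax(scores)):
            weights.append((name, share * w))
            embedding = [e + share * w * x for e, x in zip(embedding, vec)]
    return embedding, weights


# Toy artifacts for one vulnerability; the vectors are made up.
artifacts = [
    ("product", [1.0, 0.0]),
    ("version", [0.0, 1.0]),
    ("impact", [1.0, 1.0]),
]
query = [1.0, 1.0]

# Source A sees every artifact; source B masks out "impact".
emb_a, w_a = aggregate(artifacts, query, {"product", "version", "impact"}, {"product"})
emb_b, w_b = aggregate(artifacts, query, {"product", "version"}, {"product"})
```

In this toy setup the `product` artifact keeps the reserved share of the weight under both masks, however highly the non-critical artifacts score, which is one simple way to guarantee that identification-critical artifacts are never outweighed.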
Index Terms
- Vulnerability Intelligence Alignment via Masked Graph Attention Networks