DOI: 10.1145/3465481.3465767
research-article

Virtual Knowledge Graphs for Federated Log Analysis

Published: 17 August 2021

Abstract

Security professionals rely extensively on log data to monitor IT infrastructures and investigate potentially malicious activities. Existing systems support these tasks by collecting log messages in a database, from which log events can be queried and correlated. Such centralized approaches are typically based on a relational model and store log messages as plain text, which offers limited flexibility for representing heterogeneous log events and the connections between them. A knowledge graph representation can overcome such limitations and enable graph pattern-based log analysis that leverages semantic relationships between objects appearing in heterogeneous log streams. In this paper, we present a method to dynamically construct such log knowledge graphs at query time, i.e., without a priori parsing, aggregation, processing, and materialization of log data. Specifically, for a given query formulated in SPARQL, the method dynamically constructs a virtual log knowledge graph directly from heterogeneous raw log files across multiple hosts and contextualizes the result with internal and external background knowledge. We evaluate the approach across multiple heterogeneous log sources and machines; the results are encouraging and indicate that the approach is viable and facilitates ad-hoc graph-analytic queries in federated settings.
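
The idea of building graph data only at query time can be illustrated with a minimal sketch. This is not the authors' implementation and uses no SPARQL engine: it is a plain-Python stand-in, and the log lines, host names, predicate names (`outcome`, `sourceIp`), and the functions `extract_triples` and `federated_query` are all hypothetical, chosen only for illustration. Raw lines stay unparsed until a query runs; each query then extracts triples from every host on the fly and keeps only the matching events.

```python
import re
from typing import Iterator

# Hypothetical raw logs per host (assumption: in a real deployment these
# would be log files living on separate machines, not an in-memory dict).
RAW_LOGS = {
    "host-a": [
        "Aug 17 10:01:02 host-a sshd[311]: Failed password for root from 10.0.0.5",
        "Aug 17 10:01:09 host-a sshd[311]: Accepted password for alice from 10.0.0.7",
    ],
    "host-b": [
        "Aug 17 10:02:30 host-b sshd[512]: Failed password for bob from 10.0.0.5",
    ],
}

# Illustrative pattern for one sshd message shape; other line formats are skipped.
LINE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) sshd\[\d+\]: "
    r"(?P<outcome>Failed|Accepted) password for (?P<user>\S+) from (?P<ip>\S+)$"
)


def extract_triples(host: str) -> Iterator[tuple]:
    """Lazily parse one host's raw lines into (subject, predicate, object) triples."""
    for i, line in enumerate(RAW_LOGS[host]):
        m = LINE.match(line)
        if m is None:
            continue  # line not covered by the pattern
        event = f"{host}/event/{i}"  # hypothetical event identifier
        yield (event, "timestamp", m["ts"])
        yield (event, "outcome", m["outcome"])
        yield (event, "user", m["user"])
        yield (event, "sourceIp", m["ip"])


def federated_query(predicate: str, obj: str) -> list:
    """Return ids of events on any host that carry the given predicate/object pair."""
    hits = []
    for host in sorted(RAW_LOGS):
        # Triples are produced here, at query time; nothing is stored beforehand.
        for s, p, o in extract_triples(host):
            if p == predicate and o == obj:
                hits.append(s)
    return hits


# Join two patterns: failed logins that originate from one source address.
failed = set(federated_query("outcome", "Failed"))
from_ip = set(federated_query("sourceIp", "10.0.0.5"))
suspicious = sorted(failed & from_ip)
print(suspicious)
```

Run as written, the sketch prints the two failed-login events from `10.0.0.5`, one on each hypothetical host. A real system would express the join as a SPARQL graph pattern and push the extraction to the hosts themselves, which this sketch does not attempt.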

Published In

ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security
August 2021
1447 pages
ISBN: 9781450390514
DOI: 10.1145/3465481
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Decentralized Log Querying
  2. Dynamic Log Extraction
  3. Forensics
  4. Semantic Log Analysis
  5. Virtual Log Graphs

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2021

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%

Article Metrics

  • Downloads (Last 12 months): 26
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 17 Feb 2025

Cited By

  • (2024) Threat Detection Framework Based on Industrial Internet of Things Logs. IEEE Access 12, 195642-195657. https://doi.org/10.1109/ACCESS.2024.3514097. Online publication date: 2024
  • (2024) A literature review and existing challenges on software logging practices. Empirical Software Engineering 29, 4. https://doi.org/10.1007/s10664-024-10452-w. Online publication date: 18-Jun-2024
  • (2023) GLAD: Content-Aware Dynamic Graphs For Log Anomaly Detection. 2023 IEEE International Conference on Knowledge Graph (ICKG), 9-18. https://doi.org/10.1109/ICKG59574.2023.00007. Online publication date: 1-Dec-2023
  • (2022) VloGraph: A Virtual Knowledge Graph Framework for Distributed Security Log Analysis. Machine Learning and Knowledge Extraction 4, 2, 371-396. https://doi.org/10.3390/make4020016. Online publication date: 11-Apr-2022
  • (2022) Modeling virtual knowledge graphs using relevant news data by NLP methods for business analysis. 2022 17th International Conference on Emerging Technologies (ICET), 172-177. https://doi.org/10.1109/ICET56601.2022.10004674. Online publication date: 29-Nov-2022
  • (2022) Ontology-Driven Artificial Intelligence in IoT Forensics. Breakthroughs in Digital Biometrics and Forensics, 257-286. https://doi.org/10.1007/978-3-031-10706-1_12. Online publication date: 15-Oct-2022
