Authors: Tobias Eljasik-Swoboda 1 and Wilhelm Demuth 2

Affiliations: 1 ONTEC AG, Ernst-Melchior-Gasse 24/DG, 1100 Vienna, Austria ; 2 Schoeller Network Control GmbH, Ernst-Melchior-Gasse 24/DG, 1100 Vienna, Austria

Keyword(s): Industrial Applications of AI, Intelligence and Cybersecurity, Machine Learning, Natural Language Processing, Trainer/Athlete Pattern, Log Analysis, Log Management, Event Normalization, Security Information and Event Management, Big Data.

Abstract: When introducing log management or Security Information and Event Management (SIEM) practices, organizations are frequently challenged by Gartner’s 3 Vs of Big Data: There is a large volume of data which is generated at a rapid velocity. These first two Vs can be effectively handled by current scale-out architectures. The third V is that of variety, which affects log management efforts through the lack of a common mandatory format for log files. Essentially every component can log its events differently, and the way events are logged can change with every software update. This paper describes the Log Analysis Machine Learner (LAMaLearner) system. It uses a blend of different Artificial Intelligence techniques to overcome variety issues and identify relevant events within log files. LAMaLearner is able to cluster events and generate human-readable representations for all events within a cluster. A human being can annotate these clusters with specific labels. Once these labels exist, LAMaLearner leverages machine-learning-based natural language processing techniques to label events even in changing log formats. Additionally, LAMaLearner is capable of identifying previously known named entities occurring anywhere within the logged event as well as identifying frequently co-occurring variables in otherwise fixed log events. In order to stay up to date, LAMaLearner includes a continuous feedback interface that facilitates active learning. In experiments with multiple differently formatted log files, LAMaLearner was capable of reducing the labeling effort by up to three orders of magnitude. Models trained on this labeled data achieved > 93% F1 in detecting relevant event classes. This way, LAMaLearner helps log management and SIEM operations in three ways: Firstly, it creates a quick overview of the content of previously unknown log files. Secondly, it can be used to massively reduce the required manual effort in log management and SIEM operations. Thirdly, it identifies commonly co-occurring values within logs which can be used to identify otherwise unknown aspects of large log files.
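
To make the idea of clustering events into human-readable representations concrete, here is a minimal sketch. It is not LAMaLearner's actual algorithm, which the abstract does not specify; it assumes a much simpler stand-in technique in which variable tokens (IP addresses, hex values, numbers) are masked with placeholders and events are grouped by the resulting template. The names `template_of`, `cluster_events`, and the masking rules are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical masking rules (an assumption for this sketch, not from the paper).
# Order matters: IP addresses are masked before plain numbers.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def template_of(line: str) -> str:
    """Replace variable-looking tokens with placeholders and normalize whitespace."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return " ".join(line.split())


def cluster_events(lines):
    """Group raw log lines by their masked template.

    The template doubles as a human-readable representation of the cluster.
    """
    clusters = defaultdict(list)
    for line in lines:
        clusters[template_of(line)].append(line)
    return dict(clusters)


# Made-up sample events, used only to exercise the sketch.
sample = [
    "Failed login for user 1001 from 10.0.0.5",
    "Failed login for user 1002 from 10.0.0.9",
    "Disk usage at 91 percent",
]
clusters = cluster_events(sample)
for template, members in clusters.items():
    print(f"{len(members):>3}  {template}")
```

In a setup like this, an annotator would label each template once rather than each individual event, which is one plausible way a large reduction in labeling effort could arise; the reduction reported in the abstract comes from the authors' own system and experiments, not from this sketch.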

CC BY-NC-ND 4.0

Paper citation in several formats:
Eljasik-Swoboda, T. and Demuth, W. (2020). Leveraging Clustering and Natural Language Processing to Overcome Variety Issues in Log Management. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-395-7; ISSN 2184-433X, SciTePress, pages 281-288. DOI: 10.5220/0008856602810288

@conference{icaart20,
author={Tobias Eljasik{-}Swoboda and Wilhelm Demuth},
title={Leveraging Clustering and Natural Language Processing to Overcome Variety Issues in Log Management},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2020},
pages={281--288},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008856602810288},
isbn={978-989-758-395-7},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Leveraging Clustering and Natural Language Processing to Overcome Variety Issues in Log Management
SN - 978-989-758-395-7
IS - 2184-433X
AU - Eljasik-Swoboda, T.
AU - Demuth, W.
PY - 2020
SP - 281
EP - 288
DO - 10.5220/0008856602810288
PB - SciTePress