Abstract:
Conformance checking lets auditors detect process deviations automatically, but it typically yields a large number of deviations, of which only a few are relevant. Identifying these notable items within such a large population is challenging. Machine learning techniques offer potential solutions, but questions remain about how many labeled deviations are required and how label quality affects the outcome. Our study investigates the effects of these factors on Decision Trees and Random Forests. The results demonstrate that these models are effective at identifying notable items within imbalanced deviation populations. Achieving 90% precision and recall is feasible with about 400 to 600 labeled deviations, depending on the fraction of notable items in the population: a higher fraction of notables reduces the number of labeled deviations required. Varying the label quality produced similar results. Additionally, classifications that identify at least 90% of the notable items are linked to less complex processes.
Date of Conference: 14-18 October 2024
Date Added to IEEE Xplore: 25 September 2024