Authors:
Toon Stuyck (1) and Eric Demeester (2)
Affiliations:
(1) BASF Antwerpen, BASF, Antwerpen, Belgium; (2) Department of Mechanical Engineering, ACRO Research Group, KU Leuven, Diepenbeek, Belgium
Keyword(s):
Synthetic Data, Augmented Data, Generative Adversarial Network, Chemical Foam, Classification, Explainable AI.
Abstract:
One of the main challenges of using machine learning in the chemical sector is a lack of high-quality labeled data. Data on certain events, e.g. an anomaly during a production process, can be extremely rare or very costly to generate. Even when data is available, it often requires highly educated observers to annotate it correctly. The performance of supervised classification algorithms can be drastically reduced when they are confronted with limited amounts of training data. Data augmentation is typically used to increase the amount of available training data, but it carries a risk of overfitting or loss of information. In recent years, Generative Adversarial Networks have been able to generate realistic-looking synthetic data, even from small amounts of training data. This paper demonstrates the feasibility of using synthetic data generated by a Generative Adversarial Network to improve classification results, via a comparison with and without standard augmentation methods such as scaling and rotation. It also proposes a methodology for combining original and synthetic data to achieve the best classifier result and for quantitatively verifying the generalization of the classifier using an explainable AI method. The proposed methodology compares favourably to using no augmentation or standard augmentation methods in the case of classification of chemical foam.
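As a rough illustration of what combining original and synthetic data can look like in practice, the sketch below builds a training set in which a chosen share of the samples is synthetic. The abstract does not describe the actual mixing strategy, so the function name `mix_training_set`, the `synthetic_fraction` parameter, and the choice to keep every original sample are all assumptions made for this example, not the authors' method.

```python
import random


def mix_training_set(original, synthetic, synthetic_fraction, seed=0):
    """Hypothetical helper: combine original and synthetic (sample, label) pairs.

    All original samples are kept. Synthetic samples are drawn at random so
    that they make up roughly `synthetic_fraction` of the result, capped by
    how many synthetic samples are available.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_synth / (n_orig + n_synth) = synthetic_fraction for n_synth.
    n_synth = round(len(original) * synthetic_fraction / (1.0 - synthetic_fraction))
    n_synth = min(n_synth, len(synthetic))
    chosen = rng.sample(synthetic, n_synth)
    # Tag each sample with its origin so results can be analysed per source later.
    mixed = [(x, y, "original") for x, y in original]
    mixed += [(x, y, "synthetic") for x, y in chosen]
    rng.shuffle(mixed)
    return mixed


# Usage with placeholder data: 30 original and 100 generated samples.
original = [(f"img_{i}", "foam") for i in range(30)]
synthetic = [(f"gan_{i}", "foam") for i in range(100)]
train = mix_training_set(original, synthetic, synthetic_fraction=0.25)
```

Sweeping `synthetic_fraction` over several values and comparing classifier accuracy on a held-out set of original data only is one plausible way to search for a good ratio. Whether this matches the procedure in the paper cannot be determined from the abstract.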