Machine Learning to Detect Inconsistent Data | IEEE Conference Publication | IEEE Xplore

Abstract:

Nowadays, information is one of the main assets of companies and government entities; it facilitates decision-making and the determination of policies. However, the results are not always satisfactory because the underlying information is of low quality. Duplicate, inconsistent, incomplete, outdated, and imprecise data are the most common causes of poor information quality and, in turn, of poor analysis results. Data cleaning is therefore fundamental: it must be carried out before any analysis of a data set, yet it is a cumbersome process. This article gives an overview of how machine-learning techniques may be used to simplify the task of cleaning data. Here, machine learning is used to identify inconsistent data. The case study is a data set from a government entity, drawn from a program focused on early-childhood needs. The experimental results show that supervised learning identifies inconsistent data better than unsupervised learning and that it is an efficient way to clean data in this dimension.
Date of Conference: 29 November 2022 - 01 December 2022
Date Added to IEEE Xplore: 03 January 2023
Conference Location: Palapye, Botswana
