Abstract:
Data generation plays a significant role in many areas of computer science. Software testing is probably the canonical example of the use of artificially created data. An appropriate data generator is useful, and often necessary, for almost every type of testing (including automated testing): regression tests, null-value tests, coverage tests, security tests, and performance tests. With the rise of data science, data generation is also used in machine learning, data mining, and data visualization. Other industries, such as finance and health care, benefit greatly from artificial data as well. An important aspect of generated data is that it must be realistic but not real, which preserves confidentiality and privacy. In this paper, we give a short survey of the different types of generators from an architectural point of view, describe their intended usage, and list their pros and cons. Finally, we give an overview of the data generation algorithms in use and the best practices in different areas.
Date of Conference: 08-11 September 2019
Date Added to IEEE Xplore: 23 January 2020