Abstract:
Reference datasets are databases that guide the development and evaluation of tools and methods in several areas of Computer Science. In the field of Information Security, in particular, there is a notable need for tools devoted to detection and classification. Hence, the availability of such datasets is fundamental: the reference dataset is seen as a standard against which a tool must be tested to characterize its accuracy. In this sense, the reference dataset is analogous to the classic primary standard in metrology, in that it provides the most trustworthy reference against which an object under evaluation can be compared. It is therefore of great importance to devote efforts to the development of methods that assure the quality of reference datasets. In the present work, we discuss the challenges faced by currently available datasets and propose directions towards the development of reliable datasets. Finally, we propose a methodology for the construction of reference datasets for Online Social Networks and present a case study on the construction of a Twitter dataset for the detection of social bots.
Published in: 2018 Workshop on Metrology for Industry 4.0 and IoT
Date of Conference: 16-18 April 2018
Date Added to IEEE Xplore: 09 August 2018