Authors: Lucimar de A. Lial Moura 1; Marcus Albert A. da Silva 2; Kelli de Faria Cordeiro 3, 1 and Maria Cláudia Cavalcanti 2, 1

Affiliations: 1 Departamento de Sistemas e Computação, Instituto Militar de Engenharia (IME), Rio de Janeiro, RJ, Brazil ; 2 Departamento de Engenharia de Defesa, Instituto Militar de Engenharia (IME), Rio de Janeiro, RJ, Brazil ; 3 Centro de Análise de Sistemas Navais (CASNAV), Rio de Janeiro, RJ, Brazil

Keyword(s): Data Preprocessing, Training and Test Datasets, Ontology, UFO, Provenance.

Abstract: In the knowledge discovery process, a set of activities guides the data preprocessing phase; one of them is the transformation of raw data into training and test data. This complex, multidisciplinary phase involves concepts and knowledge that are structured in distinct and particular ways across the literature and specialized tools, so it demands data scientists with suitable expertise. In this work, we present PPO-O, a reference ontology of data preprocessing operators, to identify and represent the semantics of the concepts related to the data preprocessing phase. Moreover, the ontology highlights the data preprocessing operators used to prepare training and test datasets. Based on PPO-O, the Assistant-PP tool was developed; it captures retrospective data provenance during the execution of data preprocessing operators, facilitating the reproducibility and explainability of the datasets created. This approach might be helpful to users who are not experts in data preprocessing.

CC BY-NC-ND 4.0

Paper citation in several formats:
Moura, L.; da Silva, M.; Cordeiro, K. and Cavalcanti, M. (2021). A Well-founded Ontology to Support the Preparation of Training and Test Datasets. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS; ISBN 978-989-758-509-8; ISSN 2184-4992, SciTePress, pages 99-110. DOI: 10.5220/0010460000990110

@conference{iceis21,
author={Lucimar de A. Lial Moura and Marcus Albert A. {da Silva} and Kelli de Faria Cordeiro and Maria Cláudia Cavalcanti},
title={A Well-founded Ontology to Support the Preparation of Training and Test Datasets},
booktitle={Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2021},
pages={99-110},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010460000990110},
isbn={978-989-758-509-8},
issn={2184-4992},
}

TY - CONF
JO - Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - A Well-founded Ontology to Support the Preparation of Training and Test Datasets
SN - 978-989-758-509-8
IS - 2184-4992
AU - Moura, L.
AU - da Silva, M.
AU - Cordeiro, K.
AU - Cavalcanti, M.
PY - 2021
SP - 99
EP - 110
DO - 10.5220/0010460000990110
PB - SciTePress