Paper

Authors: Kilho Shin 1 ; Chris Liu 2 ; Katsuyuki Maeda 1 and Hiroaki Ohshima 3

Affiliations: 1 Computer Centre, Gakushuin University, Mejiro, Tokyo, Japan ; 2 Deloitte Tohmatsu Cyber LLC., Marunouchi, Tokyo, Japan ; 3 Graduate School of Information Science, University of Hyogo, Kobe, Hyogo, Japan

Keyword(s): Feature Selection, Categorical Data.

Abstract: In feature selection, we grapple with two primary challenges: devising effective evaluative indices for selected feature subsets and crafting scalable algorithms rooted in these indices. Our study addresses both. Beyond assessing the size and class relevance of selected features, we introduce a new index, nuisance. It captures class-uncorrelated information, which can muddy subsequent processes. Our experiments confirm that a good balance between class relevance and nuisance improves classification accuracy. To this end, we present the Balance-Optimized Relevance and Nuisance Feature Selection (BornFS) algorithm. It not only scales to large datasets but also outperforms traditional methods by achieving a better balance among the introduced indices. Notably, when applied to a dataset of 800,000 Windows executables, using LCC as a preprocessing filter, BornFS reduces the feature count from 10 million to under 200 while maintaining high accuracy in malware detection. Our findings shed light on the complexities of feature selection and point to directions for future work.
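
To make the two indices concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that class relevance is measured as the mutual information I(F;C) between a feature subset F and the class C, and that nuisance is the remaining feature entropy H(F|C) = H(F) - I(F;C); the abstract does not give the exact definitions, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def relevance_and_nuisance(rows, labels):
    """Return (relevance, nuisance) for a categorical feature subset.

    ASSUMPTION (not taken from the paper): relevance = I(F;C) and
    nuisance = H(F|C), where F is the joint value of the selected features.
    """
    feats = [tuple(r) for r in rows]             # joint value of the subset
    h_f = entropy(feats)                         # H(F)
    h_c = entropy(labels)                        # H(C)
    h_fc = entropy(list(zip(feats, labels)))     # H(F, C)
    relevance = h_f + h_c - h_fc                 # I(F;C)
    nuisance = h_f - relevance                   # H(F|C)
    return relevance, nuisance


# Toy data: feature a determines the class, feature b is independent noise.
labels = [0, 0, 1, 1]
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]

only_a = relevance_and_nuisance([(x,) for x in a], labels)
a_and_b = relevance_and_nuisance(list(zip(a, b)), labels)
```

Under these assumed definitions, adding the noise feature b leaves relevance unchanged but raises nuisance, which is the kind of imbalance the abstract says can hurt downstream classification.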

CC BY-NC-ND 4.0

Paper citation in several formats:
Shin, K.; Liu, C.; Maeda, K. and Ohshima, H. (2024). BornFS: Feature Selection with Balanced Relevance and Nuisance and Its Application to Very Large Datasets. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4; ISSN 2184-433X, SciTePress, pages 1100-1107. DOI: 10.5220/0012436000003636

@conference{icaart24,
author={Kilho Shin and Chris Liu and Katsuyuki Maeda and Hiroaki Ohshima},
title={BornFS: Feature Selection with Balanced Relevance and Nuisance and Its Application to Very Large Datasets},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={1100-1107},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012436000003636},
isbn={978-989-758-680-4},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - BornFS: Feature Selection with Balanced Relevance and Nuisance and Its Application to Very Large Datasets
SN - 978-989-758-680-4
IS - 2184-433X
AU - Shin, K.
AU - Liu, C.
AU - Maeda, K.
AU - Ohshima, H.
PY - 2024
SP - 1100
EP - 1107
DO - 10.5220/0012436000003636
PB - SciTePress