Abstract:
Feature selection is the process of choosing a subset of relevant features for model building in machine learning and data mining. It removes irrelevant and redundant features while improving learning performance. In most real-world applications, however, collecting labeled data is time-consuming and expensive, whereas unlabeled data is plentiful and easily available. Semi-supervised feature selection methods address this problem by using both labeled and unlabeled data to assess feature relevance. This paper proposes a new semi-supervised feature selection algorithm for partially labeled data based on the PageRank centrality concept, an efficient method for calculating the prominence of web pages on the Internet. The proposed method, called semi-supervised graph-based feature selection (SGFS), constructs a complete weighted graph based on the relevance of features to the labels and the redundancy among features. The PageRank algorithm then estimates the prominence of each graph vertex (feature). The algorithm is compared against multiple semi-supervised feature selection methods on several datasets of various dimensions to demonstrate its accuracy.
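The general idea of ranking features with PageRank on a complete weighted graph can be illustrated with a minimal sketch. The abstract does not state how SGFS computes its edge weights, so the weighting below is a hypothetical stand-in (the edge from feature i to feature j gets weight relevance[j] * (1 - redundancy[i][j])) and should not be read as the authors' formula. The function names `build_feature_graph`, `weighted_pagerank`, and `select_top_k` are likewise assumptions made for illustration; the relevance and redundancy scores are taken as given inputs.

```python
from typing import List


def build_feature_graph(relevance: List[float],
                        redundancy: List[List[float]]) -> List[List[float]]:
    """Build a complete weighted directed graph over features.

    ASSUMPTION (not from the paper): the edge i -> j is weighted by
    relevance[j] * (1 - redundancy[i][j]), so rank flows toward features
    that are relevant to the labels and not redundant with the source.
    """
    n = len(relevance)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                weights[i][j] = relevance[j] * (1.0 - redundancy[i][j])
    return weights


def weighted_pagerank(weights: List[List[float]], damping: float = 0.85,
                      tol: float = 1e-10, max_iter: int = 1000) -> List[float]:
    """Power-iteration PageRank on a weighted adjacency matrix."""
    n = len(weights)
    out_strength = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_strength[i] == 0.0:
                # Dangling vertex: spread its rank uniformly.
                share = damping * rank[i] / n
                for j in range(n):
                    new_rank[j] += share
            else:
                for j in range(n):
                    new_rank[j] += (damping * rank[i]
                                    * weights[i][j] / out_strength[i])
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


def select_top_k(rank: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-ranked features."""
    order = sorted(range(len(rank)), key=lambda i: rank[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    relevance = [0.9, 0.5, 0.1]
    redundancy = [[0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]]
    graph = build_feature_graph(relevance, redundancy)
    scores = weighted_pagerank(graph)
    print(select_top_k(scores, 2))
```

In a semi-supervised setting, one plausible choice is to estimate relevance from the labeled samples only and redundancy from all samples, labeled and unlabeled; that split is also an assumption here rather than something the abstract specifies.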
Date of Conference: 23-24 February 2022
Date Added to IEEE Xplore: 27 May 2022