Paper
21 May 1999 Stepwise linear discriminant analysis in computer-aided diagnosis: the effect of finite sample size
Abstract
In computer-aided diagnosis, a frequently used approach is to first extract several potentially useful features from a data set. Effective features are then selected from this feature space, and a classifier is designed using the selected features. In this study, we investigated the effect of finite sample size on classifier accuracy when classifier design involves feature selection. The feature selection and classifier coefficient estimation stages of classifier design were implemented using stepwise feature selection and Fisher's linear discriminant analysis, respectively. The two classes used in our simulation study were assumed to have multidimensional Gaussian distributions, with a large number of features available for feature selection. We investigated the effect of different covariance matrices and means for the two classes on feature selection performance, and compared two strategies for sample space partitioning for classifier design and testing. Our results indicated that the resubstitution estimate was always optimistically biased, except in cases where too few features were selected by the stepwise procedure. When feature selection was performed using only the design samples, the hold-out estimate was always pessimistically biased. When feature selection was performed using the entire finite sample space, and the data were subsequently partitioned into design and test groups, the hold-out estimates could be pessimistically or optimistically biased, depending on the number of features available for selection, the number of available samples, and their statistical distribution. All hold-out estimates exhibited a pessimistic bias when the parameters of the simulation were obtained from texture features extracted from mammograms in a previous study.
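The two partitioning strategies compared in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' simulation. It assumes identity covariance and identical class means (so any accuracy above 0.5 is pure bias), measures performance as the fraction of correctly classified samples, and replaces stepwise selection with a simpler top-k ranking by a two-sample t statistic. The function names and parameter values (`simulate`, `top_k_features`, 50 samples per class, 100 features, 5 selected) are hypothetical choices made for illustration.

```python
import numpy as np

def top_k_features(X, y, k):
    """Stand-in for stepwise selection: keep the k features with largest |t|."""
    a, b = X[y == 0], X[y == 1]
    na, nb = len(a), len(b)
    pooled = (a.var(axis=0, ddof=1) * (na - 1)
              + b.var(axis=0, ddof=1) * (nb - 1)) / (na + nb - 2)
    t = (b.mean(axis=0) - a.mean(axis=0)) / np.sqrt(pooled * (1 / na + 1 / nb))
    return np.argsort(-np.abs(t))[:k]

def fit_lda(X, y):
    """Fisher's linear discriminant: w = S^-1 (m1 - m0), midpoint threshold."""
    a, b = X[y == 0], X[y == 1]
    m0, m1 = a.mean(axis=0), b.mean(axis=0)
    S = (np.atleast_2d(np.cov(a, rowvar=False)) * (len(a) - 1)
         + np.atleast_2d(np.cov(b, rowvar=False)) * (len(b) - 1))
    S /= len(a) + len(b) - 2
    w = np.linalg.solve(S, m1 - m0)
    return w, w @ (m0 + m1) / 2

def accuracy(w, c, X, y):
    """Fraction of samples whose discriminant score falls on the correct side."""
    return np.mean((X @ w > c).astype(int) == y)

def simulate(n_per_class=50, n_features=100, k=5, trials=200, seed=0):
    """Return mean (resubstitution, hold-out with selection on design only,
    hold-out with selection on the entire sample) accuracies."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)
    out = np.zeros((trials, 3))
    for i in range(trials):
        # Assumption: both classes are N(0, I), so the true accuracy is 0.5.
        X = rng.standard_normal((2 * n_per_class, n_features))
        design = np.zeros(2 * n_per_class, dtype=bool)
        for cls in (0, 1):  # stratified split: half of each class for design
            idx = rng.permutation(np.flatnonzero(y == cls))
            design[idx[: n_per_class // 2]] = True
        Xd, yd, Xt, yt = X[design], y[design], X[~design], y[~design]

        # Strategy 1: select features using the design samples only.
        sel = top_k_features(Xd, yd, k)
        w, c = fit_lda(Xd[:, sel], yd)
        out[i, 0] = accuracy(w, c, Xd[:, sel], yd)  # resubstitution
        out[i, 1] = accuracy(w, c, Xt[:, sel], yt)  # hold-out

        # Strategy 2: select using the entire sample, then partition.
        sel = top_k_features(X, y, k)
        w, c = fit_lda(Xd[:, sel], yd)
        out[i, 2] = accuracy(w, c, Xt[:, sel], yt)  # hold-out
    return out.mean(axis=0)

resub, holdout_design_only, holdout_entire = simulate()
print(f"resubstitution:              {resub:.3f}")
print(f"hold-out, select on design:  {holdout_design_only:.3f}")
print(f"hold-out, select on entire:  {holdout_entire:.3f}")
```

Because the classes are indistinguishable in this sketch, it can only expose optimistic bias: the resubstitution estimate and the hold-out estimate with selection on the entire sample should both sit above 0.5, while the hold-out estimate with selection on the design samples should stay near 0.5. Reproducing the pessimistic bias reported in the abstract would require classes with a genuine difference in means, which this sketch does not model.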
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Berkman Sahiner, Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, and Lubomir M. Hadjiiski "Stepwise linear discriminant analysis in computer-aided diagnosis: the effect of finite sample size", Proc. SPIE 3661, Medical Imaging 1999: Image Processing, (21 May 1999); https://doi.org/10.1117/12.348606
CITATIONS
Cited by 4 scholarly publications.
KEYWORDS
Feature selection
Statistical analysis
Computer aided design
Matrices
Computer aided diagnosis and therapy
Feature extraction
Mammography