Abstract:
This paper discusses the challenging problem of mining data sets that have numerical attributes and, at the same time, missing attribute values. We distinguish between two interpretations of missing attribute values: lost values and "do not care" conditions. In our experiments, we used the LERS data mining system to induce certain and possible rule sets, based on the rough set theory ideas of lower and upper approximations, respectively. The LERS data mining system has two options for computing approximations, global and local, and we used both in our experiments. Additionally, we used a probabilistic approach to missing attribute values, one of the most successful traditional methods of handling them. Using the Wilcoxon matched-pairs signed rank test (5% level of significance, two-tailed), we observed that the probabilistic approach was either worse than or not better than the rough set approaches.
Published in: 2011 IEEE International Conference on Granular Computing
Date of Conference: 08-10 November 2011
Date Added to IEEE Xplore: 05 January 2012