Mining Numerical Data – A Rough Set Approach

Transactions on Rough Sets XI

We present an approach to mining numerical data based on rough set theory, using the calculus of attribute-value blocks. An algorithm implementing these ideas, called MLEM2, induces high-quality rules in terms of both simplicity (number of rules and total number of conditions) and accuracy. MLEM2 induces rules not only from complete data sets but also from data with missing attribute values, with or without numerical attributes. Additionally, we present experimental results comparing MLEM2 with three commonly used discretization techniques: equal interval width, equal interval frequency, and minimal class entropy, each combined with the LEM2 rule induction algorithm. Our conclusion is that even though MLEM2 was most frequently the winner, the differences among the four data mining methods are not statistically significant.
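
To make two of the discretization techniques named above concrete, the following is a minimal sketch of equal interval width and equal interval frequency discretization of a single numerical attribute. It is an illustration only, not the implementation used in the experiments: the function names, the choice of midpoints as cutpoints, and the handling of tied values are assumptions introduced here.

```python
import bisect
from typing import List


def equal_width_cutpoints(values: List[float], k: int) -> List[float]:
    """Return k - 1 cutpoints splitting [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cutpoints(values: List[float], k: int) -> List[float]:
    """Return at most k - 1 cutpoints so each interval holds roughly len(values) / k cases.

    Assumption: a cutpoint is the midpoint of two neighbouring distinct values;
    a split that would fall between equal values is skipped.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts: List[float] = []
    for i in range(1, k):
        idx = i * n // k
        if idx <= 0 or idx >= n:
            continue
        left, right = ordered[idx - 1], ordered[idx]
        if left < right:
            cut = (left + right) / 2
            if not cuts or cut > cuts[-1]:  # avoid repeating a cutpoint
                cuts.append(cut)
    return cuts


def discretize(value: float, cutpoints: List[float]) -> int:
    """Return the index of the interval that contains value (0-based)."""
    return bisect.bisect_right(cutpoints, value)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0]
    print(equal_width_cutpoints(data, 2))
    print(equal_frequency_cutpoints(data, 3))
```

On skewed data the two methods can place cutpoints very differently: equal width follows the range of the attribute, while equal frequency follows the distribution of cases.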

