Abstract
In today’s fast-paced computerized world, many business organizations are overwhelmed by huge and fast-growing volumes of information. It is becoming difficult for traditional database systems to manage this data effectively. Knowledge Discovery in Databases (KDD) and Data Mining became popular in the 1980s as solutions to this data overload problem. In the past ten years, Rough Sets theory has been found to be a good mathematical approach for simplifying both the KDD and Data Mining processes. In this paper, KDD and Data Mining are examined from a Rough Sets perspective. Based on the Rough Sets research on KDD done at the University of Regina, we describe the attribute-oriented approach to KDD. We then describe the linkage between KDD and Rough Sets techniques and propose unifying KDD and Data Mining within a Rough Sets framework for better overall research achievement. In the real world, dirty data is a critical issue in many organizations. We describe in detail how this framework, KDD with a Rough Sets approach, can be applied to solve a real-world dirty data problem.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Johnson, J., Liu, M., Chen, H. (2001). Unification of Knowledge Discovery and Data Mining using Rough Sets Approach in A Real-World Application. In: Ziarko, W., Yao, Y. (eds) Rough Sets and Current Trends in Computing. RSCTC 2000. Lecture Notes in Computer Science, vol 2005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45554-X_40
DOI: https://doi.org/10.1007/3-540-45554-X_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43074-2
Online ISBN: 978-3-540-45554-7
eBook Packages: Springer Book Archive