Abstract
Rough Set Theory (RST) opened a new direction in the development of theories of incomplete information and is a powerful data analysis tool. This work demonstrates that the theory can be used to generate a priori knowledge about a dataset. It proposes characterizing training sets in advance using RST-based estimation measures. This characterization assesses the quality of the data before they are used as a training set for machine learning techniques. The proposal was studied experimentally on international databases with well-known classifiers such as MLP, C4.5 and k-NN, and satisfactory results were obtained.
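The abstract does not say which RST measures the proposal uses. As a minimal sketch, assuming the classical Pawlak measures (accuracy of approximation and quality of classification) rather than the chapter's own definitions, the following Python computes them for a toy decision table. The table, the attribute names `a`, `b`, `d` and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attrs):
    """Group row indices that share the same values on `attrs`."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(rows, attrs):
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


def quality_of_classification(rows, attrs, decision):
    """Share of objects lying in the lower approximation of their own class."""
    by_class = defaultdict(set)
    for i, row in enumerate(rows):
        by_class[row[decision]].add(i)
    positive = set()
    for target in by_class.values():
        lower, _ = approximations(rows, attrs, target)
        positive |= lower
    return len(positive) / len(rows)


# Hypothetical toy training set: rows 0 and 1 are indiscernible on (a, b)
# but carry different decisions, so they are inconsistent.
rows = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]

yes = {i for i, r in enumerate(rows) if r["d"] == "yes"}
lower, upper = approximations(rows, ["a", "b"], yes)
accuracy = len(lower) / len(upper)
gamma = quality_of_classification(rows, ["a", "b"], "d")
print(lower, upper, accuracy, gamma)
```

Under these assumptions, a quality of classification below 1 signals that some training examples cannot be separated by the condition attributes, which is the kind of a priori indication of training-set quality the abstract describes.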
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Caballero, Y., Bello, R., Arco, L., García, M., Ramentol, E. (2010). Knowledge Discovery Using Rough Set Theory. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_18
DOI: https://doi.org/10.1007/978-3-642-05177-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05176-0
Online ISBN: 978-3-642-05177-7
eBook Packages: Engineering, Engineering (R0)