On Scalability of Rough Set Methods

Kwiatkowski, Piotr; Nguyen, Sinh Hoa; Nguyen, Hung Son

doi:10.1007/978-3-642-14055-6_30

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 80)

Abstract

This paper presents recent results on the scalability of rough set based classification methods. The proposed solution builds on the close relationship between the reduct calculation problem in rough set theory and the association rule generation problem, and continues our previous work (see, e.g., [10], [11]). In this paper, the set of decision rules that match the test object is generated directly from the training data set. To make the method scalable, we adopt the idea of the FP-growth algorithm for frequent itemsets [7], [6]. Experimental results on several benchmark data sets show that the proposed solution can process growing data sets.
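The idea of generating, directly from the training data, only those decision rules that match a given test object can be illustrated with a minimal Python sketch. It is not the authors' algorithm and does not use FP-growth: it enumerates descriptor subsets by brute force, which is exponential in the number of attributes and is exactly the cost that a frequent-pattern structure is meant to avoid. The function names `rules_matching` and `classify`, the `min_support` threshold, the support-weighted vote, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from collections import Counter


def rules_matching(test_obj, training, min_support=2):
    """Return minimal consistent rules whose conditions the test object satisfies.

    test_obj: dict mapping attribute -> value
    training: list of (dict attribute -> value, decision) pairs
    Each rule is (frozenset of (attribute, value) descriptors, decision, support).
    """
    descriptors = sorted(test_obj.items())
    found = []
    # Grow condition sets by size so that shorter (more general) rules come first.
    for size in range(1, len(descriptors) + 1):
        for combo in combinations(descriptors, size):
            cond = frozenset(combo)
            # Keep only minimal rules: skip supersets of a rule already found.
            if any(prev <= cond for prev, _, _ in found):
                continue
            # Count decisions of training objects that match every descriptor.
            decisions = Counter(
                dec for row, dec in training
                if all(row.get(a) == v for a, v in combo)
            )
            support = sum(decisions.values())
            # Accept the rule if it is frequent enough and points to one decision.
            if support >= min_support and len(decisions) == 1:
                (decision,) = decisions
                found.append((cond, decision, support))
    return found


def classify(test_obj, training, min_support=2):
    """Vote among matching rules, weighting each rule by its support (an assumption)."""
    votes = Counter()
    for _, decision, support in rules_matching(test_obj, training, min_support):
        votes[decision] += support
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy decision table.
training = [
    ({"outlook": "sunny", "windy": "no", "humidity": "high"}, "stay"),
    ({"outlook": "sunny", "windy": "yes", "humidity": "high"}, "stay"),
    ({"outlook": "rainy", "windy": "no", "humidity": "high"}, "play"),
    ({"outlook": "rainy", "windy": "no", "humidity": "low"}, "play"),
    ({"outlook": "sunny", "windy": "no", "humidity": "low"}, "play"),
]
test_obj = {"outlook": "sunny", "windy": "no", "humidity": "high"}

if __name__ == "__main__":
    for cond, decision, support in rules_matching(test_obj, training):
        print(sorted(cond), "=>", decision, "support:", support)
    print("predicted:", classify(test_obj, training))
```

Because only descriptors taken from the test object are considered, no rule that fails to match it is ever built; this is the property that connects the task to frequent itemset mining, where the test object plays the role of the item universe.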

References

[1] Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules, Menlo Park, CA, USA. American Association for Artificial Intelligence, pp. 307–328 (1996)
[2] Bazan, J.G.: A comparison of dynamic and non-dynamic rough set methods for extracting laws from decision tables. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 321–365. Physica-Verlag, Heidelberg (1998)
[3] Bondi, A.B.: Characteristics of scalability and their impact on performance. In: WOSP 2000: Proceedings of the 2nd international workshop on Software and performance, pp. 195–203. ACM, New York (2000)
[4] Fayyad, U.M., Haussler, D., Stolorz, P.E.: Mining scientific data. Commun. ACM 39(11), 51–57 (1996)
[5] Grahne, G., Zhu, J.: High performance mining of maximal frequent itemsets. In: Proceedings of 6th International Workshop on High Performance Data Mining, HPDM 2003 (2003)
[6] Han, J., Kamber, M.: Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, San Francisco (2000)
[7] Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Chen, W., Naughton, J., Bernstein, P.A. (eds.) 2000 ACM SIGMOD Intl. Conference on Management of Data, May 2000, pp. 1–12. ACM Press, New York (2000)
[8] Komorowski, H.J., Pawlak, Z., Polkowski, L.T., Skowron, A.: Rough Sets: A Tutorial, pp. 3–98. Springer, Singapore (1999)
[9] Kwiatkowski, P.: Scalable classification method based on FP-growth algorithm (in Polish). Master’s thesis, Warsaw University (2008)
[10] Nguyen, H.S.: Scalable classification method based on rough sets. In: Alpigini, J.J., Peters, J.F., Skowron, A., Zhong, N. (eds.) RSCTC 2002. LNCS (LNAI), vol. 2475, pp. 433–440. Springer, Heidelberg (2002)
[11] Nguyen, H.S.: Approximate boolean reasoning: Foundations and applications in data mining 4100, 334–506 (2006)
[12] Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning about Data. Theory and decision library. D: System theory, knowledge engineering and problem solving, vol. 9. Kluwer Academic Publishers, Dordrecht (1991)
[13] Shafer, J.C., Agrawal, R., Mehta, M.: Sprint: A scalable parallel classifier for data mining. In: Vijayaraman, T.M., et al. (eds.) VLDB 1996, Proceedings of 22nd International Conference on Very Large Data Bases, Mumbai, India, September 3-6, pp. 544–555. Morgan Kaufmann, San Francisco (1996)
[14] Skowron, A., Rauszer, C.M.: The discernibility matrices and functions in information systems, ch. 3, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1992)
[15] Stefanowski, J.: On rough set based approaches to induction of decision rules. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 500–529. Physica-Verlag, Heidelberg (1998)
[16] Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
[17] Wroblewski, J.: Covering with reducts - a fast algorithm for rule generation. In: Polkowski, L., Skowron, A. (eds.) RSCTC 1998. LNCS (LNAI), vol. 1424, pp. 402–407. Springer, Heidelberg (1998)
[18] Ziarko, W.: Rough sets as a methodology for data mining. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 554–571. Physica-Verlag, Heidelberg (1998)

Author information

Authors and Affiliations

Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Piotr Kwiatkowski & Hung Son Nguyen
Polish-Japanese Institute of Inf. Technology, Koszykowa 86, 02008, Warszawa, Poland
Sinh Hoa Nguyen

Editor information

Editors and Affiliations

Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Marburg, Germany
Eyke Hüllermeier
Department of Knowledge Processing and Language Engineering, Otto-von-Guericke University of Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Rudolf Kruse
Fakultät für Elektrotechnik und Informationstechnik, Technische Universität Dortmund, 44221, Dortmund, Germany
Frank Hoffmann

About this paper

Cite this paper

Kwiatkowski, P., Nguyen, S.H., Nguyen, H.S. (2010). On Scalability of Rough Set Methods. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Methods. IPMU 2010. Communications in Computer and Information Science, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14055-6_30

DOI: https://doi.org/10.1007/978-3-642-14055-6_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14054-9
Online ISBN: 978-3-642-14055-6
eBook Packages: Computer Science, Computer Science (R0)
