On Scalability of Rough Set Methods

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 80)

Abstract

This paper presents recent results on the scalability of rough set based classification methods. The proposed solution rests on the close relationship between the reduct calculation problem in rough set theory and the association rule generation problem, and it continues our previous work (see, e.g., [10], [11]). The set of decision rules matching the test object is generated directly from the training data set. To make this scalable, we adopt the idea of the FP-growth algorithm for frequent itemsets [7], [6]. Experimental results on several benchmark data sets show that the proposed solution can process growing data sets.
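As a rough illustration of generating rules for a test object directly from the training data, the sketch below projects every training object onto the attribute values it shares with the test object and then looks for attribute subsets that are frequent and point to one decision. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `matching_rules`, the thresholds `min_support` and `min_conf`, and the toy data are all hypothetical, and a brute-force enumeration of attribute subsets stands in for the FP-growth structure that the paper relies on for scalability.

```python
from collections import Counter
from itertools import combinations


def matching_rules(train, test_obj, min_support=2, min_conf=1.0):
    """Enumerate decision rules whose premises are satisfied by test_obj.

    train    -- list of (attribute dict, decision) pairs (hypothetical layout)
    test_obj -- attribute dict of the object to classify
    Returns a list of (premise attributes, decision, support, confidence).
    """
    # Keep, for each training object, only the attributes on which it
    # agrees with the test object; these play the role of "items".
    projected = [
        (frozenset(a for a, v in obj.items() if test_obj.get(a) == v), dec)
        for obj, dec in train
    ]
    attrs = sorted(test_obj)
    rules = []
    # Brute force over attribute subsets (assumption: a stand-in for
    # FP-growth, feasible only for a handful of attributes).
    for size in range(1, len(attrs) + 1):
        for premise in combinations(attrs, size):
            wanted = set(premise)
            covered = Counter(dec for items, dec in projected if wanted <= items)
            support = sum(covered.values())
            if support < min_support:
                continue
            decision, hits = covered.most_common(1)[0]
            confidence = hits / support
            if confidence >= min_conf:
                rules.append((premise, decision, support, confidence))
    return rules


# Toy decision table, invented for illustration only.
train = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
]
test_obj = {"outlook": "sunny", "windy": "yes"}
rules = matching_rules(train, test_obj)
print(rules)
```

On this toy table the only rule that survives both thresholds is "outlook = sunny implies play", supported by two training objects. The enumeration is exponential in the number of attributes, which is the cost a frequent-pattern structure such as an FP-tree is meant to avoid.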

References

  1. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules, Menlo Park, CA, USA. American Association for Artificial Intelligence, pp. 307–328 (1996)

  2. Bazan, J.G.: A comparison of dynamic and non-dynamic rough set methods for extracting laws from decision tables. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 321–365. Physica-Verlag, Heidelberg (1998)

  3. Bondi, A.B.: Characteristics of scalability and their impact on performance. In: WOSP 2000: Proceedings of the 2nd international workshop on Software and performance, pp. 195–203. ACM, New York (2000)

  4. Fayyad, U.M., Haussler, D., Stolorz, P.E.: Mining scientific data. Commun. ACM 39(11), 51–57 (1996)

  5. Grahne, G., Zhu, J.: High performance mining of maximal frequent itemsets. In: Proceedings of 6th International Workshop on High Performance Data Mining, HPDM 2003 (2003)

  6. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, San Francisco (2000)

  7. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Chen, W., Naughton, J., Bernstein, P.A. (eds.) 2000 ACM SIGMOD Intl. Conference on Management of Data, May 2000, pp. 1–12. ACM Press, New York (2000)

  8. Komorowski, H.J., Pawlak, Z., Polkowski, L.T., Skowron, A.: Rough Sets: A Tutorial, pp. 3–98. Springer, Singapore (1999)

  9. Kwiatkowski, P.: Scalable classification method based on FP-growth algorithm (in Polish). Master’s thesis, Warsaw University (2008)

  10. Nguyen, H.S.: Scalable classification method based on rough sets. In: Alpigini, J.J., Peters, J.F., Skowron, A., Zhong, N. (eds.) RSCTC 2002. LNCS (LNAI), vol. 2475, pp. 433–440. Springer, Heidelberg (2002)

  11. Nguyen, H.S.: Approximate Boolean reasoning: Foundations and applications in data mining. In: Transactions on Rough Sets V. LNCS, vol. 4100, pp. 334–506. Springer, Heidelberg (2006)

  12. Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning about Data. Theory and decision library. D: System theory, knowledge engineering and problem solving, vol. 9. Kluwer Academic Publishers, Dordrecht (1991)

  13. Shafer, J.C., Agrawal, R., Mehta, M.: Sprint: A scalable parallel classifier for data mining. In: Vijayaraman, T.M., et al. (eds.) VLDB 1996, Proceedings of 22nd International Conference on Very Large Data Bases, Mumbai, India, September 3-6, pp. 544–555. Morgan Kaufmann, San Francisco (1996)

  14. Skowron, A., Rauszer, C.M.: The discernibility matrices and functions in information systems, ch. 3, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1992)

  15. Stefanowski, J.: On rough set based approaches to induction of decision rules. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 500–529. Physica-Verlag, Heidelberg (1998)

  16. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)

  17. Wroblewski, J.: Covering with reducts - a fast algorithm for rule generation. In: Polkowski, L., Skowron, A. (eds.) RSCTC 1998. LNCS (LNAI), vol. 1424, pp. 402–407. Springer, Heidelberg (1998)

  18. Ziarko, W.: Rough sets as a methodology for data mining. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery 1. Methodology and Applications. Studies in Fuzziness and Soft Computing, pp. 554–571. Physica-Verlag, Heidelberg (1998)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kwiatkowski, P., Nguyen, S.H., Nguyen, H.S. (2010). On Scalability of Rough Set Methods. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Methods. IPMU 2010. Communications in Computer and Information Science, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14055-6_30

  • DOI: https://doi.org/10.1007/978-3-642-14055-6_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14054-9

  • Online ISBN: 978-3-642-14055-6

  • eBook Packages: Computer Science, Computer Science (R0)
