Dependency and Granularity in Data Mining

  • Reference work entry
Encyclopedia of Complexity and Systems Science

Definition of the Subject

The degree of granularity of a contingency table is closely related to its degree of dependence. We investigate this relation from the viewpoints of determinantal divisors and determinants. The results on determinantal divisors suggest that the divisors provide information on the degree of dependence between the matrix of all elements and its submatrices, and that an increase in the degree of granularity may lead to an increase in dependency. However, another approach shows that the constraint on the sample size of a contingency table is very strong, which leads to an evaluation formula in which an increase in the degree of granularity gives a decrease in dependency.
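To make the notion of determinantal divisors concrete, the following Python sketch computes them for a small integer table. It assumes the standard matrix-theory definition, in which the \( k \)-th determinantal divisor is the greatest common divisor of all \( k \times k \) minors; whether this matches the exact formulation used in the full entry is an assumption. The function names `det` and `determinantal_divisors` and the sample tables are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Exact integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor obtained by deleting the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def determinantal_divisors(table):
    """Return [d_1, ..., d_r], where d_k is the gcd of all k x k minors."""
    n_rows, n_cols = len(table), len(table[0])
    divisors = []
    for k in range(1, min(n_rows, n_cols) + 1):
        g = 0  # gcd(0, x) == |x|, so 0 is a neutral starting value
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                sub = [[table[r][c] for c in cols] for r in rows]
                g = gcd(g, det(sub))
        divisors.append(g)
    return divisors


# Hypothetical contingency tables (cell counts), for illustration only.
print(determinantal_divisors([[2, 4], [6, 8]]))
print(determinantal_divisors([[1, 2], [2, 4]]))
```

The Laplace expansion is exponential in the matrix size, which is acceptable for the small tables considered here; a larger table would call for an exact elimination method instead.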

Introduction

Independence (dependence) is a very important concept in data mining, especially for feature selection. In rough sets [2], if two attribute-value pairs, say \( [c=0] \) and \( [d=0] \), are independent, their supporting sets, denoted by C and D, do not have...
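As a small illustration of independence in a contingency table, the sketch below checks a \( 2 \times 2 \) table of counts in two ways: by comparing each cell with the product of its marginals, and by testing whether the determinant is zero. For a \( 2 \times 2 \) table these two conditions are equivalent. The function name `independence_checks_2x2` and the sample tables are hypothetical; this is not taken from the entry itself.

```python
def independence_checks_2x2(table):
    """Return (marginal_check, det_check) for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # Independence: cell / n == (row / n) * (col / n) for every cell.
    # Cross-multiplied so that the comparison stays in exact integers.
    marginal_check = all(
        cell * n == row * col
        for cell, row, col in [
            (a, a + b, a + c),
            (b, a + b, b + d),
            (c, c + d, a + c),
            (d, c + d, b + d),
        ]
    )
    # Equivalent condition for a 2 x 2 table: zero determinant (rank 1).
    det_check = (a * d - b * c) == 0
    return marginal_check, det_check


# Hypothetical tables: the first has proportional rows, the second does not.
print(independence_checks_2x2([[2, 4], [3, 6]]))
print(independence_checks_2x2([[1, 2], [3, 4]]))
```

The two checks always agree because expanding \( a(a+b+c+d) = (a+b)(a+c) \) reduces to \( ad = bc \).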

Notes

  1. The threshold δ is the degree of the closeness of overlapping sets, which will be given by domain experts. For more information, please refer to Sect. “Rank of Contingency Table (\( 2 \times 2 \))”.

Bibliography

  1. Butz C (2002) Exploiting contextual independencies in web search and user profiling. In: Proceedings of World Congress on Computational Intelligence (WCCI'2002) (CD-ROM)

  2. Pawlak Z (1991) Rough sets. Kluwer, Dordrecht

  3. Skowron A, Grzymala-Busse J (1994) From rough set theory to evidence theory. In: Yager R, Fedrizzi M, Kacprzyk J (eds) Advances in the Dempster–Shafer Theory of Evidence. Wiley, New York, pp 193–236

  4. Tsumoto S (2000) Knowledge discovery in clinical databases and evaluation of discovered knowledge in outpatient clinic. Inf Sci 124:125–137

  5. Tsumoto S (2003) Statistical independence as linear independence. In: Skowron A, Szczuka M (eds) Electronic Notes in Theoretical Computer Science, vol 82. Elsevier

  6. Tsumoto S, Tanaka H (1996) Automated discovery of medical expert system rules from clinical databases based on rough sets. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining 96. AAAI Press, Palo Alto, pp 63–69

Acknowledgment

This work was supported by the Grant-in-Aid for Scientific Research (13131208) on Priority Areas (No. 759) “Implementation of Active Mining in the Era of Information Flood” by the Ministry of Education, Science, Culture, Sports, Science and Technology of Japan.

Copyright information

© 2009 Springer-Verlag

Cite this entry

Tsumoto, S., Hirano, S. (2009). Dependency and Granularity in Data Mining. In: Meyers, R. (eds) Encyclopedia of Complexity and Systems Science. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30440-3_119
