Dependence Factor as a Rule Evaluation Measure

Part of the book series: Studies in Computational Intelligence ((SCI,volume 605))

Abstract

Certainty factor and lift are known evaluation measures of association rules. Nevertheless, they do not guarantee accurate evaluation of the strength of dependence between a rule's constituents. In particular, even if there is the strongest possible positive or negative dependence between the rule's constituents X and Y, these measures may reach values quite close to the values indicating independence of X and Y. Recently, we have proposed a new measure called a dependence factor to overcome this drawback. Unlike in the case of the certainty factor, when defining the dependence factor, we took into account the fact that for a given rule \(X \rightarrow Y\), the minimal conditional probability of the occurrence of Y given X may be greater than 0, while its maximal possible value may be less than 1. In this paper, we first recall definitions and properties of all three measures. Then, we examine the dependence factor as an interestingness measure, and we examine the relationship between the dependence factor for X and Y and those for \(\bar{X}\) and Y, X and \(\bar{Y}\), and \(\bar{X}\) and \(\bar{Y}\), respectively. As a result, we obtain a number of new properties of the dependence factor.

Acknowledgments

We wish to thank an anonymous reviewer for constructive comments, which influenced the final version of this paper positively.

Author information

Corresponding author

Correspondence to Marzena Kryszkiewicz.

Appendix

Proof of Lemma 2

In the proof, we will use the following equations:

  • \(P(\bar{X}) = 1 - P(X)\),         \(P(\bar{Y}) = 1 - P(Y)\),

  • \(P(\bar{X}Y)=P(Y)-P(XY),\quad P(X\bar{Y}) = P(X)-P(XY),\)

  • \(P(\bar{X}\bar{Y}) = P(\bar{X}) - P(\bar{X}Y) = 1 - P(X)-P(Y)+P({ XY})\).
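The three identities above can be checked on a concrete example. The following is a minimal sketch, not part of the original proof; the function name `complement_probabilities` and the dictionary keys are hypothetical, chosen only for illustration. Exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction


def complement_probabilities(p_x, p_y, p_xy):
    """Derive probabilities involving complements from P(X), P(Y), P(XY).

    Hypothetical helper: it applies the identities listed above, in order.
    """
    p_not_x = 1 - p_x                    # P(not X) = 1 - P(X)
    p_not_y = 1 - p_y                    # P(not Y) = 1 - P(Y)
    p_not_x_and_y = p_y - p_xy           # P(not X, Y) = P(Y) - P(XY)
    p_x_and_not_y = p_x - p_xy           # P(X, not Y) = P(X) - P(XY)
    # P(not X, not Y) = P(not X) - P(not X, Y)
    p_not_x_and_not_y = p_not_x - p_not_x_and_y
    return {
        "not_x": p_not_x,
        "not_y": p_not_y,
        "not_x_and_y": p_not_x_and_y,
        "x_and_not_y": p_x_and_not_y,
        "not_x_and_not_y": p_not_x_and_not_y,
    }


if __name__ == "__main__":
    # Illustrative values only: P(X) = 3/5, P(Y) = 1/2, P(XY) = 2/5.
    p_x, p_y, p_xy = Fraction(3, 5), Fraction(1, 2), Fraction(2, 5)
    r = complement_probabilities(p_x, p_y, p_xy)
    # The closed form 1 - P(X) - P(Y) + P(XY) agrees with the two-step form.
    assert r["not_x_and_not_y"] == 1 - p_x - p_y + p_xy
    # The four joint probabilities partition the sample space.
    total = p_xy + r["not_x_and_y"] + r["x_and_not_y"] + r["not_x_and_not_y"]
    assert total == 1
    print(r)
```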

Ad (a)

Case \(P(\bar{X}\bar{Y}) >P(\bar{X}) \times P(\bar{Y})\):

This case is equivalent to the case when \(P({ XY}) >P(X)\times P(Y)\) (by Lemma 1a). Then:

\(df(\bar{X} \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&=\frac{P(\bar{X}\bar{Y})-P(\bar{X})\times P(\bar{Y})}{{max}\,{\_}{P(\bar{X}\bar{Y}|_{P(\bar{X}),P(\bar{Y})})}-P(\bar{X})\times P(\bar{Y})} = \text { /* by Proposition 6a */}\\&= \frac{(1 - P(X)-P(Y)+P({XY}))-(1 - P(X))\times (1 - P(Y))}{(1-\max \{ P({X}),P({Y})\}) - (1-P({X})) \times (1-P({Y}))}\\&=\frac{P({X}{Y})-P({X})\times P({Y})}{\min \{ P({X}),P({Y})\} -P({X})\times P({Y})} = \text { /* by Theorem 3 */ }\\&=df(X \rightarrow Y). \end{aligned}$$

Case \(P(\bar{X}\bar{Y}) = P(\bar{X})\times P(\bar{Y})\):

This case is equivalent to the case when \(P(XY) = P(X) \times P(Y)\) (by Lemma 1b). Then:

\({ df}(\bar{X} \rightarrow \bar{Y} ) = \text { /* by Proposition 3a */}\)

         \(\begin{aligned}&= 0 = \text { /* by Proposition 3a */}\\&= { df}(X \rightarrow Y). \end{aligned}\)

Case \(P(\bar{X}\bar{Y}) < P(\bar{X}) \times P(\bar{Y})\) and \(P(\bar{X}) + P(\bar{Y}) \le 1\):

This case is equivalent to the case when \(P({ XY}) < P(X) \times P(Y)\) (by Lemma 1c) and \(P(X)+P(Y)\ge 1\) (by Proposition 5c). Then:

\({ df}(\bar{X} \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&=-\frac{P(\bar{X})\times P(\bar{Y})-P(\bar{X}\bar{Y})}{P(\bar{X})\times P(\bar{Y})-\textit{min}\,{\_}{P(\bar{X}\bar{Y}|_{P(\bar{X}),P(\bar{Y})})}} = \text { /* by Proposition 6b */}\\&=- \frac{(1 - P(X))\times (1 - P(Y))-(1 - P(X)-P(Y)+P({XY}))}{(1-P({X})) \times (1-P({Y}))-\max \{0,1- P({X})-P({Y})\} }\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{(1-P({X})-P({Y})+P({X}) \times P({Y}))-(0)}\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{P({X})\times P({Y})- (P({X}) + P({Y})-1)}\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{P({X})\times P({Y})- \max \{0,P({X}) + P({Y})-1\}} = \text { /* by Theorem 3 */ }\\&= df(X\rightarrow Y). \end{aligned}$$

Case \(P(\bar{X}\bar{Y}) < P(\bar{X}) \times P(\bar{Y})\,\mathrm{and}\,P(\bar{X}) + P(\bar{Y}) > 1\):

This case is equivalent to the case when \(P(XY) < P(X) \times P(Y)\) (by Lemma 1c) and \(P(X)+P(Y) < 1\) (by Proposition 5c). Then:

\(df(\bar{X}\rightarrow \bar{Y}) = \text { /* by Proposition 3a */} \)

$$\begin{aligned}&= -\frac{P(\bar{X})\times P(\bar{Y})-P(\bar{X}\bar{Y})}{P(\bar{X})\times P(\bar{Y})-\textit{min}\,{\_}{P(\bar{X}\bar{Y}|_{P(\bar{X}),P(\bar{Y})})}} = \text { /* by Proposition 6b */} \\&=- \frac{(1 - P(X))\times (1 - P(Y))- (1- P(X)-P(Y)+P({XY}))}{(1-P({X})) \times (1-P({Y}))-\max \{0,1- P({X})-P({Y})\} }\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{(1-P({X})-P({Y})+P({X}) \times P({Y}))-(1-P(X)-P(Y))}\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{P({X})\times P({Y})-0}\\&=-\frac{P({X})\times P({Y})-P({X}{Y})}{P({X})\times P({Y})- \max \{0,P({X}) + P({Y})-1\}} = \text { /* by Theorem 3 */} \\&= df(X\rightarrow Y). \end{aligned}$$

Ad (b)

The proof is analogous to the proof of Lemma 1a.

Ad (c)

Case \(P(X\bar{Y})>P(X) \times P(\bar{Y})\) and \(P(X)\le P(\bar{Y})\):

This case is equivalent to the case when \(P(XY)< P(X)\times P(Y)\) (by Lemma 1c) and \(P(X) \le 1 - P(Y)\). Then:

\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&=\frac{P(X\bar{Y})-P(X)\times P(\bar{Y})}{max \,{\_}P(X\bar{Y}|_{P(X), P(\bar{Y})})-P(X)\times P(\bar{Y})} = \text { /* by Proposition 6c */} \\&= \frac{(P(X)-P(XY))-P(X)\times (1-P(Y))}{\min \{P(X), 1-P(Y)\}-P(X)\times (1-P(Y))}\\&= \frac{P(X)\times P(Y)-P(XY)}{P(X)\times P(Y)-0}\\&= \frac{P(X)\times P(Y)-P(XY)}{P(X)\times P(Y)-\max \{ 0, P(X)+P(Y)-1\}} = \text { /* by Theorem 3 */}\\&= - df(X \rightarrow Y). \end{aligned}$$

Case \(P(X\bar{Y}) > P(X) \times P(\bar{Y})\,\mathrm{and}\,P(X) > P(\bar{Y})\):

This case is equivalent to the case when \(P(XY)< P(X) \times P(Y)\) (by Lemma 1c) and \(P(X) > 1 - P(Y)\). Then:

\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&=\frac{P(X\bar{Y})-P(X)\times P(\bar{Y})}{max {\_}P(X\bar{Y}|_{P(X), P(\bar{Y})})-P(X)\times P(\bar{Y})} = \text { /* by Proposition 6c */}\\&= \frac{(P(X)-P(XY))-P(X)\times (1-P(Y))}{\min \{P(X), 1-P(Y)\}-P(X)\times (1-P(Y))}\\&= \frac{P(X)\times P(Y)-P(XY)}{(1- P(Y))-P(X)\times (1-P(Y))}\\&= \frac{P(X)\times P(Y)-P(XY)}{P(X)\times P(Y)-\max \{ 0, P(X)+P(Y)-1\}} = \text { /* by Theorem 3 */}\\&= - df(X \rightarrow Y). \end{aligned}$$

Case \(P(X\bar{Y})=P(X) \times P(\bar{Y})\):

This case is equivalent to the case when \(P(XY)=P(X)\times P(Y)\) (by Lemma 1b). Then:

\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

         \(\begin{aligned}&= 0 = \text { /* by Proposition 3a */} \\&= -df(X\rightarrow Y). \end{aligned}\)

Case \(P(X\bar{Y}) < P(X)\times P(\bar{Y})\) and \(P(X)+ P(\bar{Y}) \le 1\):

This case is equivalent to the case when \(P(XY) > P(X) \times P(Y)\) (by Lemma 1a) and \(P(X)\le P(Y)\). Then:

\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&= -\frac{P(X)\times P(\bar{Y})-P(X\bar{Y})}{P(X)\times P(\bar{Y})- min \, {\_}P(X\bar{Y}|_{P(X),P(\bar{Y})})}= \text { /* by Proposition 6d */} \\&=- \frac{P(X)\times (1-P(Y))-(P(X)-P(XY))}{P(X)\times (1-P(Y))-\max \{0, P(X)-P(Y)\}}\\&= - \frac{P(XY)-P(X)\times P(Y)}{(P(X)-P(X)\times P(Y))-(0)}\\&= - \frac{P(XY)-P(X)\times P(Y)}{\min \{P(X),P(Y)\}-P(X)\times P(Y)}= \text { /* by Theorem 3 */}\\&= - df(X\rightarrow Y). \end{aligned}$$

Case \(P(X\bar{Y}) < P(X) \times P(\bar{Y})\) and \(P(X)+P(\bar{Y}) > 1\):

This case is equivalent to the case when \(P(XY) > P(X) \times P(Y)\) (by Lemma 1a) and \(P(X)>P(Y)\). Then:

\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)

$$\begin{aligned}&= -\frac{P(X)\times P(\bar{Y})-P(X\bar{Y})}{P(X)\times P(\bar{Y})- min\, {\_}P(X\bar{Y}|_{P(X),P(\bar{Y})})}= \text { /* by Proposition 6d */}\\&=- \frac{P(X)\times (1-P(Y))-(P(X)-P(XY))}{P(X)\times (1-P(Y))-\max \{0, P(X)-P(Y)\}}\\&= - \frac{P(XY)-P(X)\times P(Y)}{P(X)\times (1-P(Y))-(P(X)-P(Y))}\\&= - \frac{P(XY)-P(X)\times P(Y)}{P(Y) -P(X)\times P(Y)}\\&= - \frac{P(XY)-P(X)\times P(Y)}{\min \{P(X),P(Y)\}-P(X)\times P(Y)}= \text { /* by Theorem 3 */}\\&= - df(X\rightarrow Y). \square \end{aligned}$$
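Parts (a) and (c) of the lemma can also be checked numerically. The sketch below is an illustration only, not the author's implementation: the function `df` encodes the three-branch closed form that the proof attributes to Theorem 3, as it appears in the last steps of the derivations above, and `check_lemma2` is a hypothetical helper that enumerates feasible probability triples on a rational grid. Part (b) is not checked, since its statement is not reproduced in this appendix.

```python
from fractions import Fraction


def df(p_x, p_y, p_xy):
    """Dependence factor of X -> Y, following the closed form used in the proof."""
    indep = p_x * p_y  # value of P(XY) under independence
    if p_xy > indep:
        # Positive dependence: scale by the largest feasible P(XY).
        return (p_xy - indep) / (min(p_x, p_y) - indep)
    if p_xy < indep:
        # Negative dependence: scale by the smallest feasible P(XY).
        return -(indep - p_xy) / (indep - max(0, p_x + p_y - 1))
    return Fraction(0)


def check_lemma2(n=12):
    """Check df(not X -> not Y) = df(X -> Y) and df(X -> not Y) = -df(X -> Y)
    for all feasible (P(X), P(Y), P(XY)) with denominator n.
    Returns the number of triples checked."""
    checked = 0
    for a in range(n + 1):
        for b in range(n + 1):
            p_x, p_y = Fraction(a, n), Fraction(b, n)
            lowest = max(Fraction(0), p_x + p_y - 1)  # smallest feasible P(XY)
            highest = min(p_x, p_y)                   # largest feasible P(XY)
            for c in range(n + 1):
                p_xy = Fraction(c, n)
                if not (lowest <= p_xy <= highest):
                    continue
                d = df(p_x, p_y, p_xy)
                # Part (a): complement both sides of the rule.
                assert df(1 - p_x, 1 - p_y, 1 - p_x - p_y + p_xy) == d
                # Part (c): complement the consequent only.
                assert df(p_x, 1 - p_y, p_x - p_xy) == -d
                checked += 1
    return checked


if __name__ == "__main__":
    print(check_lemma2())
```

Using `Fraction` rather than floating point keeps the equality tests exact. For feasible inputs neither denominator in `df` is zero, because the strict inequality in each branch forces the bound on P(XY) to differ from P(X) × P(Y).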

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Kryszkiewicz, M. (2016). Dependence Factor as a Rule Evaluation Measure. In: Matwin, S., Mielniczuk, J. (eds) Challenges in Computational Statistics and Data Mining. Studies in Computational Intelligence, vol 605. Springer, Cham. https://doi.org/10.1007/978-3-319-18781-5_12

  • DOI: https://doi.org/10.1007/978-3-319-18781-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18780-8

  • Online ISBN: 978-3-319-18781-5

  • eBook Packages: Engineering, Engineering (R0)
