
Abstract

Performing probabilistic inference in multi-target dynamic systems is a challenging task. When the system, its evidence and/or its targets evolve, most inference algorithms either recompute everything from scratch, even though incremental changes do not invalidate all the previous computations, or do not fully exploit incrementality to minimize computations. This incurs significant unnecessary overhead when the system under study is large. To alleviate this problem, we propose in this paper a new junction tree-based message-passing inference algorithm that, given a new query, minimizes computations by identifying precisely the set of messages that differ from those of the preceding computations. Experimental results highlight the efficiency of our approach.
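
As a rough illustration of the idea, the sketch below caches junction-tree messages between queries and recomputes a message only when a modified clique lies in the subtree that the message summarizes. It is a minimal hypothetical sketch in Python, not the paper's algorithm nor the aGrUM API: the class, its method names and the placeholder combine-and-project step are all assumptions for illustration.

    class IncrementalJT:
        """Hypothetical cache of junction-tree messages across queries."""

        def __init__(self, adjacency):
            self.adj = adjacency   # clique -> set of neighbour cliques (a tree)
            self.cache = {}        # (i, j) -> message previously sent from i to j
            self.dirty = set()     # cliques whose potential/evidence changed

        def invalidate(self, clique):
            """Record that the potential or evidence attached to clique changed."""
            self.dirty.add(clique)

        def _subtree(self, i, j):
            """Cliques on i's side of the edge (i, j) -- V_-j(i) in the paper."""
            seen, stack = {i}, [i]
            while stack:
                u = stack.pop()
                for v in self.adj[u]:
                    if v != j and v not in seen:
                        seen.add(v)
                        stack.append(v)
            return seen

        def message(self, i, j):
            """Reuse the cached message i -> j unless a dirty clique lies in V_-j(i)."""
            if (i, j) in self.cache and not (self.dirty & self._subtree(i, j)):
                return self.cache[(i, j)]
            incoming = [self.message(k, i) for k in self.adj[i] if k != j]
            msg = ("psi", i, j, tuple(incoming))  # placeholder combine-and-project
            self.cache[(i, j)] = msg
            return msg

        def finish_query(self):
            """Call once all messages needed by the current query are refreshed."""
            self.dirty.clear()

The naive subtree test makes this sketch traverse the tree for every message; the \(\lambda/\mu\) labels studied in the appendix below serve precisely to identify the affected messages without such traversals.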

Notes

  1. http://agrum.lip6.fr.

  2. In Fig. 3a, due to the small number of nodes and arcs in the BNs, percentages of modifications lower than 10% imply no modification at all, hence the lack of error bars.

  3. If \(k'=r\), then \(k\ne r'\) and the equality is also verified.

  4. All the nodes are computationally equivalent if \(\forall i\in \mathcal {V}(\mathcal {T}), i\in \mathcal {V}_1\), since then \(\mathcal {V}(\mathcal {T})=\mathcal {V}_1\).

Acknowledgments

This work was partially supported by IBM France Lab/ANRT CIFRE grant #2014/421.

Author information

Correspondence to Christophe Gonzales.

Appendix: Proofs

Proof of Proposition 1: Note that \( \mathcal {V}_{\text {-}{j}}({i}) =\{i\}\cup \bigcup _{k\in \mathrm{Adj}_{\text {-}{j}}(i)} \mathcal {V}_{\text {-}{i}}({k}) \) and that, for \(k\in \mathrm{Adj}_{\text {-}{j}}(i)\) and \(l'\in \mathcal {V}_{\text {-}{i}}({k}) \), we have \(Adj_j(l')=Adj_i(l')\). Using Definition 4, one can thus rewrite \(\mu _{{i}\rightarrow {j}}\) as:

$$\mu _{{i}\rightarrow {j}}=\displaystyle \bigcup _{\underset{\{l\}= Adj_{j}(l')}{l' \in \mathcal {V}_{\text {-}{j}}({i}) }} \lambda _{{l'}\rightarrow {l}} = \lambda _{{i}\rightarrow {j}}\cup \bigcup _{k\in \mathrm{Adj}_{\text {-}{j}}(i)}\overbrace{ \bigcup _{\underset{\{l\}=Adj_{j}(l')}{l'\in \mathcal {V}_{\text {-}{i}}({k}) }} \lambda _{{l'}\rightarrow {l}}}^{\mu _{{k}\rightarrow {i}}}=\lambda _{{i}\rightarrow {j}}\cup \,\bigcup _{k\in \mathrm{Adj}_{\text {-}{j}}(i)}{\mu _{{k}\rightarrow {i}}}$$

   \(\blacksquare \)
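
Proposition 1 translates directly into a recursive computation of the \(\mu \) labels. The sketch below is a hypothetical Python transcription, with labels represented as frozensets over \(\{T,\epsilon \}\) and the \(\lambda \) labels of Definition 4 assumed given as a dictionary:

    def mu(i, j, adj, lam, memo=None):
        """mu_{i->j} = lambda_{i->j} united with the mu_{k->i}, k in Adj(i), k != j."""
        if memo is None:
            memo = {}
        if (i, j) not in memo:
            label = set(lam.get((i, j), frozenset()))
            for k in adj[i]:
                if k != j:
                    label |= mu(k, i, adj, lam, memo)
            memo[(i, j)] = frozenset(label)
        return memo[(i, j)]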

Proof of Theorem 1 – mutual exclusivity: if property (a) is satisfied, then \(\mathcal {T}\) contains no edge; therefore, properties (b) and (c) cannot be satisfied.

Now, assume that there exist \(r_1,r_1'\) such that \(\mu _{{r_1'}\rightarrow {r_1}}=\mu _{{r_1}\rightarrow {r_1'}}=T\epsilon \) (property b). Let \(r_2\) be any clique in \(\mathcal {V}(\mathcal {T})\). Without loss of generality, assume that \(r_1\) lies on the path \(i_1=r_2,i_2,\ldots ,i_p=r_1'\) between \(r_2\) and \(r_1'\). Then, by Proposition 1, \(\mu _{{i_2}\rightarrow {r_2}} \supseteq \mu _{{i_3}\rightarrow {i_2}} \supseteq \cdots \supseteq \mu _{{r_1'}\rightarrow {r_1}} = T\epsilon \). Therefore, properties (b) and (c) cannot hold simultaneously.    \(\blacksquare \)

Proof of Theorem 1 – r's existence: if \(\mathcal {A}(\mathcal {T}) = \emptyset \), then property (a) holds and r is the unique node of \(\mathcal {T}\). Now, assume that \(\mathcal {A}(\mathcal {T}) \ne \emptyset \). If there exists an edge \((i,j) \in \mathcal {E}(\mathcal {T})\) such that \(\mu _{{i}\rightarrow {j}} = \mu _{{j}\rightarrow {i}} = T\epsilon \), then \(r = i\) satisfies property (b). Otherwise, neither property (a) nor property (b) holds. Assume that property (c) does not hold either. Then, for all edges (i, j), exactly one of \(\mu _{{i}\rightarrow {j}}\) and \(\mu _{{j}\rightarrow {i}}\) is equal to \(T\epsilon \), and the other one belongs to \(\{\emptyset ,\epsilon ,T\}\). Let \((i_0,j_0)\) be such that \(\mu _{{i_0}\rightarrow {j_0}} = T\epsilon \) and \(\mu _{{j_0}\rightarrow {i_0}} \ne T\epsilon \). If \(|\mathrm{Adj}(i_0)|=1\), then clique \(i_0\) satisfies property (c), a contradiction. Since property (b) does not hold either, there exists \(i_1 \in \mathrm{Adj}(i_0)\) such that \(\mu _{{i_1}\rightarrow {i_0}} = T\epsilon \) and \(\mu _{{i_0}\rightarrow {i_1}} \ne T\epsilon \). The same reasoning applies to \(i_1\): either \(i_1\) is a leaf, which contradicts the assumption that property (c) does not hold, or \(i_1\) has another neighbor \(i_2\) such that \(\mu _{{i_2}\rightarrow {i_1}} = T\epsilon \) and \(\mu _{{i_1}\rightarrow {i_2}} \ne T\epsilon \). By induction, we create a path \(i_0,i_1,\ldots ,i_n\) of maximal length. This path is necessarily finite since \(\mathcal {T}\) is a finite tree, hence clique \(i_n\) is a leaf which, therefore, satisfies property (c), a contradiction. Consequently, when properties (a) and (b) do not hold, property (c) holds.   \(\blacksquare \)
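
The existence proof is constructive, and a hypothetical transcription in code reads as follows. It assumes the premises established above: mu_label maps each directed arc to its \(\mu \) label and, when neither property (a) nor (b) applies, exactly one direction of every edge carries \(T\epsilon \).

    def find_root(adj, mu_label, TEPS=frozenset({"T", "eps"})):
        """Return the root r of Theorem 1, mirroring the constructive proof."""
        nodes = list(adj)
        if all(len(adj[n]) == 0 for n in nodes):          # property (a): no edge
            return nodes[0]
        for i in nodes:                                   # property (b): an edge
            for j in adj[i]:                              # labelled T-eps both ways
                if mu_label[(i, j)] == TEPS and mu_label[(j, i)] == TEPS:
                    return i
        # property (c): start from any T-eps arc and keep following incoming
        # T-eps arcs; the walk never backtracks and ends at a leaf, the root.
        i = next(u for u in nodes for v in adj[u] if mu_label[(u, v)] == TEPS)
        prev = None
        while True:
            nxt = next((k for k in adj[i]
                        if k != prev and mu_label[(k, i)] == TEPS), None)
            if nxt is None:
                return i
            prev, i = i, nxt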

One can now prove the optimality for each property of Theorem 1 separately, since these properties are mutually exclusive:

Proof of Theorem 1 – property (a)'s optimality: r is the only node in \(\mathcal {T}\). Choosing it as the root is therefore optimal.    \(\blacksquare \)

Lemma 1

Let \(i,j\in \mathcal {V}(\mathcal {T})\) be such that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(\mu _{{i}\rightarrow {j}}=\emptyset \). Then \(\forall l \in \mathcal {V}_{\text {-}{j}}({i}) : \delta (l)= \delta (j)+len(l\!-\!j)\).

Proof

Note that when \(\epsilon \notin \mu _{{j}\rightarrow {i}}\), \(\mathcal {T}\) is up-to-date in the current inference and there is no need to perform any computation. The proof is by induction on \(n=len(l\!-\!j)\). For \(n=1\), we have \(l=i\); so, by Eq. (2) and the fact that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(i\in Adj_i(j)\), we get \(\delta _{{j}\rightarrow {i}}(i)=1\). As a consequence, \(\delta (i)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}}\delta _{{k'}\rightarrow {k}}(i)+1\). Yet, as \(T\notin \mu _{{i}\rightarrow {j}}\), we have \(\delta _{{j}\rightarrow {i}}(j)=0\); so \(\delta (j)=\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}\delta _{{k'}\rightarrow {k}}(j)\). Since \(\epsilon \notin \mu _{{i}\rightarrow {j}}\), \(\delta _{{i}\rightarrow {j}}(i)=\delta _{{i}\rightarrow {j}}(j)=0\). For \((k',k)\ne (i,j),(j,i)\), we have \(Adj_i(k)=Adj_j(k)\) and \(Adj_i(k')=Adj_j(k')\); it follows that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(j)\). We conclude that \(\delta (i)=\delta (j)+1\).

Now suppose the property is satisfied for \(n-1\ge 1\); let us prove that it remains true for n. Let l be such that \(len(l\!-\!j) = n\), and let \(\{p\}=Adj_i(l)\), so that \(len(p\!-\!j)=n-1\). Then \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)\) because \(\delta _{{p}\rightarrow {l}}(l)=1\) (since, by Proposition 1, \(\epsilon \in \mu _{{j}\rightarrow {i}}\subseteq \mu _{{p}\rightarrow {l}}\), and \(\{l\}= Adj_l(p)\)). Knowing that \(T\notin \mu _{{l}\rightarrow {p}}\), we get \(\delta _{{p}\rightarrow {l}}(p)=0\); it follows that \(\delta (p)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)\). Now, using the same reasoning as in the case \(n=1\), and remarking that \(\delta _{{l}\rightarrow {p}}(p)=\delta _{{l}\rightarrow {p}}(l)=0\) because \(\epsilon \notin \mu _{{l}\rightarrow {p}}\), we conclude that \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)=1+\delta (p)\). By applying the induction hypothesis to p, where \(len(p\!-\!j)=n-1\), we obtain: \(\delta (l)=1+\delta (p)=1+n-1+\delta (j)=\delta (j)+n\).    \(\blacksquare \)
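
As a concrete instance of Lemma 1, consider a chain of cliques \(j - i - l\) with \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(\mu _{{i}\rightarrow {j}}=\emptyset \); by Proposition 1, \(\epsilon \in \mu _{{i}\rightarrow {l}}\) and \(\mu _{{l}\rightarrow {i}}=\emptyset \). The base case and one induction step then telescope:

$$\delta (i)=\delta (j)+1, \qquad \delta (l)=\delta (i)+1=\delta (j)+2=\delta (j)+len(l\!-\!j).$$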

Lemma 2

Let \(\mathcal {V}_1=\{r\in \mathcal {V}(\mathcal {T}): \exists k\in \mathrm{Adj}(r), \mu _{{r}\rightarrow {k}}=\mu _{{k}\rightarrow {r}}=T\epsilon \}\). Then, for any \(r,r'\in \mathcal {V}_1\), we have \(\delta (r)=\delta (r')\).

Proof

Assume that \(|\mathcal {V}_1|>1\) (otherwise the result is trivial). By Proposition 1, the nodes in \(\mathcal {V}_1\) form a connected subgraph. Let \(r,r' \in \mathcal {V}_1\) be such that \((r,r') \in \mathcal {E}(\mathcal {T})\). Finally, let \((k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}\). If \(k'\notin \{r,r'\}\), then either \(k=r\), \(k=r'\) or \(k\notin \{r,r'\}\), and in all these cases we have \(Adj_r(k')=Adj_{r'}(k')\), hence \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). Otherwise, assume that \(k'=r'\); then \(k\ne r\) and we also have (see Footnote 3) \(Adj_r(k)=Adj_{r'}(k)\), and again \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). As a consequence: \(\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r')\). By Eq. (2), we get: \(\delta _{{r}\rightarrow {r'}}(r)+\delta _{{r'}\rightarrow {r}}(r)= \delta _{{r}\rightarrow {r'}}(r')+\delta _{{r'}\rightarrow {r}}(r')=2\). We conclude that \(\delta (r)-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r)= \delta (r')-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r')\). Hence \(\delta (r)=\delta (r')\), and the connectedness of \(\mathcal {V}_1\) extends this equality to any pair of its nodes.    \(\blacksquare \)

Proof of Theorem 1 – property (b)'s optimality: under the notations of property (b), it is sufficient to prove that \(\delta (r)\le \delta (i)\) for any i not in \(\mathcal {V}_1\) (see Footnote 4). Without loss of generality, assume that \(i\in \mathcal {V}_{\text {-}{r'}}({r}) \). Let \((k,k')\in \mathcal {A}(i\!-\!r)\), where \(\mathcal {A}(i\!-\!r)\) is the set of arcs induced by the path \(i\!-\!r\). We either have \(\{k'\}=Adj_r(k)\) or \(\{k\}=Adj_r(k')\). Assume for instance that \(\{k'\}=Adj_r(k)\) with \(k\ne r\); the second case is treated similarly. Then \(\mu _{{k'}\rightarrow {k}}=T\epsilon \) and, by applying Eq. (2), we summarize the results in the following table:

[Table (rendered as an image in the original): the values of \(\delta _{{k'}\rightarrow {k}}(r)\) and \(\delta _{{k'}\rightarrow {k}}(i)\) in each case of Eq. (2).]

We conclude that \(\sum _{{(k',k)\in \mathcal {A}(i\!-\!r)}}\delta _{{k'}\rightarrow {k}}(r)\le \sum _{(k',k)\in \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(i)\). (1)

Now, for \((k',k)\notin \mathcal {A}(i\!-\!r)\), it is easy to see that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(r)\), hence: \(\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(r)\!=\! \sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r) }\delta _{{k'}\rightarrow {k}}(i)\). (2)

By summing (1) and (2), we get \(\delta (r)\le \delta (i)\) for \(i\notin \mathcal {V}_1\). Moreover, by Lemma 2, \(\delta (r)=\delta (i)\) for any \(i\in \mathcal {V}_1\); together with \(\delta (r)\le \delta (i)\) for any \(i\notin \mathcal {V}_1\), this yields \(r\in Argmin_{i\in \mathcal {V}(\mathcal {T})}\,\delta (i)\).    \(\blacksquare \)
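
Computationally, the optimal root of property (b) can thus be retrieved as an argmin of \(\delta \). In the hypothetical sketch below, the per-arc indicators \(\delta _{{k'}\rightarrow {k}}(i)\) of Eq. (2), which is stated in the body of the paper rather than reproduced in this excerpt, are abstracted as a user-supplied function:

    def best_root(adj, delta_arc):
        """argmin over cliques i of delta(i) = sum over arcs (k', k) in A(T)."""
        arcs = [(kp, k) for k in adj for kp in adj[k]]   # both directions of each edge
        return min(adj, key=lambda i: sum(delta_arc(kp, k, i) for kp, k in arcs))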

Proof of Theorem 1 – property (c)'s optimality: let \(i\in \mathcal {V}(\mathcal {T})\) be such that \(i\ne r\).

First case: \(\mu _{{Adj_i(r)}\rightarrow {r}}=\emptyset \). Assume that labels T and \(\epsilon \) occur in \(\mathcal {V}_{\text {-}{i}}({r}) \) (otherwise there is no need to perform any computation, as there is either no query or no modification in \(\mathcal {T}\)). By Lemma 1, we then have \(\delta (i)=\delta (r)+len(i\!-\!r)\), because \(i\in \mathcal {V}_{\text {-}{r}}({Adj_i(r)}) \). Hence \(\delta (r)<\delta (i)\).

Second case: \(\mu _{{Adj_i(r)}\rightarrow {r}}\in \{T,\epsilon \}\). We omit the details, but one should use the same methodology as in the proof of property (b)'s optimality, together with the fact that, for any \(k,k'\) in \(i\!-\!r\) such that \(\{k'\}=Adj_r(k)\): \(\mu _{{k}\rightarrow {k'}}=\mu _{{i}\rightarrow {Adj_r(i)}}\), and examine \(\delta _{{k'}\rightarrow {k}}(r)\) and \(\delta _{{k'}\rightarrow {k}}(i)\).    \(\blacksquare \)

Proof of Proposition 2: given a root r, \(\delta _{{i}\rightarrow {j}}(r)\) indicates, by construction, that \(\psi _{{i}\rightarrow {j}}\) is necessary for the current inference and has been invalidated since the previous one. As a consequence, the current inference only needs to recompute such messages, for any \(i,j\in \mathcal {V}(\mathcal {T})\).   \(\blacksquare \)
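
In other words, a collect phase toward the chosen root r only has to refresh the invalidated-and-needed messages. A minimal hypothetical sketch, assuming the cache still holds all messages of the previous inference and abstracting both the indicator \(\delta _{{i}\rightarrow {j}}(r)\) and the combine-and-project step:

    def collect(r, adj, needs_recompute, cache, compute_message):
        """Collect toward root r, refreshing only invalidated-and-needed messages."""
        def send(i, j):
            for k in adj[i]:
                if k != j:
                    send(k, i)                  # gather from i's subtree first
            if needs_recompute(i, j):           # plays the role of delta_{i->j}(r)
                incoming = [cache[(k, i)] for k in adj[i] if k != j]
                cache[(i, j)] = compute_message(i, j, incoming)
        for k in adj[r]:
            send(k, r)
        return cache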

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Agli, H., Bonnard, P., Gonzales, C., Wuillemin, PH. (2016). Incremental Junction Tree Inference. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 610. Springer, Cham. https://doi.org/10.1007/978-3-319-40596-4_28

  • DOI: https://doi.org/10.1007/978-3-319-40596-4_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40595-7

  • Online ISBN: 978-3-319-40596-4
