Abduction with probabilistic logic programming under the distribution semantics

https://doi.org/10.1016/j.ijar.2021.11.003

Abstract

In Probabilistic Abductive Logic Programming we are given a probabilistic logic program, a set of abducible facts, and a set of constraints. Inference in probabilistic abductive logic programs aims to find a subset of the abducible facts that is compatible with the constraints and that maximizes the joint probability of the query and the constraints. In this paper, we extend the PITA reasoner with an algorithm to perform abduction on probabilistic abductive logic programs exploiting Binary Decision Diagrams. Tests on several synthetic datasets show the effectiveness of our approach.

Introduction

Probabilistic Logic Programming (PLP) [1], [2] has recently attracted a lot of interest thanks to its ability to represent several scenarios [3], [4] with a simple yet powerful language. Furthermore, the possibility of integrating it with sub-symbolic systems makes it a relevant component of explainable probabilistic models [5].

An extension of Logic Programming that can manage incompleteness in the data is Abductive Logic Programming (ALP) [6], [7]. The goal of abduction is to find, given a set of hypotheses called abducibles, a subset of them that explains an observed fact. With ALP, users can perform logical abduction from an expressive logic model, possibly subject to constraints. However, observations are often noisy, since they come from real-world data, and there may be different levels of belief among rules. It is thus fundamental to extend ALP by associating probabilities with observations, both to handle these scenarios and to test the reliability of the computed solutions.

Starting from the probabilistic logic language of LPADs, in this paper we introduce probabilistic abductive logic programs (PALP), i.e., probabilistic logic programs including a set of abducible facts and a (possibly empty) set of (possibly probabilistic) integrity constraints. Probabilities associated with integrity constraints can represent how strong the belief is that the constraint is true and can help in defining a more articulated probability distribution of queries. These programs define a probability distribution over abductive logic programs inspired by the distribution semantics in PLP [8]. Given a query, the goal is to maximize the joint probability distribution of the query and the constraints by selecting the minimal subsets of abducible facts to be included in the abductive logic program while ensuring that constraints are not violated.

Consider the following motivating example. Suppose you work in the city center and, starting from your home, you may choose among several alternative routes to reach your office. Streets are often congested, and you want to reach your destination with the lowest probability of encountering a car accident. You can associate different probabilities (representing beliefs, or noisy data coming from historical measurements) of encountering (or not encountering) a car accident with each of the alternative streets, and impose an integrity constraint stating that only one path (combination of streets) can be selected, since you clearly cannot travel two routes simultaneously. You then look for the combination of streets that maximizes the probability of not encountering a car accident. A possible encoding for this situation is presented in Section 6 (experiments on graph datasets).

Alternatively, suppose that you want to study in more depth a natural phenomenon that may occur in a region. The model may contain some variables describing the morphological characteristics of the land and some variables relating the possible events that can occur, such as eruptions or earthquakes. Moreover, you may want to impose that some of these events cannot be observed together (or that it is unlikely they will be). The goal may consist in finding the combination of variables (representing possible events) that best describes a possible scenario and maximizes its probability. This will be the running example used throughout the paper, starting from Example 1, where we model events possibly occurring on the island of Stromboli.

To perform inference on PALP, we extend the PITA system [9], which computes the probability of a query from an LPAD by means of Binary Decision Diagrams (BDDs). A key point of this extension is that it subsumes, as a special case, the version of PITA used to perform inference on LPADs: when both the set of abducibles and the set of constraints are empty, the program is treated as a probabilistic logic program. This has an important implication: we do not need to write an ad hoc algorithm to treat the probabilistic part of the LPAD; we just need to extend the already-existing algorithm. Furthermore, (probabilistic) integrity constraints are implemented by means of operations on BDDs, so they can be directly incorporated into the representation. The extended system has been integrated into the web application “cplint on SWISH” [10], [11], available online at https://cplint.eu/.

To test our implementation, we performed several experiments on five synthetic datasets. The results show that inference on PALP with probabilistic integrity constraints often requires a time comparable to that needed with deterministic ones. Moreover, through a series of examples, we compare inference on PALP with related tasks such as Maximum a Posteriori (MAP), Most Probable Explanation (MPE), and Viterbi proof.

The paper is structured as follows: Section 2 and Section 3 present an overview of Abductive Logic Programming and Probabilistic Logic Programming, respectively. Section 4 introduces probabilistic abductive logic programs together with some illustrative examples. Section 5 describes the inference algorithm we developed, which is tested on several datasets in Section 6. Section 7 analyzes related work, and Section 8 concludes the paper.


Abductive logic programming and well-founded semantics

Abduction is the inference strategy that copes with incompleteness in the data by guessing information that was not observed. Abductive Logic Programming [6], [7] extends Logic Programming [12] by considering some atoms, called abducibles, to be only indirectly and partially defined using a set of constraints. The reasoner may derive abductive hypotheses, i.e., sets of abducible atoms, as long as such hypotheses do not violate the given constraints. Let us now introduce more formally some…

Probabilistic logic programming

The distribution semantics [8] is becoming increasingly important in Probabilistic Logic Programming. According to this semantics, a probabilistic logic program defines a probability distribution over a set of normal logic programs (called worlds). The distribution is extended to a joint distribution over worlds and a ground query, and the probability that the query is true is obtained from this distribution by marginalization. The languages based on the distribution semantics differ in the way…
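The core of the semantics can be illustrated with a small sketch (purely didactic, not taken from the paper): a toy program with two independent probabilistic facts (with assumed probabilities) defines four worlds, and the probability of a query is the sum of the weights of the worlds that entail it.

```python
from itertools import product

# Two independent probabilistic facts with assumed probabilities.
facts = {"heads(c1)": 0.4, "heads(c2)": 0.7}

def query(world):
    # The query holds when at least one of the facts is true in the world.
    return "heads(c1)" in world or "heads(c2)" in world

def query_probability():
    p = 0.0
    # Each truth assignment to the facts yields one world.
    for bits in product([True, False], repeat=len(facts)):
        world = {f for f, b in zip(facts, bits) if b}
        weight = 1.0
        for f, b in zip(facts, bits):
            weight *= facts[f] if b else 1 - facts[f]
        if query(world):
            p += weight  # marginalization: sum the weights of entailing worlds
    return p
```

Here the query probability is 1 - 0.6 * 0.3 = 0.82. Enumerating all 2^n worlds is exponential, which is precisely why systems such as PITA compile the query to a BDD instead.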

Probabilistic abductive logic programs

To introduce the concept of probabilistic abductive logic programs, consider again Example 1. Suppose we want to maximize the probability of the query eruption. However, we do not know whether there was a fault rupture in the southwest-northeast or east-west direction. Furthermore, suppose that the fault may rupture along only one of the two directions at a time. In the following, we formally introduce this problem.
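As a rough, purely illustrative sketch of this task (not the BDD-based PITA algorithm; the probabilities and predicate names below are assumed for the sake of the example), one can enumerate subsets of abducibles, discard those violating the constraint, and keep the subset that maximizes the query probability:

```python
from itertools import combinations, product

# Hypothetical instance of the running example: the two fault directions
# are abducible, and each rupture causes an eruption with an assumed probability.
abducibles = ["sw_ne", "e_w"]
rule_prob = {"sw_ne": 0.6, "e_w": 0.3}   # P(eruption caused by that rupture)

def constraint_ok(chosen):
    # Integrity constraint: the fault may rupture along at most one direction.
    return len(chosen) <= 1

def eruption_prob(chosen):
    # Marginalize over the worlds obtained by switching each rule on or off.
    p = 0.0
    for switches in product([True, False], repeat=len(chosen)):
        weight = 1.0
        for d, on in zip(chosen, switches):
            weight *= rule_prob[d] if on else 1 - rule_prob[d]
        if any(switches):        # eruption holds if some active rule fires
            p += weight
    return p

def best_explanation():
    best, best_p = set(), 0.0
    # Iterating by increasing size makes minimal subsets win on ties.
    for r in range(len(abducibles) + 1):
        for chosen in combinations(abducibles, r):
            if constraint_ok(chosen) and eruption_prob(chosen) > best_p:
                best, best_p = set(chosen), eruption_prob(chosen)
    return best, best_p
```

With the constraint in place the best explanation is the single southwest-northeast rupture (probability 0.6); dropping the constraint, both ruptures together would score 0.72, which shows how constraints prune candidate explanations.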

Definition 5 Probabilistic Integrity Constraint

A probabilistic integrity constraint is an integrity constraint with…

Algorithm

In PLP, the probability of the query is computed by building a BDD and by applying a dynamic programming algorithm that traverses it, such as the one presented in [26] and reported in Algorithm 1 for the sake of clarity. var(node) represents the variable associated with the BDD node node and comp is a flag that indicates whether a node pointer is complemented or not. Intermediate results are stored in a table to avoid the execution of the same computation in case the algorithm encounters an…
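The scheme described above can be sketched as follows (a hand-rolled illustration, not PITA's actual implementation, which relies on a BDD package): each internal node carries the probability of its variable, each edge is a pair (child, complemented?), and a table memoizes per-node results so shared subdiagrams are visited once.

```python
class Node:
    def __init__(self, prob, high, low):
        self.prob = prob   # probability that this node's variable is true
        self.high = high   # (child, complemented?) edge followed when var = 1
        self.low = low     # (child, complemented?) edge followed when var = 0

ONE = object()             # single terminal; the 0-leaf is the edge (ONE, True)

def prob(node, comp=False, table=None):
    """Probability of the Boolean function rooted at the edge (node, comp)."""
    if table is None:
        table = {}
    if node is ONE:
        return 0.0 if comp else 1.0
    if id(node) not in table:
        h, h_comp = node.high
        l, l_comp = node.low
        table[id(node)] = (node.prob * prob(h, h_comp, table)
                           + (1 - node.prob) * prob(l, l_comp, table))
    p = table[id(node)]
    # A complemented pointer denotes the negated function.
    return 1.0 - p if comp else p

# Example: BDD of x1 OR x2, with P(x1) = 0.4 and P(x2) = 0.7.
n2 = Node(0.7, (ONE, False), (ONE, True))   # x2 ? 1 : 0
n1 = Node(0.4, (ONE, False), (n2, False))   # x1 ? 1 : x2
```

Here prob(n1) returns 0.4 + 0.6 * 0.7 = 0.82, matching the value obtained by enumerating worlds, while the memo table keeps the traversal linear in the size of the diagram.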

Experiments

We conducted some experiments to analyze the execution time of the proposed algorithm. We executed them on a cluster with Intel® Xeon® E5-2630v3 CPUs running at 2.40 GHz, on five synthetic datasets taken from [28]: growing head (gh), growing negated body (gnb), blood, probabilistic graph (graph), and probabilistic complete graph (complete graph). As stated in Section…

Related work

Traditionally defined as inference to the best explanation, abduction embeds the implicit assumption that many possible explanations exist and raises the issue about which one should be selected. Adopting a purely logical setting, one may leverage the candidate explanations' complexity, preferring minimal ones. Still, different minimal but incomparable explanations are possible (there is no total ordering on them). Intuitively, one might want to select candidate explanations based on their…

Conclusions

In this paper, we extended the PITA system to perform abductive reasoning on probabilistic abductive logic programs: given a probabilistic logic program, a set of abducible facts, and a set of (possibly probabilistic) integrity constraints, we want to compute minimal sets of abducible facts (the probabilistic abductive explanations) such that the joint probability of the query and the constraints is maximized. The algorithm is based on Binary Decision Diagrams and was tested on several…

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was partly supported by the “National Group of Computing Science (GNCS-INDAM)”.

References (66)

  • A.C. Kakas et al., Abductive logic programming
  • A.C. Kakas et al., Database updates through abduction
  • T. Sato, A statistical learning method for logic programs with distribution semantics
  • F. Riguzzi et al., Tabling and answer subsumption for reasoning on logic programs with annotated disjunctions
  • F. Riguzzi et al., Probabilistic logic programming on the web, Softw. Pract. Exp. (2016)
  • M. Alberti et al., cplint on SWISH: probabilistic logical inference with a web browser, Intell. Artif. (2017)
  • J.W. Lloyd, Foundations of Logic Programming (1987)
  • A. Van Gelder et al., The well-founded semantics for general logic programs, J. ACM (1991)
  • T.C. Przymusinski, Every logic program has a natural stratification and an iterated least fixed point model
  • A. Van Gelder et al., Unfounded sets and well-founded semantics for general logic programs
  • J. Vennekens et al., Logic programs with annotated disjunctions
  • R. Zese et al., Tableau reasoning for description logics and its extension to probabilities, Ann. Math. Artif. Intell. (2018)
  • E. Bellodi et al., Structure learning of probabilistic logic programs by searching the clause space, Theory Pract. Log. Program. (2015)
  • F. Riguzzi et al., Applying the information bottleneck to statistical relational learning, Mach. Learn. (2012)
  • F. Riguzzi et al., Well-definedness and efficient inference for probabilistic logic programming under the distribution semantics, Theory Pract. Log. Program. (2013)
  • L.G. Valiant, The complexity of enumeration and reliability problems, SIAM J. Comput. (1979)
  • A. Darwiche et al., A knowledge compilation map, J. Artif. Intell. Res. (2002)
  • A. Thayse et al., Optimization of multivalued decision algorithms
  • L. De Raedt et al., ProbLog: a probabilistic Prolog and its application in link discovery
  • T. Sang et al., Performing Bayesian inference by weighted model counting
  • E. Bellodi et al., MAP inference for probabilistic logic programming, Theory Pract. Log. Program. (2020)
  • D. Poole, Logic programming, abduction and probability: a top-down anytime algorithm for estimating prior and posterior probabilities, New Gener. Comput. (1993)
  • T. Sato et al., A Viterbi-like algorithm and EM learning for statistical abduction
