Abduction with probabilistic logic programming under the distribution semantics
Introduction
Probabilistic Logic Programming (PLP) [1], [2] has recently attracted a lot of interest thanks to its ability to represent several scenarios [3], [4] with a simple yet powerful language. Furthermore, the possibility of integrating it with sub-symbolic systems makes it a relevant component of explainable probabilistic models [5].
Abductive Logic Programming (ALP) [6], [7] is an extension of Logic Programming that can manage incompleteness in the data. The goal of abduction is to find, given a set of hypotheses called abducibles, a subset of these that explains an observed fact. With ALP, users can perform logical abduction from an expressive logic model, possibly subject to constraints. However, observations are often noisy, since they come from real-world data, and rules may be held with different levels of belief. It is thus fundamental to extend ALP by associating probabilities with observations, both to handle these scenarios and to test the reliability of the computed solutions.
Starting from the probabilistic logic language of LPADs, in this paper we introduce probabilistic abductive logic programs (PALP), i.e., probabilistic logic programs including a set of abducible facts and a (possibly empty) set of (possibly probabilistic) integrity constraints. Probabilities associated with integrity constraints can represent how strong the belief is that the constraint is true and can help in defining a more articulated probability distribution of queries. These programs define a probability distribution over abductive logic programs inspired by the distribution semantics in PLP [8]. Given a query, the goal is to maximize the joint probability distribution of the query and the constraints by selecting the minimal subsets of abducible facts to be included in the abductive logic program while ensuring that constraints are not violated.
Consider the following motivating example: suppose you work in the city center and can choose among several alternative routes from your home to your office. Streets are often congested, and you want to avoid traffic and reach your destination with the lowest probability of encountering a car accident. You can associate with each alternative street a probability (representing a belief, or noisy data from historical measurements) of encountering (or not encountering) a car accident, and impose an integrity constraint stating that only one path (combination of streets) can be selected (clearly, you cannot travel two routes simultaneously). Then, you look for the combination of streets that maximizes the probability of not encountering a car accident. A possible encoding of this situation is presented in Section 6 (experiments on graph datasets). Alternatively, suppose that you want to study in more depth a natural phenomenon that may occur in a region. The model may contain some variables that describe the morphological characteristics of the land and some variables that relate to the possible events that can occur, such as an eruption or an earthquake. Moreover, you want to impose that some of these cannot be observed together (or that it is unlikely that they will be). The goal may be to find the combination of variables (representing possible events) that best describes a possible scenario and maximizes its probability. This is the running example we use throughout the paper, starting from Example 1, where we model events possibly occurring on the island of Stromboli.
To perform inference on PALP, we extend the PITA system [9], which computes the probability of a query from an LPAD by means of Binary Decision Diagrams (BDDs). A key point of this extension is that it has the version of PITA used to perform inference on LPADs as a special case: when both the set of abducibles and the set of constraints are empty, the program is treated as a probabilistic logic program. This has an important implication: we do not need to write an ad hoc algorithm to treat the probabilistic part of the LPAD; we only need to extend the existing algorithm. Furthermore, (probabilistic) integrity constraints are implemented by means of operations on BDDs, so they can be directly incorporated in the representation. The extended system has been integrated into the web application “cplint on SWISH” [10], [11], available online at https://cplint.eu/.
To test our implementation, we performed several experiments on five synthetic datasets. The results show that PALPs with probabilistic integrity constraints and PALPs with deterministic integrity constraints often require comparable inference time. Moreover, through a series of examples, we compare inference on PALP with other related tasks, such as Maximum a Posteriori (MAP), Most Probable Explanation (MPE), and Viterbi proof.
The paper is structured as follows: Section 2 and Section 3 present, respectively, an overview of Abductive and Probabilistic Logic Programming. Section 4 introduces probabilistic abductive logic programs and some illustrative examples. Section 5 describes the inference algorithm we developed, which is tested on several datasets in Section 6. Section 7 provides an analysis of related work, and Section 8 concludes the paper.
Abductive logic programming and well-founded semantics
Abduction is the inference strategy that copes with incompleteness in the data by guessing information that was not observed. Abductive Logic Programming [6], [7] extends Logic Programming [12] by considering some atoms, called abducibles, to be only indirectly and partially defined using a set of constraints. The reasoner may derive abductive hypotheses, i.e., sets of abducible atoms, as long as such hypotheses do not violate the given constraints. Let us now introduce more formally some
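The abductive task described above can be illustrated with a minimal Python sketch. It assumes a propositional program with only positive rule bodies and integrity constraints written as denials; the atoms `rain`, `sprinkler`, and `wet_grass`, and the helper names, are hypothetical and are not taken from the paper. The sketch enumerates candidate sets of abducibles by brute force and keeps the subset-minimal ones that entail the observation without violating a constraint.

```python
from itertools import combinations

# Hypothetical propositional program: (head, body) with positive bodies only.
RULES = [
    ("wet_grass", ["rain"]),
    ("wet_grass", ["sprinkler"]),
]
ABDUCIBLES = ["rain", "sprinkler"]
# Integrity constraints as denials: the atoms in each list must not all hold.
CONSTRAINTS = [["rain", "sprinkler"]]


def consequences(facts):
    """Least model of RULES plus the given facts, by forward chaining."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


def minimal_explanations(observation):
    """Subset-minimal sets of abducibles that entail the observation
    and do not violate any integrity constraint."""
    found = []
    # Visiting candidates by increasing size guarantees minimality.
    for size in range(len(ABDUCIBLES) + 1):
        for candidate in combinations(ABDUCIBLES, size):
            delta = set(candidate)
            if any(e <= delta for e in found):
                continue  # a smaller explanation is already contained in delta
            model = consequences(delta)
            violated = any(all(a in model for a in ic) for ic in CONSTRAINTS)
            if observation in model and not violated:
                found.append(delta)
    return found


print(minimal_explanations("wet_grass"))
```

Under these assumptions the observation `wet_grass` has two incomparable minimal explanations, one per abducible, which is the situation that motivates ranking explanations by probability.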
Probabilistic logic programming
The distribution semantics [8] is becoming increasingly important in Probabilistic Logic Programming. According to this semantics, a probabilistic logic program defines a probability distribution over a set of normal logic programs (called worlds). The distribution is extended to a joint distribution over worlds and a ground query, and the probability that the query is true is obtained from this distribution by marginalization. The languages based on the distribution semantics differ in the way
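As a rough illustration of marginalization over worlds, the following Python sketch enumerates all worlds of a tiny program and sums the probabilities of those in which the query is true. It is a simplification made for this example: it uses independent probabilistic facts rather than full LPAD clauses, and the atoms `a`, `b`, `q` and their probabilities are hypothetical.

```python
from itertools import product

# Hypothetical independent probabilistic facts: atom -> probability of being true.
PROB_FACTS = {"a": 0.3, "b": 0.6}
# Deterministic rules: q <- a.  q <- b.
RULES = [("q", ["a"]), ("q", ["b"])]


def holds(query, true_facts):
    """True if the query is in the least model of RULES plus the chosen facts."""
    model = set(true_facts)
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return query in model


def query_probability(query):
    """Sum the probabilities of the worlds in which the query is true."""
    atoms = list(PROB_FACTS)
    total = 0.0
    for choices in product([True, False], repeat=len(atoms)):
        weight = 1.0
        world = set()
        for atom, chosen in zip(atoms, choices):
            p = PROB_FACTS[atom]
            weight *= p if chosen else 1.0 - p
            if chosen:
                world.add(atom)
        if holds(query, world):
            total += weight
    return total


print(query_probability("q"))
```

With these numbers the result agrees with the closed form 1 - (1 - 0.3)(1 - 0.6) = 0.72. Explicit enumeration is exponential in the number of probabilistic facts, which is why practical systems compile the query into a compact representation instead.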
Probabilistic abductive logic programs
To introduce the concept of probabilistic abductive logic programs, consider again Example 1. Suppose we want to maximize the probability of the query eruption. However, we do not know whether there was a fault rupture in the southwest-northeast or in the east-west direction. Furthermore, suppose that a fault rupture can occur along only one of the two directions at a time. In the following, we formally introduce this problem.
Definition 5 (Probabilistic Integrity Constraint). A probabilistic integrity constraint is an integrity constraint with
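The problem just described can be sketched by brute force in Python. Everything numeric here is an assumption made for illustration, since Example 1 is not reproduced in this excerpt: `P_ENERGY`, the `HEAD_CHOICES` distribution, and the reading of a probabilistic constraint as being in force with probability `ic_prob` are all hypothetical, and the enumeration is not the BDD-based algorithm of the paper. The sketch scores every subset of the two abducible fault directions by the joint probability of the query and the constraint, then returns the subset-minimal maximizers.

```python
from itertools import combinations, product

# All numbers below are illustrative assumptions, not values from the paper.
P_ENERGY = 0.7  # assumed P(sudden_energy_release)
# Assumed head distribution for each fault that is present.
HEAD_CHOICES = {"eruption": 0.6, "earthquake": 0.3, "none": 0.1}
FAULTS = ["southwest_northeast", "east_west"]  # abducibles fault_rupture(X)


def joint_probability(delta, ic_prob=1.0):
    """P(eruption and constraint satisfied) when the faults in delta are abduced.

    ic_prob is the assumed probability that the constraint forbidding both
    fault directions together is in force (1.0 means deterministic).
    """
    delta = list(delta)
    p_query = 0.0
    # One independent head choice per abduced fault.
    for heads in product(HEAD_CHOICES, repeat=len(delta)):
        weight = P_ENERGY
        for h in heads:
            weight *= HEAD_CHOICES[h]
        if "eruption" in heads:
            p_query += weight
    violated = set(FAULTS) <= set(delta)  # both directions abduced
    return p_query * (1.0 - ic_prob) if violated else p_query


def best_explanations(ic_prob=1.0):
    """Subset-minimal sets of abducibles that maximize the joint probability."""
    scored = []
    for size in range(len(FAULTS) + 1):
        for delta in combinations(FAULTS, size):
            scored.append((set(delta), joint_probability(delta, ic_prob)))
    best = max(p for _, p in scored)
    top = [d for d, p in scored if abs(p - best) < 1e-12]
    minimal = [d for d in top if not any(other < d for other in top)]
    return minimal, best


print(best_explanations(ic_prob=1.0))
print(best_explanations(ic_prob=0.2))
```

Under these assumptions, a deterministic constraint leaves two equally good explanations, one per fault direction, whereas a weakly believed constraint can make abducing both directions preferable.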
Algorithm
In PLP, the probability of the query is computed by building a BDD and by applying a dynamic programming algorithm that traverses it, such as the one presented in [26] and reported in Algorithm 1 for the sake of clarity. represents the variable associated with the BDD node node and comp is a flag that indicates whether a node pointer is complemented or not. Intermediate results are stored in a table to avoid the execution of the same computation in case the algorithm encounters an
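The traversal described above can be sketched in Python as follows. This is written in the spirit of the description and is not a reproduction of Algorithm 1: the `Node` class, the `low_comp` flag, and the convention that only else-edges and the root pointer may be complemented (with a single terminal node for 1) are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(eq=False)  # identity-based hashing, so each node is a distinct key
class Node:
    var: Optional[str]              # None marks the terminal 1 node
    high: Optional["Node"] = None   # child taken when var is true
    low: Optional["Node"] = None    # child taken when var is false
    low_comp: bool = False          # True if the else-edge is complemented


ONE = Node(var=None)


def prob(node: Node, pi: Dict[str, float], table: Dict[int, float]) -> float:
    """Probability of the function rooted at node, ignoring root complementation."""
    if node.var is None:
        return 1.0
    key = id(node)
    if key in table:  # reuse the result for an already visited node
        return table[key]
    p_high = prob(node.high, pi, table)
    p_low = prob(node.low, pi, table)
    if node.low_comp:
        p_low = 1.0 - p_low
    result = pi[node.var] * p_high + (1.0 - pi[node.var]) * p_low
    table[key] = result
    return result


def bdd_probability(root: Node, root_comp: bool, pi: Dict[str, float]) -> float:
    """Probability of the BDD, taking a complemented root pointer into account."""
    p = prob(root, pi, {})
    return 1.0 - p if root_comp else p


# BDD for (a or b): the 0 terminal is encoded as a complemented edge to ONE.
node_b = Node("b", high=ONE, low=ONE, low_comp=True)
node_a = Node("a", high=ONE, low=node_b)
print(bdd_probability(node_a, False, {"a": 0.3, "b": 0.6}))
```

Because results are cached per node, each node is evaluated once, so the cost is linear in the number of BDD nodes rather than in the number of paths.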
Experiments
We conducted some experiments to analyze the execution time of the proposed algorithm. We executed them on a cluster with Intel® Xeon® E5-2630v3 running at 2.40 GHz on five synthetic datasets taken from [28]: growing head (gh), growing negated body (gnb), blood, probabilistic graph (graph) and probabilistic complete graph (complete graph). As stated in Section
Related work
Traditionally defined as inference to the best explanation, abduction embeds the implicit assumption that many possible explanations exist and raises the issue about which one should be selected. Adopting a purely logical setting, one may leverage the candidate explanations' complexity, preferring minimal ones. Still, different minimal but incomparable explanations are possible (there is no total ordering on them). Intuitively, one might want to select candidate explanations based on their
Conclusions
In this paper, we extended the PITA system to perform abductive reasoning on probabilistic abductive logic programs: given a probabilistic logic program, a set of abducible facts, and a set of (possibly probabilistic) integrity constraints, we want to compute minimal sets of abductive explanations (the probabilistic abductive explanation) such that the joint probability of the query and the constraints is maximized. The algorithm is based on Binary Decision Diagrams and was tested on several
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This research was partly supported by the “National Group of Computing Science (GNCS-INDAM)”.
References (66)
Abducing through negation as failure: stable models within the independent choice logic. J. Log. Program. (2000)
The distribution semantics for normal programs with function symbols. Int. J. Approx. Reason. (2016)
Probabilistic logic. Artif. Intell. (1986)
Probabilistic Horn abduction and Bayesian networks. Artif. Intell. (1993)
The IFF proof procedure for abductive logic programming. J. Log. Program. (1997)
Foundations of Probabilistic Logic Programming: Languages, Semantics, Inference and Learning (2018)
Studying transaction fees in the Bitcoin blockchain with probabilistic logic programming. Information (2019)
Probabilistic logic programming in action
DeepProbLog: neural probabilistic logic programming
Abductive logic programming
Database updates through abduction
A statistical learning method for logic programs with distribution semantics
Tabling and answer subsumption for reasoning on logic programs with annotated disjunctions
Probabilistic logic programming on the web. Softw. Pract. Exp.
cplint on SWISH: probabilistic logical inference with a web browser. Intell. Artif.
Foundations of Logic Programming
The well-founded semantics for general logic programs. J. ACM
Every logic program has a natural stratification and an iterated least fixed point model
Unfounded sets and well-founded semantics for general logic programs
Logic programs with annotated disjunctions
Tableau reasoning for description logics and its extension to probabilities. Ann. Math. Artif. Intell.
Structure learning of probabilistic logic programs by searching the clause space. Theory Pract. Log. Program.
Applying the information bottleneck to statistical relational learning. Mach. Learn.
Well-definedness and efficient inference for probabilistic logic programming under the distribution semantics. Theory Pract. Log. Program.
The complexity of enumeration and reliability problems. SIAM J. Comput.
A knowledge compilation map. J. Artif. Intell. Res.
Optimization of multivalued decision algorithms
ProbLog: a probabilistic Prolog and its application in link discovery
Performing Bayesian inference by weighted model counting
MAP inference for probabilistic logic programming. Theory Pract. Log. Program.
Logic programming, abduction and probability - a top-down anytime algorithm for estimating prior and posterior probabilities. New Gener. Comput.
A Viterbi-like algorithm and EM learning for statistical abduction