Logic Programming, Abduction and Probability:
a top-down anytime algorithm for
estimating prior and posterior probabilities

David Poole∗
Department of Computer Science,
University of British Columbia,
Vancouver, B.C., Canada V6T 1Z2
poole@cs.ubc.ca
March 1993
Abstract

Probabilistic Horn abduction is a simple framework to combine probabilistic and logical reasoning into a coherent practical framework. The numbers can be consistently interpreted probabilistically, and all of the rules can be interpreted logically. The relationship between probabilistic Horn abduction and logic programming is at two levels. At the first level probabilistic Horn abduction is an extension of pure Prolog that is useful for diagnosis and other evidential reasoning tasks. At another level, current logic programming implementation techniques can be used to efficiently implement probabilistic Horn abduction. This forms the basis of an "anytime" algorithm for estimating arbitrary conditional probabilities. The focus of this paper is on the implementation.
∗Scholar, Canadian Institute for Advanced Research
1 Introduction

Probabilistic Horn abduction [22, 21, 23] is a framework for logic-based abduction that incorporates probabilities with assumptions. It is being used as a framework for diagnosis [22] that incorporates both pure Prolog and discrete Bayesian Networks [15] as special cases [21]. This paper is about the relationship of probabilistic Horn abduction to logic programming. This simple extension to logic programming provides a wealth of new applications in diagnosis, recognition and evidential reasoning [23].

This paper also presents a logic-programming solution to the problem in abduction of searching for the "best" diagnoses first. The main features of the approach are:

• We are using Horn clause abduction. The procedures are simple, both conceptually and computationally (for a certain class of problems). We develop a simple extension of SLD resolution to implement our framework.

• The search algorithms form "anytime" algorithms that can give an estimate of the conditional probability at any time. We do not generate the unlikely explanations unless we need to. We have a bound on the probability mass of the remaining explanations which allows us to know the error in our estimates.

• A theory of "partial explanations" is developed. These are partial proofs that can be stored in a priority queue until they need to be further expanded. We show how this is implemented in a Prolog interpreter in Appendix A.
2 Probabilistic Horn abduction

The formulation of abduction used is a simplified form of Theorist [25, 19] with probabilities associated with the hypotheses. It is simplified in being restricted to definite clauses with simple forms of integrity constraints (similar to that of Goebel et al. [10]). This can also be seen as a generalisation of an ATMS [27] to be non-propositional.
The language is that of pure Prolog (i.e., definite clauses) with special disjoint declarations that specify a set of disjoint hypotheses with associated probabilities. There are some restrictions on the forms of the rules and the probabilistic dependence allowed. The language presented here is that of [23] rather than that of [22, 21].

The main design considerations were to make a language the simplest extension to pure Prolog that also included probabilities (not just numbers associated with rules, but numbers that follow the laws of probability, and so can be consistently interpreted as probabilities [23]). We are also assuming very strong independence assumptions; this is not intended as a temporary restriction on the language that we want to eventually remove, but is a feature. We can represent any probabilistic information using only independent hypotheses [23]; if there is any dependence amongst hypotheses, we invent a new hypothesis to explain that dependency.

2.1 The language

Our language uses the Prolog conventions, and has the same definitions of variables, terms and atomic symbols.

Definition 2.1 A definite clause (or just "clause") is of the form a. or a ← a_1 ∧ ... ∧ a_n where a and each a_i are atomic symbols.

Definition 2.2 A disjoint declaration is of the form

disjoint([h_1 : p_1, ..., h_n : p_n]).

where the h_i are atoms, and the p_i are real numbers, 0 < p_i ≤ 1, such that p_1 + ... + p_n = 1. Any variable appearing in one h_i must appear in all of the h_j (i.e., the h_i share the same variables). The h_i will be referred to as hypotheses.

Any ground instance of a hypothesis in one disjoint declaration cannot be an instance of another hypothesis in any of the disjoint declarations (either the same declaration or a different declaration), nor can it be an instance of the head of any clause. This restricts the language so that we cannot encode arbitrary clauses as disjoint declarations. Each instance of each disjoint declaration is independent of all of the other instances of disjoint declarations and the clauses.
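Since each disjoint declaration behaves like a discrete random variable over its hypotheses, its probabilities must sum to 1. As an illustrative sketch of this well-formedness check, in Python rather than the paper's Prolog (the function name is ours):

```python
def check_disjoint(decl):
    """decl maps each hypothesis (a ground atom, as a string here) to its
    prior probability.  A disjoint declaration is well formed when each
    p_i lies in (0, 1] and the p_i sum to 1."""
    ps = list(decl.values())
    return all(0 < p <= 1 for p in ps) and abs(sum(ps) - 1.0) < 1e-9

# The two declarations used in the example of Section 2.4:
assert check_disjoint({"b": 0.3, "c": 0.7})
assert check_disjoint({"e": 0.6, "f": 0.3, "g": 0.1})
assert not check_disjoint({"b": 0.3, "c": 0.6})  # must sum to 1
```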
Definition 2.3 A probabilistic Horn abduction theory (which will be referred to as a "theory") is a collection of definite clauses and disjoint declarations.

Given theory T, we define:

F_T, the facts, is the set of definite clauses in T together with the clauses of the form

false ← h_i ∧ h_j

where h_i and h_j both appear in the same disjoint declaration in T, and i ≠ j. Let F'_T be the set of ground instances of elements of F_T.

H_T, the hypotheses, is the set of h_i such that h_i appears in a disjoint declaration in T. Let H'_T be the set of ground instances of elements of H_T.

P_T is a function H'_T → [0, 1]. P_T(h'_i) = p_i where h'_i is a ground instance of hypothesis h_i, and h_i : p_i is in a disjoint declaration in T.

Where T is understood from context, we omit the subscript.

Definition 2.4 [25, 18] If g is a closed formula, an explanation of g from ⟨F, H⟩ is a set D of elements of H' such that

• F ∪ D ⊨ g and
• F ∪ D ⊭ false.

The first condition says that D is a sufficient cause for g, and the second says that D is possible.

Definition 2.5 A minimal explanation of g is an explanation of g such that no strict subset is an explanation of g.
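The strict-subset condition of Definition 2.5 is easy to state operationally. A sketch in Python (the helper name is ours), treating each explanation as a set of ground hypotheses:

```python
def minimal_explanations(explanations):
    """Keep only the explanations that have no strict subset in the
    collection; `E < D` on Python sets is the strict-subset test."""
    return [D for D in explanations if not any(E < D for E in explanations)]

# {b} explains g, and so does its superset {b, e}; only {b} is minimal.
expls = [frozenset({"b"}), frozenset({"b", "e"})]
assert minimal_explanations(expls) == [frozenset({"b"})]
```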
2.2 Assumptions about the rule base

Probabilistic Horn abduction also contains some assumptions about the rule base. It can be argued that these assumptions are natural, and do not really restrict what can be represented [23]. Here we list these assumptions, and use them in order to show how the algorithms work.

The first assumption we make is about the relationship between hypotheses and rules:

Assumption 2.6 There are no rules with head unifying with a member of H.

Instead of having a rule implying a hypothesis, we can invent a new atom, make the hypothesis imply this atom, make all of the rules imply this atom, and use this atom instead of the hypothesis.

Assumption 2.7 (acyclicity) If F' is the set of ground instances of elements of F, then it is possible to assign a natural number to every ground atom such that for every rule in F' the atoms in the body of the rule are strictly less than the atom in the head.

This assumption is discussed by Apt and Bezem [1].

Assumption 2.8 The rules in F' for a ground non-assumable atom are covering.

That is, if the rules for a in F' are

a ← B_1
a ← B_2
...
a ← B_m

then if a is true, one of the B_i is true. Thus Clark's completion [4] is valid for every non-assumable. Often we get around this assumption by adding a rule

a ← some other reason for a

and making "some other reason for a" a hypothesis [23].
Assumption 2.9 The bodies of the rules in F' for an atom are mutually exclusive.

Given the above rules for a, this means that for each i ≠ j, B_i and B_j cannot both be true in the domain under consideration¹. We can make this true by adding extra conditions to the rules to make sure they are disjoint.

Associated with each possible hypothesis is a prior probability. We use this prior probability to compute arbitrary probabilities. Our final assumption is that logical dependencies impose the only statistical dependencies on the hypotheses. In particular we assume:

Assumption 2.10 Ground instances of hypotheses that are not inconsistent (with F_T) are probabilistically independent.

The only ground instances of hypotheses that are inconsistent with F_T are those of the form h_iθ and h_jθ, where h_i and h_j are different elements of the same disjoint declaration. Thus different disjoint declarations define independent hypotheses. Different instances of hypotheses are also independent. Note that the hypotheses in a minimal explanation are always logically independent. The language has been carefully set up so that the logic does not force any dependencies amongst the hypotheses. If we could prove that some hypotheses implied other hypotheses or their negations, the hypotheses could not be independent. The language is deliberately designed to be too weak to be able to state such logical dependencies between hypotheses.

See [23] for more justification of these assumptions.
2.3 Some Consequents of the Assumptions

Lemma 2.11 Under assumption 2.10, if there is at least one hypothesis with a free variable, then distinct ground terms denote different individuals.

Proof: If t_1 and t_2 are distinct ground terms, then t_1 ≠ t_2. Otherwise, suppose t_1 = t_2. If h(X) is a hypothesis, then h(t_1) cannot be independent of h(t_2), as they are logically equivalent, which violates assumption 2.10. □

¹We do not insist that we are able to prove ¬(B_i ∧ B_j). Typically we cannot. This is a semantic restriction. There is, however, one syntactic check we can make: if there is a set of independent (see assumption 2.10) hypotheses from which both B_i and B_j can be derived, then the assumption is violated.
The following lemma can be easily proved:

Lemma 2.12 Under assumptions 2.6 and 2.9, minimal explanations of ground atoms or conjunctions of ground atoms are mutually inconsistent.

Lemma 2.13 A minimal explanation of a ground atom cannot contain any free variables.

Proof: If a minimal explanation of a ground atom contains free variables, there will be infinitely many ground explanations of the atom (substituting different ground terms for the free variables). Each of these ground explanations will have the same probability. Thus this probability must be zero (as we can sum the probabilities of disjoint formulae, and this sum must be less than or equal to one). This contradicts the fact that the explanations are minimal, and that the hypotheses are independent and each has positive prior. □

Lemma 2.14 [5, 19] Under assumptions 2.6, 2.7 and 2.8, if expl(g, T) is the set of minimal explanations of ground g from theory T, and comp(T) is Clark's completion of the non-assumables (and includes Clark's equational theory²), then

comp(T) ⊨ ( g ≡ ∨_{e_i ∈ expl(g,T)} e_i )

The following is a corollary of lemmata 2.14 and 2.12:

Lemma 2.15 Under assumptions 2.6, 2.7, 2.8, 2.9 and 2.10, if expl(g, T) is the set of minimal explanations of the conjunction of atoms g from probabilistic Horn abduction theory T:

P(g) = P( ∨_{e_i ∈ expl(g,T)} e_i )
     = Σ_{e_i ∈ expl(g,T)} P(e_i)

²For the case we have here, Console et al.'s results [7] can be extended to acyclic theories (the set of ground instances of an acyclic theory is a hierarchical theory, and we really need only consider the ground instances).
Thus to compute the prior probability of any g we sum the probabilities of the explanations of g. Poole [23] proves this directly using a semantics that incorporates the above assumptions.

To compute arbitrary conditional probabilities, we use the definition of conditional probability:

P(α|β) = P(α ∧ β) / P(β)

Thus to find arbitrary conditional probabilities P(α|β), we find P(β), which is the sum of the probabilities of the explanations of β, and P(α ∧ β), which can be found by explaining α from the explanations of β. Thus arbitrary conditional probabilities can be computed by summing the prior probabilities of explanations.

It remains only to compute the prior probability of an explanation D of g. Under assumption 2.10, if {h_1, ..., h_n} are part of a minimal explanation, then

P(h_1 ∧ ... ∧ h_n) = ∏_{i=1}^{n} P(h_i)

To compute the prior of the minimal explanation we multiply the priors of the hypotheses. The posterior probability of the explanation is proportional to this.
2.4 An example

In this section we show an example that we use later in the paper. It is intended to be as simple as possible to show how the algorithm works. Suppose we have the rules and hypotheses³:
rule((a :- b, h)).
rule((a :- q, e)).
rule((q :- h)).
rule((q :- b, e)).
rule((h :- b, f)).
rule((h :- c, e)).
rule((h :- g, b)).
disjoint([b:0.3, c:0.7]).
disjoint([e:0.6, f:0.3, g:0.1]).

³Here we have used the syntax rule((H :- B)) to represent the rule H ← B. This is the syntax that is accepted by the code in Appendix A.
There are four minimal explanations of a, namely {c, e}, {b, e}, {f, b} and {g, b}.

The priors of the explanations are as follows:

P(c ∧ e) = 0.7 × 0.6 = 0.42

Similarly P(b ∧ e) = 0.18, P(f ∧ b) = 0.09 and P(g ∧ b) = 0.03. Thus

P(a) = 0.42 + 0.18 + 0.09 + 0.03 = 0.72

There are two explanations of e ∧ a, namely {c, e} and {b, e}. Thus P(e ∧ a) = 0.60, and the conditional probability of e given a is P(e|a) = 0.6/0.72 = 0.833.

What is important about this example is that all of the probabilistic calculations reduce to finding the probabilities of explanations.
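The arithmetic above can be reproduced mechanically from the four minimal explanations. A sketch in Python (the helper names are ours, not the paper's):

```python
from math import prod

priors = {"b": 0.3, "c": 0.7, "e": 0.6, "f": 0.3, "g": 0.1}

def prior(expl):
    # Prior of an explanation: the product of its hypotheses' priors
    # (assumption 2.10 makes the hypotheses independent).
    return prod(priors[h] for h in expl)

expls_a = [{"c", "e"}, {"b", "e"}, {"f", "b"}, {"g", "b"}]
p_a = sum(prior(D) for D in expls_a)    # 0.42 + 0.18 + 0.09 + 0.03 = 0.72

# The explanations of e ∧ a are the explanations of a that contain e.
p_ea = sum(prior(D) for D in expls_a if "e" in D)   # 0.42 + 0.18 = 0.60

p_e_given_a = p_ea / p_a                # 0.60 / 0.72 ≈ 0.833
assert abs(p_a - 0.72) < 1e-9
```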
2.5 Tasks

The following tasks are what we expect to implement:

1. Generate the explanations of some goal (conjunction of atoms), in order.

2. Estimate the prior probability of some goal. This is implemented by enumerating some of the explanations of the goal.

3. Estimate the posterior probabilities of the explanations of a goal (i.e., the probabilities of the explanations given the goal).

4. Estimate the conditional probability of one formula given another; that is, determine P(α|β) for any α and β.

Each of these will be computed using the preceding task. The last task is the essential task that we need in order to make decisions under uncertainty.

All of these will be implemented by enumerating the explanations of a goal, and estimating the probability mass in the explanations that have not been enumerated. It is this problem that we consider for the next few sections, and then return to the problem of the tasks we want to compute.
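The anytime idea can be stated numerically: the enumerated explanations give a lower bound on the prior, and a bound on the mass of the not-yet-enumerated explanations gives the error. A small sketch (Python; the function name and the numbers in the usage example are our own illustration, using the explanations of a from Section 2.4):

```python
def prior_bounds(enumerated_priors, remaining_mass_bound):
    """Bracket P(g): minimal explanations are mutually inconsistent
    (Lemma 2.12), so the enumerated priors sum to a lower bound; the
    not-yet-generated explanations contribute at most
    remaining_mass_bound more."""
    low = sum(enumerated_priors)
    return low, low + remaining_mass_bound

# After enumerating {c, e} and {b, e}, with (hypothetically) at most
# 0.3 of probability mass left in unexplored partial explanations:
low, high = prior_bounds([0.42, 0.18], 0.3)
assert abs(low - 0.6) < 1e-9 and abs(high - 0.9) < 1e-9
```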
3 A top-down proof procedure

In this section we show how to carry out a best-first search of the explanations. In order to do this we build a notion of a partial proof that we can add to a priority queue, and restart when necessary.

3.1 SLD-BF resolution

In this section we outline an implementation based on logic programming technology and a branch-and-bound search.

The implementation keeps a priority queue of sets of hypotheses that could be extended into explanations ("partial explanations"). At any time the set of all the explanations is the set of already generated explanations, plus those explanations that can be generated from the partial explanations in the priority queue.

Definition 3.1 A partial explanation is a structure

⟨g ← C, D⟩

where g is an atom (or conjunction of atoms), C is a conjunction of atoms and D is a set of hypotheses.

Figure 1 gives an algorithm for finding explanations of q in order of probability (most likely first). We have the following data structures:

Q is a set of partial explanations (implemented as a priority queue).

Π is the set of explanations of g that have been generated (initially empty).

NG is the set of pairs ⟨h_i, h_j⟩ such that h_i and h_j are different hypotheses in a disjoint declaration (N.B. h_i and h_j share the same variables).

At each step we choose an element ⟨g ← C, D⟩ of the priority queue Q. Which element is chosen depends on the search strategy (e.g., we could choose the element with maximum prior probability of D).
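To make the bookkeeping concrete, here is a much-simplified propositional sketch of such a best-first loop in Python (using heapq; all names are ours, and it omits unification and the other details of the procedure in Figure 1 and Appendix A). It enumerates explanations of the Section 2.4 example most likely first:

```python
import heapq
from itertools import count
from math import prod

# The example theory of Section 2.4, in propositional form.
rules = {"a": [["b", "h"], ["q", "e"]],
         "q": [["h"], ["b", "e"]],
         "h": [["b", "f"], ["c", "e"], ["g", "b"]]}
priors = {"b": 0.3, "c": 0.7, "e": 0.6, "f": 0.3, "g": 0.1}
# NG: pairs of distinct hypotheses from the same disjoint declaration.
nogoods = [{"b", "c"}, {"e", "f"}, {"e", "g"}, {"f", "g"}]

def explanations(goal):
    """Yield (D, prior(D)) for explanations of goal, most probable first.
    A partial explanation <goal <- C, D> is kept as the pair (C, D),
    prioritised by the prior probability of D."""
    tie = count()                        # tie-breaker so heap tuples compare
    queue = [(-1.0, next(tie), [goal], frozenset())]
    while queue:
        negp, _, C, D = heapq.heappop(queue)
        if not C:                        # body empty: D explains goal
            yield D, -negp
        elif C[0] in priors:             # a hypothesis: assume it
            D2 = D | {C[0]}
            if not any(ng <= D2 for ng in nogoods):  # keep D2 consistent
                p = prod(priors[h] for h in D2)
                heapq.heappush(queue, (-p, next(tie), C[1:], D2))
        else:                            # resolve against each rule for C[0]
            for body in rules.get(C[0], []):
                heapq.heappush(queue, (negp, next(tie), body + C[1:], D))

best = list(explanations("a"))
assert len(best) == 4
```

On this example, `best` holds the four minimal explanations of a in order {c, e}, {b, e}, {b, f}, {b, g}, with priors 0.42, 0.18, 0.09 and 0.03, so its partial sums give the anytime estimates described above.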
Where there are a number of equally likely partial explanations, we can choose any one. When we follow this definition of "best", we enumerate the minimal explanations of q in order of probability. Other search strategies can be used, for example,