Mathematical Programming Study 18 (1982) 1-11
North-Holland Publishing Company
THE DUALITY BETWEEN ESTIMATION AND CONTROL
FROM A VARIATIONAL VIEWPOINT:
THE DISCRETE TIME CASE*
Michele PAVON
LADSEB-CNR, 35100 Padova, Italy
Roger J.-B. WETS
University of Kentucky, Lexington, KY 40506, U.S.A.
Received 18 July 1980
Revised manuscript received 6 April 1981
The duality between estimation and control is shown to follow from basic duality principles. To do so we formulate the estimation problem in terms of a variational problem and rely on the duality for the convex optimization problem to obtain the associated control problem. The properties of the solution of this problem are exploited to obtain the recursive relations that yield the optimal estimators of a dynamical system.

Key words: Estimation, Filtering, Duality, Variational Principle.
1. Introduction
The duality between estimation and control, first exhibited by Kalman [4], is
shown to follow from a basic variational principle. Earlier derivations rely on
formal arguments, cf. for example [1; 3, Chapter V, Section 9]. We first show
that the estimation problem can be embedded in a class of stochastic variational
problems of the Bolza type, studied by Rockafellar and Wets [7-9]. The dual of
this problem is a stochastic optimization problem, which under the standard
modeling assumption is equivalent to a deterministic control problem whose
structure turns out to be that of the linear-quadratic regulator problem. In this
context the duality between estimation and control takes on a precise meaning
which until now was believed to be of a purely formal nature. In particular, we
gain new insight into the two system-theoretic concepts of controllability and
observability. They appear as the same property of two dual problems. Part of
these results was sketched in [5], relying on a variant of the arguments used
here.
This derivation clearly exhibits those features of the problem that can easily
be modified without impairing the main results. Also, since it relies on basic
* Supported in part by grants of the Consiglio Nazionale delle Ricerche (CNR-79.00700.01) and the National Science Foundation (ENG-7903731).
principles, one may hope that the insight so gained will be useful to study
nonlinear filtering problems.
2. The one-step estimation problem
Let (w_t, t = 0, 1, …, T) be a gaussian process defined on the probability space (Ω, 𝒜, P) and with values in R^p. The dynamics of the state variables are given by the finite difference equations: for t = 1, …, T
$$x_{t1}(\omega) = A_t x_t(\omega) + B_t w_t(\omega) \quad \text{a.s.},$$
with initial conditions
$$x_1(\omega) = B_0 w_0(\omega) \quad \text{a.s.},$$
where we write t1 for t + 1; in particular T1 = T + 1. The n-vector x_t represents the state of the system. The matrices A_t, B_t are n × n and n × p. With
$$\Delta x_t = x_{t1} - x_t,$$
we can also express the dynamics by the relations
$$\Delta x_t(\omega) = (A_t - I) x_t(\omega) + B_t w_t(\omega)$$
with the same initial conditions. The vector-valued process (x_t, t = 1, …, T1) is also gaussian, since for t = 0, …, T we have that
$$x_{t1}(\omega) = \sum_{s=0}^{t} \Big( \prod_{\tau=s1}^{t} A_\tau \Big) B_s w_s(\omega)$$
with the convention \prod_{\tau=t1}^{t} A_\tau = I.
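The closed-form solution of the state recursion can be checked numerically. The sketch below uses random, hypothetical system matrices and noise (the names A, B, w and the dimensions are illustrative only); it runs the forward recursion and compares it with the product formula, reading the empty product as the identity.

```python
import numpy as np

# Hypothetical data: random A_t, B_t and noise samples w_t.
rng = np.random.default_rng(0)
n, p, T = 3, 2, 5
A = [rng.standard_normal((n, n)) for _ in range(T + 1)]   # A[1..T] are used
B = [rng.standard_normal((n, p)) for _ in range(T + 1)]
w = [rng.standard_normal(p) for _ in range(T + 1)]

# Forward recursion: x[t] holds x_t for t = 1, ..., T+1.
x = {1: B[0] @ w[0]}
for t in range(1, T + 1):
    x[t + 1] = A[t] @ x[t] + B[t] @ w[t]

def closed_form(t):
    """x_{t+1} via the product formula: sum_s (A_t ... A_{s+1}) B_s w_s."""
    total = np.zeros(n)
    for s in range(t + 1):
        M = np.eye(n)
        for tau in range(t, s, -1):        # A_t A_{t-1} ... A_{s+1}
            M = M @ A[tau]
        total = total + M @ (B[s] @ w[s])
    return total
```

For every t the two computations agree, which is the content of the displayed formula.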
Rather than observing the actual state x_t, we have only access to y_t ∈ R^m, a linear function of the state disturbed by an additive noise, specifically
$$y_t(\omega) = C_t x_t(\omega) + D_t w_t(\omega) \quad \text{a.s.}$$
The matrices C_t and D_t are m × n and m × p, respectively. The information process (y_t, t = 1, …, T) is also a gaussian process, since for t = 0, …, T we have that a.s.
$$y_{t1}(\omega) = C_{t1} \sum_{s=0}^{t} \Big( \prod_{\tau=s1}^{t} A_\tau \Big) B_s w_s(\omega) + D_{t1} w_{t1}(\omega).^1$$
^1 Note that there is no loss of generality in the use of only one gaussian process to represent the measurement noise and the dynamics disturbances; in fact this model is more general in that it allows for arbitrary cross-correlation between the two noise processes. If (w'_t) are the dynamics' disturbances and (w''_t) the measurements' noise, and they are independent, simply set
$$w_t = \begin{pmatrix} w'_t \\ w''_t \end{pmatrix}, \qquad B_t = [I, 0], \qquad D_t = [0, I]$$
and we are in the framework of the proposed model.
Let 𝔅_t = σ(y_s, s ≤ t) be the σ-fields induced by the information process; we simply write 𝔅 for 𝔅_T. A function which is 𝔅_t-measurable depends only on the information that can be collected up to time t. If the function is 𝔅-measurable it means that no more than the total information collected can be used in determining its value. We always have that 𝔅_t ⊂ 𝔅 ⊂ 𝒜.
The one-step estimation (or prediction) problem consists in finding a 'best' estimator γ of the final state x_{T1}, on the basis of the total information collected; in other words we seek a 𝔅-measurable function γ from Ω into R^n that minimizes
$$J(\gamma) = E\{\tfrac{1}{2} \| x_{T1}(\omega) - \gamma(\omega) \|^2\}$$
where ‖·‖ is the euclidean norm. An excellent, and more detailed, description of estimation in dynamical systems can be found in [2, Chapter 4]. Since x_{T1} ∈ ℒ²(Ω, 𝒜, P; R^n) = ℒ²_n(𝒜), it is natural to restrict γ to the same class of functions; besides, the functional J might fail to be well defined otherwise. On the other hand, γ must be 𝔅-measurable, thus we must further restrict γ to ℒ²_n(𝔅), a closed linear subspace of ℒ²_n(𝒜). The one-step estimation problem can then be formulated as follows:

EP   Find γ ∈ ℒ²_n(𝔅) ⊂ ℒ²_n(𝒜) such that J(γ) is minimized.
This is an optimal recourse problem; the recourse function γ must satisfy the nonanticipativity constraint: 𝔅- rather than 𝒜-measurability. The objective function is strictly convex; thus EP admits a unique solution γ* that must satisfy the following conditions [7]: for almost all ω
$$\gamma^*(\omega) = \operatorname{argmin}_\gamma \big[\tfrac{1}{2}\| x_{T1}(\omega) - \gamma \|^2 - p(\omega)' \cdot \gamma\big],$$
where p ∈ ℒ²_n(𝒜) with E^𝔅 p = 0 a.s., or equivalently
$$\gamma^*(\omega) = E\{x_{T1} \mid \mathfrak{B}\}(\omega) \quad \text{a.s.},$$
where the components of p are the multipliers associated with the nonanticipativity constraints. The optimal estimator γ* is the orthogonal projection of x_{T1} on ℒ²_n(𝔅) and thus belongs to the linear hull generated by the observations, i.e.,
$$\gamma^*(\omega) = - \sum_{t=1}^{T} U_t y_t(\omega)$$
where for t = 1, …, T, the U_t are n × m matrices. The minus sign is introduced for esthetic purposes that will come to light in the ensuing development. We can view these matrices as (estimator) weights; they are the weights required to construct the optimal estimator. Note that we can thus restrict the search for the optimal estimator to the class of linear estimators, i.e., those that are linear combinations of the observations.
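Since every state and every observation is linear in the stacked noise vector, the orthogonal projection onto the span of the observations can be computed from second moments alone. A sketch with random, hypothetical system matrices (the names G, H, M below are illustrative, not from the paper): writing x_{T+1} = G w and the stacked observations y = H w with w ~ N(0, I), the projection weights solve the normal equations, and the residual is uncorrelated with every observation.

```python
import numpy as np

# Hypothetical data: random system matrices of small dimension.
rng = np.random.default_rng(1)
n, m, p, T = 2, 2, 3, 4
A = [rng.standard_normal((n, n)) for _ in range(T + 1)]
B = [rng.standard_normal((n, p)) for _ in range(T + 1)]
C = [rng.standard_normal((m, n)) for _ in range(T + 1)]
D = [rng.standard_normal((m, p)) for _ in range(T + 1)]

N = (T + 1) * p                       # dimension of the stacked noise w
Phi = {1: np.hstack([B[0], np.zeros((n, N - p))])}   # Phi[t]: w -> x_t
for t in range(1, T + 1):
    Bt = np.zeros((n, N)); Bt[:, t * p:(t + 1) * p] = B[t]
    Phi[t + 1] = A[t] @ Phi[t] + Bt
rows = []                             # y_t = C_t x_t + D_t w_t, t = 1..T
for t in range(1, T + 1):
    Dt = np.zeros((m, N)); Dt[:, t * p:(t + 1) * p] = D[t]
    rows.append(C[t] @ Phi[t] + Dt)
H, G = np.vstack(rows), Phi[T + 1]

M = G @ H.T @ np.linalg.inv(H @ H.T)  # normal equations for the projection
residual_cov = (G - M @ H) @ H.T      # E[(x_{T+1} - M y) y'] should vanish
```

The vanishing of `residual_cov` is exactly the orthogonality that characterizes the projection, and M plays the role of the (negated) weight sequence.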
3. A variational formulation
In view of the above, the original problem is equivalent to finding the weights
U*, t = 1 ..... T that in turn, yield the optimal estimator. Each observation tY
contributing incrementally to the construction of this estimator. Define
= A3,t(oJ) - U,(C,x,(oJ) + D,w, Oo)), t = 1 ..... T,
with
7t - = A 1 7t )'t-
We can view these equations as the dynamics of the estimation process. Through
U, the available information is processed to yield :1T'3 an estimator of xT1. The
original problem EP has the equivalent formulation:
WP   Find U = (U_t, 1 ≤ t ≤ T) that minimizes F(U)

where
$$F(U) = \operatorname{Inf}\big[\, E\,\Phi_{l,L}(\omega, x(\omega), \gamma(\omega); U) \;\big|\; (x, \gamma) \in \mathcal{L}^2_N(\mathcal{A}) \times \mathcal{L}^2_N(\mathcal{A}) \,\big],$$
$$\Phi_{l,L}(\omega, x, \gamma; U) = l(\omega, x_1, \gamma_1, x_{T1}, \gamma_{T1}) + \sum_{t=1}^{T} L_t(\omega, x_t, \gamma_t, \Delta x_t, \Delta\gamma_t; U_t),$$
$$l(\omega, x_1, \gamma_1, x_{T1}, \gamma_{T1}) = \begin{cases} \tfrac{1}{2}\|x_{T1} - \gamma_{T1}\|^2 & \text{if } x_1 = B_0 w_0(\omega),\ \gamma_1 = 0,\\ +\infty & \text{otherwise,}\end{cases}$$
$$L_t(\omega, x_t, \gamma_t, \Delta x_t, \Delta\gamma_t; U_t) = \begin{cases} 0 & \text{if } \Delta x_t = (A_t - I)x_t + B_t w_t(\omega) \text{ and } \Delta\gamma_t = -U_t(C_t x_t + D_t w_t(\omega)),\\ +\infty & \text{otherwise,}\end{cases}$$
and N = n · T1.
For each choice of weights U, the value of the function F(U) is obtained by solving a variational problem of the Bolza type (discrete-time). Since there are no nonanticipative restrictions on the choice of the decision variables, the functional (ω, (x, γ)) ↦ Φ_{l,L}(ω, x, γ; U) is a convex normal integrand and the space ℒ²(𝒜) is decomposable, we have that
$$F(U) = E f(\omega; U)$$
with
$$f(\omega; U) = \operatorname{Inf}\big[\Phi_{l,L}(\omega, x, \gamma; U) \;\big|\; (x, \gamma) \in R^N \times R^N\big],$$
cf. [6, 7]. Given U, for each fixed ω, the value of f(ω; U) is obtained by solving a deterministic discrete-time problem of the Bolza type. The dual of this variational problem yields a 'dual' representation of f. It is in that form that we are able to exploit the specific properties of this problem.
4. The dual representation of f
Given U, and for fixed to, we consider the (discrete time) Bolza problem:
VP Inf[q~,L(~O,X, ;'~ U) x l ~ RN, y ~ R ]N
and associate to VP the dual problem
VD Inf[~,M(to, q,a;U)[qERN, aER ]N
with
m(to, qo, ao, ,Tq aT) = /*(~, qo, ao,--qT,--at),
Mr(to, qt, at, Aqt, Aat ; U) = L *( to, zlqt, zlat, qt, at; U),
T
M.m~c = m ~ + Mt
t=l
where l* and L* denote the conjugates of l and Lr and
Aqt = qt-1 and Aat = a t -- at-1.
This dual problem is derived as follows: first embed VP in a class of Bolza problems, obtained from VP by submitting the state variables (x, γ) to (global) variations, viz.
$$\text{VP}_{r,\eta} \quad \operatorname{Inf}_{x,\gamma} \Big[ l(\omega, x_1 + r_0, \gamma_1 + \eta_0, x_{T1}, \gamma_{T1}) + \sum_{t=1}^{T} L_t(\omega, x_t, \gamma_t, \Delta x_t + r_t, \Delta \gamma_t + \eta_t; U_t) \Big].$$
Let φ(ω, r, η; U) be the infimum value; it is convex in (r, η). The costate variables (q, α) are paired with the variations through the bilinear form
$$\langle (q, \alpha), (r, \eta) \rangle = \sum_{t=0}^{T} (q_t' \cdot r_t + \alpha_t' \cdot \eta_t).$$
The problem VD is then obtained as the conjugate of φ:
$$\Phi_{m,M}(\omega, q, \alpha; U) = \operatorname{Sup}\big[\langle (q, \alpha), (r, \eta) \rangle - \varphi(\omega, r, \eta; U)\big]$$
$$= \operatorname{Sup}_{(r,\eta,x,\gamma)} \Big[ q_0' \cdot r_0 + \alpha_0' \cdot \eta_0 - l(\omega, x_1 + r_0, \gamma_1 + \eta_0, x_{T1}, \gamma_{T1}) + \sum_{t=1}^{T} \big( q_t' \cdot r_t + \alpha_t' \cdot \eta_t - L_t(\omega, x_t, \gamma_t, \Delta x_t + r_t, \Delta \gamma_t + \eta_t; U_t) \big)$$
$$+ \sum_{t=1}^{T} (q_t' \cdot \Delta x_t + \alpha_t' \cdot \Delta \gamma_t) + \sum_{t=1}^{T} (\Delta q_t' \cdot x_t + \Delta \alpha_t' \cdot \gamma_t) + q_0' \cdot x_1 + \alpha_0' \cdot \gamma_1 - q_T' \cdot x_{T1} - \alpha_T' \cdot \gamma_{T1} \Big].$$
Regrouping terms yields immediately the desired expression.
Calculating m and M_t, we get that
$$m(\omega, q_0, \alpha_0, q_T, \alpha_T) = \begin{cases} q_0' B_0 w_0(\omega) + \tfrac{1}{2}\|\alpha_T\|^2 & \text{if } q_T = -\alpha_T,\\ +\infty & \text{otherwise}\end{cases}$$
and for t = 1, …, T,
$$M_t(\omega, q_t, \alpha_t, \Delta q_t, \Delta \alpha_t; U) = \begin{cases} (q_t' B_t - \alpha_t' U_t D_t) w_t(\omega) & \text{if } -\Delta q_t' = q_t'(A_t - I) - \alpha_t' U_t C_t \text{ and } -\Delta \alpha_t = 0,\\ +\infty & \text{otherwise.}\end{cases}$$
Thus any feasible solution (q, α) to VD must satisfy for all t
$$\Delta \alpha_t = 0 \quad \text{and hence} \quad \alpha_t = \alpha_T.$$
Since also −α_T = q_T, by substitution in the equations
$$q_{t-1}' = q_t' A_t - \alpha_T' U_t C_t$$
we obtain by recursion that for l = 0, 1, …, T
$$q_{T-l}' = - \alpha_T' Q_{T-l}$$
where the matrices Q_t are defined by the relations
$$Q_{t-1} = Q_t A_t + U_t C_t \quad \text{and} \quad Q_T = I.$$
Now for t = 0, 1, …, T, set
$$Z_t = Q_t B_t + U_t D_t$$
with the understanding that U_0 D_0 = 0.
We get the following version of VD:

VD'   Find α_T ∈ R^n, (Q_t, t = 0, …, T), (Z_t, t = 0, …, T) such that
$$\tfrac{1}{2}\|\alpha_T\|^2 - \alpha_T' \sum_{t=0}^{T} Z_t w_t(\omega) \quad \text{is minimized, and}$$
$$Q_T = I,$$
$$Q_{t-1} = Q_t A_t + U_t C_t, \quad t = 1, \dots, T,$$
$$Z_t = Q_t B_t + U_t D_t, \quad t = 0, \dots, T.$$
For given U, the problem VD' is solved by setting
$$Q_{T-l} = \sum_{s=0}^{l} U_{T1-s} C_{T1-s} \prod_{\tau=T-s}^{T1-l} A_\tau$$
with the conventions U_{T1} C_{T1} = I, the indices in the product decreasing, and
$$\prod_{\tau=T-s}^{T1-l} A_\tau = I \quad \text{if } T1 - l + s > T, \text{ i.e., } s + 1 > l.$$
This in turn gives a similar expression for the Z_t; the problem is thus feasible.
The optimal solution is given by
$$\alpha_T = \sum_{t=0}^{T} Z_t w_t(\omega)$$
and the optimal value is
$$- \tfrac{1}{2} \Big\| \sum_{t=0}^{T} Z_t w_t(\omega) \Big\|^2.$$
Thus both VD' (or VD) and VP are solvable; VP has only one feasible, and thus optimal, solution. It remains to observe that at the optimum the values are equal. This follows from the lower semicontinuity of
$$(r, \eta) \mapsto \varphi(\omega, r, \eta; U)$$
(the objective of VP is coercive) and the relations
$$f(\omega; U) = \operatorname{Inf} \Phi_{l,L}(\omega, x, \gamma; U) = \varphi(\omega, 0, 0; U) = - \operatorname{Inf} \varphi^*(\omega, q, \alpha; U) = - \operatorname{Inf} \Phi_{m,M}(\omega, q, \alpha; U) = \tfrac{1}{2} \Big\| \sum_{t=0}^{T} Z_t w_t(\omega) \Big\|^2,$$
where the Z_t (and the Q_t) are defined by the dynamics of the (deterministic) problem VD'.
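The equality of the primal and dual values can be checked numerically: for an arbitrary choice of weights, VP has a unique feasible point (the trajectories generated by the dynamics), while the dual recursions produce the Q_t and Z_t. A sketch with random, hypothetical data (all matrices and dimensions illustrative):

```python
import numpy as np

# Hypothetical data: random system matrices, weights and noise.
rng = np.random.default_rng(2)
n, m, p, T = 3, 2, 2, 5
A = [rng.standard_normal((n, n)) for _ in range(T + 1)]
B = [rng.standard_normal((n, p)) for _ in range(T + 1)]
C = [rng.standard_normal((m, n)) for _ in range(T + 1)]
D = [rng.standard_normal((m, p)) for _ in range(T + 1)]
U = [rng.standard_normal((n, m)) for _ in range(T + 1)]
w = [rng.standard_normal(p) for _ in range(T + 1)]

# Primal side: x_{t+1} = A_t x_t + B_t w_t, gamma_{t+1} = gamma_t - U_t y_t.
x, g = B[0] @ w[0], np.zeros(n)                  # x_1, gamma_1 = 0
for t in range(1, T + 1):
    y_t = C[t] @ x + D[t] @ w[t]
    x, g = A[t] @ x + B[t] @ w[t], g - U[t] @ y_t
primal = 0.5 * np.sum((x - g) ** 2)              # (1/2)||x_{T+1} - gamma_{T+1}||^2

# Dual side: Q_T = I, Q_{t-1} = Q_t A_t + U_t C_t, Z_t = Q_t B_t + U_t D_t.
Q = {T: np.eye(n)}
for t in range(T, 0, -1):
    Q[t - 1] = Q[t] @ A[t] + U[t] @ C[t]
Z = {0: Q[0] @ B[0]}                             # U_0 D_0 = 0
for t in range(1, T + 1):
    Z[t] = Q[t] @ B[t] + U[t] @ D[t]
v = sum(Z[t] @ w[t] for t in range(T + 1))
dual = 0.5 * np.sum(v ** 2)                      # (1/2)||sum_t Z_t w_t||^2
```

The two values coincide for every noise realization, which is the pathwise form of the asserted equality.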
5. The linear-quadratic regulator problem associated to WP
The results of the previous section yield a new representation for F(U) and consequently of the optimization problem WP, viz.

LQR   Find U that minimizes F(U) = \tfrac{1}{2} E \| \sum_{t=0}^{T} Z_t w_t(\omega) \|^2, where
$$Z_t = Q_t B_t + U_t D_t, \quad t = 0, \dots, T,$$
$$Q_{t-1} = Q_t A_t + U_t C_t, \quad t = 1, \dots, T,$$
$$Q_T = I.$$

This is a deterministic problem, in fact a matrix-version of the linear-quadratic regulator problem [3, Chapter II, Section 2]. The objective is quadratic; the coefficients of this quadratic form are determined by the covariance matrices of the random vectors (w_s, w_t). If the (w_t, t = 0, …, T) are uncorrelated normalized centered gaussian random variables, then this problem takes on the form:
LQR'   Find U that minimizes \tfrac{1}{2} \operatorname{trace}\big[ Q_0 P_0 Q_0' + \sum_{t=1}^{T} Z_t Z_t' \big] subject to
$$Q_T = I,$$
$$Q_{t-1} = Q_t A_t + U_t C_t, \quad t = 1, \dots, T,$$
$$Z_t = Q_t B_t + U_t D_t, \quad t = 1, \dots, T,$$
where P_0 = B_0 B_0'. The optimal weights can thus be computed without recourse to
the stochastics of the system. This derivation shows that the basic results do not
depend on the fact that the w_t are uncorrelated gaussian random variables. In
fact it shows that if the w_t are arbitrary random variables, correlated or not (but
with finite second order moments), and the class of admissible estimators is
restricted to those that are linear in the observations, then precisely LQR can be
used to find the optimal weights; we then obtain the wide sense best estimator.
The linear-quadratic regulator problem LQR has an optimal solution in feedback form of the type:
$$U_t = - Q_t K_t$$
for t = 1, …, T. This follows directly from the usual optimality conditions (the discrete version of Pontryagin's Maximum Principle); for example, cf. [3, Chapter II, Section 7]. Thus the search for an optimal set of weights U can be replaced by the search for the optimal (Kalman) gains K = (K_t, t = 1, …, T), i.e.,

GP   Find K that minimizes G(K) = \tfrac{1}{2} E \| \sum_{t=0}^{T} Z_t w_t(\omega) \|^2, where
$$Z_t = Q_t (B_t - K_t D_t), \quad t = 0, \dots, T,$$
$$Q_{t-1} = Q_t (A_t - K_t C_t), \quad t = 1, \dots, T,$$
$$Q_T = I.$$
If K* solves GP and U* solves LQR, we have that
$$F(U^*) = G(K^*)$$
and in fact U* may always be chosen so that U*_t = − Q*_t K*_t. Thus if K* solves GP and Q* is the associated solution of the finite difference equations describing the dynamics of the problem, we have that the optimal estimator of x_{T1} is given by
$$\gamma^*(\omega) = \sum_{t=1}^{T} Q_t^* K_t^* y_t(\omega).$$
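The correspondence between weights and gains can be illustrated numerically: for an arbitrary gain sequence, substituting the feedback weights into the LQR recursions reproduces the GP recursions, so the two objectives agree. A sketch with random, hypothetical data:

```python
import numpy as np

# Hypothetical data: random system matrices and gains K_t.
rng = np.random.default_rng(3)
n, m, p, T = 3, 2, 2, 4
A = [rng.standard_normal((n, n)) for _ in range(T + 1)]
B = [rng.standard_normal((n, p)) for _ in range(T + 1)]
C = [rng.standard_normal((m, n)) for _ in range(T + 1)]
D = [rng.standard_normal((m, p)) for _ in range(T + 1)]
K = [rng.standard_normal((n, m)) for _ in range(T + 1)]

# GP recursions: Q_T = I, Q_{t-1} = Q_t (A_t - K_t C_t),
#                Z_t = Q_t (B_t - K_t D_t).
Qg, Zg = {T: np.eye(n)}, {}
for t in range(T, 0, -1):
    Zg[t] = Qg[t] @ (B[t] - K[t] @ D[t])
    Qg[t - 1] = Qg[t] @ (A[t] - K[t] @ C[t])

# LQR recursions driven by the induced feedback weights U_t = -Q_t K_t.
U = {t: -Qg[t] @ K[t] for t in range(1, T + 1)}
Ql, Zl = {T: np.eye(n)}, {}
for t in range(T, 0, -1):
    Zl[t] = Ql[t] @ B[t] + U[t] @ D[t]
    Ql[t - 1] = Ql[t] @ A[t] + U[t] @ C[t]

P0 = B[0] @ B[0].T
def objective(Q0, Z):
    # (1/2) trace[Q_0 P_0 Q_0' + sum_t Z_t Z_t'], the LQR'/GP' objective
    return 0.5 * (np.trace(Q0 @ P0 @ Q0.T)
                  + sum(np.trace(Z[t] @ Z[t].T) for t in range(1, T + 1)))
```

Both recursions yield identical Q_t and Z_t, hence identical objective values, which is the content of F(U*) = G(K*) once the optimality of the feedback form is granted.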
Problem LQR' simply becomes
GP'   Find K that minimizes G(K) = \tfrac{1}{2} \operatorname{trace}\big[ Q_0 P_0 Q_0' + \sum_{t=1}^{T} Z_t Z_t' \big], where
$$Z_t = Q_t (B_t - K_t D_t), \quad t = 1, \dots, T,$$
$$Q_{t-1} = Q_t (A_t - K_t C_t), \quad t = 1, \dots, T,$$
$$Q_T = I,$$
with the resulting simplifications in the derivation of the optimal estimator.
6. The Kalman filter
The characterization of the optimal gains (and hence weights) for the one-step estimation problem derived in the previous section allows us to obtain the optimal estimator at every time t, not just at some terminal time T. The optimal estimator γ*_{t1} at time t1 is derived recursively from γ*_t and the new information carried by the observation at time t. We obtain this expression for γ*_{t1} by relying once more on the duality for variational problems invoked in Section 4.

We have seen that there is no loss in generality in restricting the weights to the form U_t = − Q_t K_t for t = 1, …, T. The optimal solution of VD will thus have α'_T U_t = q'_t K_t. We can reformulate the original one-step optimization problem as follows:
DG   Find K that minimizes G(K) = E[g(ω; K)] where
$$g(\omega; K) = - \operatorname{Inf}\big[\Phi_{r,R}(\omega, q; K) \;\big|\; q \in R^N\big],$$
$$r(\omega, q_0, q_T) = q_0' B_0 w_0(\omega) + \tfrac{1}{2}\|q_T\|^2,$$
$$R_t(\omega, q_t, \Delta q_t; K) = \begin{cases} q_t'(B_t - K_t D_t) w_t(\omega) & \text{if } -\Delta q_t' = q_t'(A_t - I - K_t C_t),\\ +\infty & \text{otherwise.}\end{cases}$$
We rely here on the dual representation. From our preceding remarks we know that for almost all ω ∈ Ω
$$g(\omega; K^*) = f(\omega; U^*).$$
The value of g(ω; K) is defined through the infimum of a variational problem. By relying on the dual of this (deterministic) variational problem, we find a new representation for g(ω; K). The arguments are similar to those used in Section 4. We get
$$g(\omega; K) = \operatorname{Inf}\big[\Phi_{s,S}(\omega, e; K) \;\big|\; e \in R^N\big]$$
where
$$s(\omega, e_1, e_{T1}) = r^*(\omega, e_1, -e_{T1}) = \begin{cases} \tfrac{1}{2}\|e_{T1}\|^2 & \text{if } e_1 = B_0 w_0(\omega),\\ +\infty & \text{otherwise}\end{cases}$$
and
$$S_t(\omega, e_t, \Delta e_t; K) = R_t^*(\omega, \Delta e_t, e_t; K) = \begin{cases} 0 & \text{if } \Delta e_t = (A_t - I - K_t C_t) e_t + (B_t - K_t D_t) w_t(\omega),\\ +\infty & \text{otherwise}\end{cases}$$
with Δe_t = e_{t1} − e_t. Thus DG is equivalent to
PG   Find K that minimizes G(K) = E[g(ω; K)] where
$$g(\omega; K) = \operatorname{Inf}\Big[ \tfrac{1}{2}\| e_{T1}(\omega) \|^2 \;\Big|\; e_{t1}(\omega) = (A_t - K_t C_t) e_t(\omega) + (B_t - K_t D_t) w_t(\omega),\ t = 1, \dots, T,\ e_1(\omega) = B_0 w_0(\omega) \Big].$$
For fixed K, the solution of the optimization problem defining g(ω; K) is unique and given by
$$e_{t1}(\omega) = \sum_{s=0}^{t} \Big( \prod_{\tau=s1}^{t} \Gamma_\tau \Big) \Lambda_s w_s(\omega)$$
for t = 1, …, T, with the usual convention that \prod_{\tau=t1}^{t} \Gamma_\tau = I, where
$$\Gamma_t = A_t - K_t C_t \quad \text{and} \quad \Lambda_t = B_t - K_t D_t,$$
with the understanding that Λ_0 = B_0.
The process (e_t, t = 1, …, T1) is the error process, i.e., e_t = x_t − γ_t; for each t it yields the error between the actual state
$$x_{t1}(\omega) = A_t x_t(\omega) + B_t w_t(\omega)$$
and the estimated state
$$\gamma_{t1}(\omega) = A_t \gamma_t(\omega) + K_t (C_t e_t(\omega) + D_t w_t(\omega)) = A_t \gamma_t(\omega) + K_t (y_t(\omega) - C_t \gamma_t(\omega)),$$
K_t representing the weight attributed to the gain in information at time t. Note that K_t only affects the equation defining e_{t1}, and thus the functional G(K) will be minimized if, given e_t, each K_t is chosen so as to minimize E[\tfrac{1}{2}\|e_{t1}(\omega)\|^2], i.e., so that the incremental error is minimized.
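The consistency of the three recursions (state, estimator, error) can be checked directly: propagating e_t by its own dynamics reproduces x_t − γ_t at every step. A sketch with random, hypothetical data and arbitrary (not optimal) gains:

```python
import numpy as np

# Hypothetical data: random system matrices, gains and noise.
rng = np.random.default_rng(4)
n, m, p, T = 3, 2, 2, 6
A = [rng.standard_normal((n, n)) for _ in range(T + 1)]
B = [rng.standard_normal((n, p)) for _ in range(T + 1)]
C = [rng.standard_normal((m, n)) for _ in range(T + 1)]
D = [rng.standard_normal((m, p)) for _ in range(T + 1)]
K = [rng.standard_normal((n, m)) for _ in range(T + 1)]
w = [rng.standard_normal(p) for _ in range(T + 1)]

x, g, e = B[0] @ w[0], np.zeros(n), B[0] @ w[0]   # x_1, gamma_1 = 0, e_1
gaps = []                                         # ||e_t - (x_t - gamma_t)||
for t in range(1, T + 1):
    y_t = C[t] @ x + D[t] @ w[t]                  # observation at time t
    x = A[t] @ x + B[t] @ w[t]                    # x_{t+1} = A_t x_t + B_t w_t
    g = A[t] @ g + K[t] @ (y_t - C[t] @ g)        # estimator update
    e = (A[t] - K[t] @ C[t]) @ e + (B[t] - K[t] @ D[t]) @ w[t]   # error update
    gaps.append(np.linalg.norm(e - (x - g)))
```

The recorded gaps are zero up to rounding, confirming that the error dynamics of PG are exactly the gap between the state and the estimator recursions.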
The sequence K* of optimal gains can now be found recursively in the following way: suppose that K*_1, …, K*_{t−1} have already been obtained and e*_1, …, e*_t are the corresponding values of the state variables. Let
$$\Sigma_t = E\{ e_t^*(\omega) \cdot e_t^*(\omega)' \}$$
be the covariance of the state variable e*_t. Then K*_t must be chosen so that
$$E\big[\tfrac{1}{2}\| e_{t1}(\omega) \|^2\big]$$
is minimized, or equivalently