Table Of ContentTheAnnalsofProbability
2008,Vol.36,No.6,2050–2091
DOI:10.1214/08-AOP392
(cid:13)c InstituteofMathematicalStatistics,2008
SPECTRAL GAPS IN WASSERSTEIN DISTANCES AND THE 2D
9
0 STOCHASTIC NAVIER–STOKES EQUATIONS
0
2
By Martin Hairer1 and Jonathan C. Mattingly2
n
a University of Warwick and Duke University
J
2 We develop a general method to prove the existence of spectral
2 gaps for Markov semigroups on Banach spaces. Unlike most previ-
ous work, the type of norm we consider for this analysis is neither
] a weighted supremum norm nor an L p-type norm, but involves the
R
derivativeoftheobservableaswellandhencecanbeseenasatypeof
P
1-Wasserstein distance. This turns out to be a suitable approach for
.
h infinite-dimensional spaces where theusual Harris or Doeblin condi-
t tions,whicharegeared towardtotalvariationconvergence,oftenfail
a
to hold. In the first part of this paper, we consider semigroups that
m
haveuniformbehaviorwhichonecanviewastheanalogofDoeblin’s
[
condition. We then proceed to study situations where the behavior
2 is not so uniform, but the system has a suitable Lyapunov struc-
v ture, leading to a typeof Harris condition. Wefinally show that the
9 latterconditionissatisfiedbythetwo-dimensionalstochasticNavier–
7 Stokes equations, even in situations where the forcing is extremely
4
degenerate.Usingtheconvergenceresult,weshowthatthestochastic
2
Navier–Stokesequations’invariantmeasuresdependcontinuouslyon
0
theviscosity and thestructureof theforcing.
6
0
/
h 1. Introduction. This work is motivated by the study of the two-dimen-
t
a sional stochastic Navier–Stokes equations on the torus. However, the results
m
andtechniques aremoregeneral.Themainabstractresultofthepapergives
: a criterion guaranteeing that a Markov semigroup on a Banach space has a
v
i spectral gap in a particular 1-Wasserstein distance. (In the sequel, we will
X
simply write Wasserstein for 1-Wasserstein.) To the best of our knowledge,
r
a
Received May 2006; revised May 2007.
1Supportedin part by EPSRCAdvanced Research Fellowship Grant EP/D071593/1.
2Supported in part by NSF PECASE Award DMS-04-49910 and an Alfred P. Sloan
foundation fellowship.
AMS 2000 subject classifications. 37A30, 37A25, 60H15.
Key words and phrases. Stochastic PDEs, Wasserstein distance, ergodicity, mixing,
spectral gap.
This is an electronic reprint of the original article published by the
Institute of Mathematical Statistics in The Annals of Probability,
2008,Vol. 36, No. 6, 2050–2091. This reprint differs from the original in
pagination and typographic detail.
1
2 M. HAIRERANDJ. MATTINGLY
these results are thefirstresults providinga spectral gap in this, or any sim-
ilar, setting. Inturn,the existence of a spectral gap implies that theMarkov
semigroup possesses a unique, exponentially mixing invariant measure.
Theresultsofthisarticlerelyontheexistenceofa“gradientestimate”in-
troduced in [21] in the study of the degenerately forced Navier–Stokes equa-
tions on the two-dimensional torus. This estimate was used there in order
to show that the corresponding Markov semigroup satisfies the “asymptotic
strong Feller” property, also introduced in [21]. In this work, we show that
gradient estimates of this form not only are usefulto show uniquenessof the
invariant measure, but are an essential ingredient to obtain the existence
of a spectral gap for a large class of systems. In this introductory section,
we concentrate on the two-dimensional stochastic Navier–Stokes equations
on a torus to show how the main results can be applied. At the end of this
section, we give an overview of the paper.
Recall that the Navier–Stokes equations describing the evolution of the
velocity field v(x,t) (with x T2) of a fluid under the influence of a body
force F¯(x)+F(x,t) are given∈by
(SNS) ∂ v=ν∆v (v )v p+F¯+F, divv=0,
t
− ·∇ −∇
where the pressure p(x,t) is determined by the algebraic condition divv=0.
We consider for F a Gaussian stochastic forcing, that is, centered, white in
time, colored in space and such that F¯(x)dx= F(x)dx=0. Since the
gradient part of the forcing is canceleRd by the prRessure term, we assume
without loss of generality that divF¯=divF =0. More precisely, we assume
that for i,j 1,2
∈{ }
EF (x,t)F (x′,t′)=δ(t t′)Q (x x′),
i j ij
− −
2
∂2Q =0, Q (x)dx=0.
ij ij ij
Z
iX,j=1
AlthoughweareconfidentthatourresultsarevalidforQsufficientlysmooth,
we restrict ourselves to the case where Q is a trigonometric polynomial, so
that we can make use of the bounds obtained in [21, 39].
Instead of considering the velocity (SNS) directly, we will consider the
equation for the vorticity w= v=∂ v ∂ v . Note that v is uniquely
1 2 2 1
∇∧ −
determined from w (we will write v= w) through the conditions
K
w= v, divv=0, v(x)dx=0.
∇∧ Z
When written in terms of w, (SNS) is equivalent to
(1) ∂ w=ν∆w ( w) w+f¯+f,
t
− K ·∇
SPECTRAL GAPS IN WASSERSTEINDISTANCES 3
where we have defined f = F and f¯= F¯. Note that since f is
∇∧ ∇∧
translation invariant, one can write it as
f(x,t)=Re q eikxξ (t),
k k
k∈ZX2\{0}
where the ξ are independent white noises and where q = q . We can
k k −k
therefore identify the correlation function Q with a vector q in ℓ2, the set of
+
square integrable sequences with positive entries. Denoting by the set of
Z
indices k for which q =0,wewillmake throughoutthisarticle thefollowing
k
6
assumptions:
Assumption 1. Only finitely many of the q ’s are nonzero and f¯ lies
k
in the span of eikx q =0 . Furthermore, generates Z2 and there exist
k
{ | 6 } Z
k,ℓ with k = ℓ .
∈Z | |6 | |
Remark1.1. Theassumptionthatonlyafinitenumberofq arenonzero
k
isonlyatechnicalassumptionreflectingadeficiencyin[39].Alloftheresults
of this article certainly hold if the first part of Assumption 1 is replaced by
an appropriate decay property for the q . Note, for example, that in [21],
k
Section 4.5, it is shown that there exists an N such that if the range of Q
∗
contains eikx k <N , f¯is as in Assumption 1, and q2< , then all of
{ || | ∗} k ∞
the results of this paper hold. In particular, this allowPs infinitely many q
k
to be nonzero.
Remark1.2. Usingtheresultsin[3]onecanremovetherestrictionthat
theforcingneedconsistofFouriermodesandreplaceitwiththerequirement
that the forced functions span the Fourier modes required above. Since this
is not the main point of this article, we do not elaborate further here.
Remark1.3. Itisclearthattheassumptionthatf¯ span eikx q =0
k
is far from optimal. The correct result likely places no re∈stricti{on on|f¯o6the}r
thanitbesufficientlysmooth.Thismoredelicateresultrequiresanimproved
understanding of the control problem obtained by replacing the noise by
controls. Some steps in this direction have been made [1, 2, 47], but the
current results are not sufficient for our needs. Nonetheless, the present
assumption on f¯seems reasonable from a modeling perspective where one
wouldlikely havesomenoiseinallof thedirections onwhichthebodyforces
act.
We will consider (SNS) as an evolution equation in the subspace of
H
H1 that consists of velocity fields v with divv =0 in the sense of distri-
butions. Note that this is equivalent to w L 2. We make a slight abuse of
∈
4 M. HAIRERANDJ. MATTINGLY
notation and denote by the transition probabilities for (SNS), as well as
t
P
the corresponding Markov semigroup on , that is,
H
(v,A)=P(v(t, ) A v(0, )=v),
t
P · ∈ | ·
for every Borel set A , and
⊂H
( φ)(v)= φ(v′) (v,dv′), ( ∗µ)(A)= (v,A)µ(dv)
Pt Z Pt Pt Z Pt
H H
for every φ: R and probability measure ν on . Analogously we define
H→ H
the projection µφ= φ(x)µ(dx). It was shown in [21] that Assumption 1
implies that (SNS) aRdmits a unique invariant measure µ , that is, µ is a
⋆ ⋆
probability measure on such that ∗µ =µ for every t 0.
H Pt ⋆ ⋆ ≥
This article is concerned with whether, for an arbitrary probability mea-
sureµ, ∗µ µ (as t )andinwhatsensethisconvergencetakes place.
Pt → ⋆ →∞
Note that (SNS) is not expected to have the strong Feller property, so that
it is a fortiori not expected that ∗µ µ in the total variation topology
Pt → ⋆
if µ and µ are mutually singular. (See [9] for a general discussion of the
⋆
strong Feller property in infinite dimensions and [21] for a discussion of its
limitations in the present setting.)
In order to state the main result of the present article, we introduce the
following norm on the space of smooth observables φ: R:
H→
φ =supe−ηkvk2(φ(v) + Dφ(v) ).
η
k k | | k k
v∈H
Here, we denoted by Dφ the Fr´echet derivative of φ. With this notation, we
will show that the operator has a spectral gap in the norm in the
t η
P k·k
following sense:
Theorem 1.4. Consider (SNS) in the range of parameters allowed by
Assumption 1. For every η small enough there exist constants C and γ such
that
φ µ φ Ce−γt φ ,
t ⋆ η η
kP − k ≤ k k
for every Fr´echet differentiable function φ: R and every t 0.
H→ ≥
In[19]asimilar operator-normestimate on was obtained in aweighted
t
P
total variation norm ( without the Dφ term) when the forcing was
η
k·k k k
spatially rough and nondegenerate. Our setting is quite different. The spa-
tially rough and nondegenerate forcing makes the analysis much closer to
the finite-dimensional setting. It is not expected that such estimates hold
in the total variation norm in the setting of this article. We should also
remark that previous estimates, such as [20, 29, 37, 38], giving simply expo-
nential convergence to equilibrium are weaker and the results in this article
represent a significant and new extension of those results.
SPECTRAL GAPS IN WASSERSTEINDISTANCES 5
It is sometimes of interest to know that the structure functions of the so-
lution to (SNS) converge to the structure functions determined by µ . This
⋆
is not an immediate consequence of Theorem 1.4 because point evaluations
of thevelocity field arenotcontinuous functions on .Thesmoothingprop-
H
erties of (SNS) nevertheless allow us to show the following result, which is
an immediate consequence of Theorem 1.4 and Proposition 5.12 below.
Theorem 1.5. Consider (SNS) in the range of parameters allowed by
Assumption 1. Let n 1 and define the n-point structure functions by
≥
n(x ,...,x )= v(x ) v(x )µ (dv).
1 n 1 n ⋆
S Z ···
Then, for every η>0, there exist constants C and γ>0 such that, for every
v , one has the bound
0
∈H
n
sup E v(x ,t) n(x ,...,x ) Ceηkv0k2−γt,
(cid:12) i 1 n (cid:12)
x1,...,xn(cid:12)(cid:12) iY=1 −S (cid:12)(cid:12)≤
(cid:12) (cid:12)
for every t>1. He(cid:12)re, v(x,t) is the solution of ((cid:12)SNS) with initial condition
v .
0
The remainder of this article is structured as follows. In Section 2, we
begin with an abstract discussion of our ideas in a setting with uniform
estimates. In Section 3, we give the main theoretical results of the paper
whichcombinetheideasfromthefirstsectionwithestimatesstemmingfrom
an assumed Lyapunov structure. The convergence is measured in a distance
in which paths are weighted by the Lyapunov function. We then turn in
Section 5 to the specifics of the stochastic Navier–Stokes equation and show
that it satisfies the general theorems from Section 3. In Section 5.3, we show
that the Markov semigroup generated by (SNS) is strongly continuous on a
suitable Banach space and that its generator has a spectral gap there. Then
in Section 5.5, we demonstrate the power of the spectral gap estimates by
giving a short proof that (SNS)’s invariant measures depend continuously
on all the parameters of the equation.
2. A simplified, uniform setting. In this section, we illustrate many of
the main ideas used throughout this article in a simplified setting. We con-
sider the analogue of one of the simplest (and yet powerful) conditions for
a Markov chain with transition probabilities to have a unique invariant
P
measure, namely Doeblin’s condition3:
3Doeblin’s original condition was the existence of a probability measure ν and a con-
stant ε>0 such that P(x,A)≥ε whenever ν(A)>1−ε; see [11, 12]. It turns out that,
provided that the Markov chain is aperiodic and ψ-irreducible, this is equivalent to the
(in general stronger) condition given here; see [40], Theorem 16.0.2.
6 M. HAIRERANDJ. MATTINGLY
Theorem2.1 (Doeblin). Assume that there exist δ>0 anda probability
measure ν such that (x, ) δν for every x. Then, there exists a unique
P · ≥
probability measure µ such that ∗µ =µ . Furthermore, one has φ
⋆ ⋆ ⋆
P kP −
µ φ (1 δ) φ µ φ for every bounded measurable function φ.
⋆ ∞ ⋆ ∞
k ≤ − k − k
Atypical exampleofasemigroupforwhich Theorem2.1can beappliedis
given byanondegeneratediffusiononasmoothcompact manifold.Theorem
2.1showsthefundamentalmechanismforconvergencetoequilibriumintotal
variation norm. It is simple because the assumed estimates are extremely
uniform. In this section we give a theorem guaranteeing convergence in a
Wasserstein distance with assumptions analogous to Doeblin’s condition.
A classical generalization of Doeblin’s condition was made by Harris [22]
who showed how to combine the existence of a Lyapunov function and a
Doeblin-like estimate localized to a sufficiently large compact set to prove
convergencetoequilibrium.Wewillgivea“Harris-like”versionofourresults
in Section 3.
2.1. Spectral gap under uniform estimates. The aim of this section is to
present avery simplecondition that ensuresthat a Markov semigroup on
t
P
a Banach space yields a contraction operator on the space of probability
H
measures endowed with a Wasserstein distance. One can view it as a ver-
sion of Doeblin’s condition for the Wasserstein distance instead of the total
variation distance. The main motivation for using a distance that metrizes
the topology of weak convergence is that probability measures on infinite-
dimensional spaces tend to bemutually singular, so that strong convergence
is not expected to hold in general, even for ergodic systems.
The first assumption captures the regularizing effect of the Markov semi-
group. While it does not imply that one function space is mapped into a
more regular one as often occurs, it does say that at least gradients are
decreased.
Assumption2. Thereexistconstants α (0,1),C >0andT >0such
1 1
∈
that
(2) D φ C φ +α Dφ ,
t ∞ ∞ 1 ∞
k P k ≤ k k k k
for every t T and every Fr´echet differentiable function φ: R.
1
≥ H→
Remark2.2. Atypicalwayofchecking(2)istofirstshowthatforevery
t 0, is boundedas an operator on the space with norm φ + Dφ .
t ∞ ∞
≥ P k k k k
It then suffices to check that (2) holds with α <1 for one particular time t
1
to deduce from the semigroup property that
D φ C( φ +e−γt Dφ )
t ∞ ∞ ∞
k P k ≤ k k k k
is valid with some γ>0 for every t 0.
≥
SPECTRAL GAPS IN WASSERSTEINDISTANCES 7
Remark2.3. Ifthespace isactuallyacompactmanifold,(2)together
H
with the Arzel`a–Ascoli theorem implies that the essential spectral radius
of (as an operator on the space with norm φ + Dφ ) is strictly
t ∞ ∞
P k k k k
smaller than 1. This is a well-known and often exploited fact4 in the theory
of dynamical systems. A bound like (2) is usually referred to as the Lasota–
Yorke inequality [30, 31] or the Ionescu–Tulcea–Marinescu inequality [26]
andisusedtoshowtheexistenceofabsolutelycontinuousinvariantmeasures
when is the transfer operator acting on densities. Usually, it is used with
t
P
two different H¨older norms on the right-hand side. The present application
with a Lipschitz norm and an infinity norm has a different flavor.
This bound alone is of course not enough in general to guarantee the
uniqueness of the invariant measure. (Counterexamples with =S1, the
H
unitcircle,caneasilybeconstructed.)Furthermore,wearemainlyinterested
in the case where is not even locally compact.
H
Inordertoformulateoursecondassumption,weusethenotation (µ µ )
1 2
C
for the set of all measures Γ on such that Γ(A )=µ (A) and
1
H×H ×H
Γ( A) = µ (A) for every Borel set A . Such a measure Γ on the
2
H × ⊂ H
product space is referred to as a coupling of µ and µ . We also denote
1 2
by ∗ the semigroup acting on probability measures which is dual to .
Pt Pt
With these notation, our second assumption, which is a form of uniform
topological irreducibility, reads:
Assumption 3. For every δ >0, there exists a T =T (δ) so that for
2 2
any t T there exists an a>0 so that
2
≥
sup Γ (x′,y′) : x′ y′ δ a,
Γ∈C(Pt∗δx,Pt∗δy) { ∈H×H k − k≤ }≥
for every x,y .
∈H
Remark 2.4. Recall that the total variation distance between proba-
bility measures can be characterized as one minus the supremum over all
couplings of the mass of the diagonal. Therefore, if we set δ=0 in Assump-
tion 3, we get (x, ) (y, ) 1 a for every x and y. By [40],
t t TV
kP · −P · k ≤ −
Theorem 16.0.2, this is equivalent to the assumption of Theorem 2.1, so
that the results in this section can be viewed as an analogue of Doeblin’s
theorem.
4It can be obtained as a corollary of the fact that the essential spectral radius of an
operator T can be characterized as the supremum over all λ>0 such that there exists a
singular sequence {f } with kf k=1 and kTf k≥λ for every n. A slightly different
n n≥0 n n
proof can be found in [23] and is directly based on the study of the essential spectral
radiusbyNussbaum[42].Itis,however,verycloseinspirittothemuchearlierpaper[26].
8 M. HAIRERANDJ. MATTINGLY
To measure the convergence to equilibrium, we will use the following
distance function on :
H
(3) d(x,y)=min 1,δ−1 x y ,
{ k − k}
where δ is a small parameter to be adjusted later on. The distance (3)
extends in a natural way to a Wasserstein distance between probability
measures by
(4) d(µ ,µ )= sup φ(x)µ (dx) φ(x)µ (dx) ,
1 2 1 2
(cid:12)Z −Z (cid:12)
Lipd(φ)≤1(cid:12) (cid:12)
(cid:12) (cid:12)
(cid:12) (cid:12)
where Lip (φ) denotes the Lipschitz constant of φ in the metric d. By the
d
Monge–Kantorovich duality [44, 49], the right-hand side of (4) is equivalent
to
(5) d(µ ,µ )= inf d(x,y)µ(dx,dy).
1 2
µ∈C(µ1,µ2)ZZ
(Note that the infimum is actually achieved; see [50], Theorem 4.1.) With
these notation, one has the following convergence result:
Theorem 2.5. Let ( ) be a Markov semigroup over a Banach space
t t≥0
P
satisfying Assumptions 2 and 3. Then, there exist constants δ>0, α<1
H
and T >0 such that
(6) d( ∗µ , ∗µ ) αd(µ ,µ ),
PT 1 PT 2 ≤ 1 2
for every pair of probability measures µ , µ on . In particular, ( )
1 2 t t≥0
H P
has a unique invariant measure µ and its transition probabilities converge
⋆
exponentially fast to µ .
⋆
Proof. We will prove the general result by first proving it for delta
measures, namely,
(7) d( ∗δ , ∗δ ) αd(x,y)
Pt x Pt y ≤
for all (x,y) . Once this estimate is proven, we can finish the proof
∈H×H
of the general case by the following argument.
The Monge–Kantorovich duality yields Q (µ ,µ ) so that d(µ ,µ )=
1 2 1 2
∈C
d(x,y)Q(dx,dy). To complete the proof observe that
R
d( ∗µ , ∗µ ) d( ∗δ , ∗δ )Q(dx,dy)
Pt 1 Pt 2 ≤Z Pt x Pt y
α d(x,y)Q(dx,dy)=αd(µ ,µ ).
1 2
≤ Z
SPECTRAL GAPS IN WASSERSTEINDISTANCES 9
Let us first show that (7) holds when x y δ for some appropriately
k − k≤
chosen δ. Note that by (4) this is equivalent to showing that
(8) φ(x) φ(y) αd(x,y)d=efαδ−1 x y
t t
|P −P |≤ k − k
forallsmoothφwithLip (φ) 1.Notethatwecanassumeφ(0)=0,sothat
d ≤
this implies that Dφ δ−1 and φ 1. It follows from Assumption 2
∞ ∞
k k ≤ k k ≤
that D φ C +α δ−1 for every t T . Choosing δ =(1 α )/(2C)
t ∞ 1 1 1
k P k ≤ ≥ −
and substituting in for C, we get D φ δ−1(1+α )/2, so that (8)
t ∞ 1
k P k ≤
holds for t T and α (1+α )/2.
1 1
≥ ≥
Let us now turn to the case x y >δ. It follows from Assumption 3
k − k
that for every t>T (δ) there exists a positive constant a so that for any
2
(x,y) 2 there exists Γ ( ∗δ , ∗δ ) such that Γ(∆ )>a>0, where
∈H ∈C Pt x Pt y δ
∆ = (x′,y′): x′ y′ 1δ .
δ { k − k≤ 2 }
Since d(x′,y′) 1 on ∆ and d(x′,y′) 1 on the complement, one has
≤ 2 δ ≤
1 1 a
d(x′,y′)Γ(dx′,dy′) Γ(∆ )+1 Γ(∆ )=1 Γ(∆ ) 1 .
δ δ δ
Z ≤ 2 − − 2 ≤ − 2
Since we are in the setting d(x,y)=1, this implies that when x y >δ,
k − k
φ(x) φ(y) αd(x,y)
t t
|P −P |≤
holds for α 1 a and t T (δ).
≥ − 2 ≥ 2
Setting α=max 1 a,1(1+α ) and T =max T ,T (δ) completes the
{ − 2 2 1 } { 1 2 }
proof. (cid:3)
Corollary 2.6. Let be as in Theorem 2.5. Then, there exist con-
t
P
stants α<1 and T >0 such that
(9) φ µ φ α φ , φ =sup(φ(x) + Dφ(x) ),
T ⋆ 1,∞ 1,∞ 1,∞
kP − k ≤ k k k k | | k k
x∈H
for every Fr´echet differentiable function φ: R.
H→
Proof. Defined (x,y)=1 x y .Sincedisequivalenttod ,(6)still
1 1
∧k − k
holds for arbitrary α (but with a different value for T) with d replaced by
d .Theclaim thenfollows fromtheMonge–Kantorovich duality, notingthat
1
Lip (φ) 2 φ and, for functions φ with φ(x)µ (dx)=0, φ
d1 ≤ k k1,∞ ⋆ k k1,∞ ≤
Lip (φ). (cid:3) R
d1
10 M. HAIRERANDJ. MATTINGLY
2.2. A more pathwise perspective. In [20, 37, 38], the authors advocated
a pathwise point of view which explicitly constructed coupled versions of
the process starting from two different initial conditions in such a way that
the two coupled processes converged together exponentially fast. This point
of view is very appealing as it conveys a lot of intuition; however, writing
down the details can become a bit byzantine. Hence the authors prefer the
arguments given in the preceding section for their succinctness and ease of
verification. Nonetheless,thecalculations ofthepresentsection providedthe
intuition whichguidedtheprevioussection; andhence,wefinditinstructive
topresentthem.As noneoftherestof thepaperusesany ofthecalculations
fromthissection,wedonotprovideallofthedetails.Ourgoalistoshowhow
the estimates from the previous section can be used to construct an explicit
coupling in which the expectation of the distance between the trajectories
starting from x and y converges to zero exponentially in time.
0 0
Fix a t max(T ,T ), where T and T are the constants in Assumptions
1 2 1 2
≥
2and3.Fix δ 0 aswe didin theproofof Theorem2.5.Now for k=0,1,...
≥
define the following sequence of stopping times:
r =inf m s :m N, x y (1 α )δ ,
k k−1 mt mt 1
{ ≥ ∈ k − k≤ − }
s =inf m r :m N, x y >δ ,
k k mt mt
{ ≥ ∈ k − k }
where s =0 by definition.
−1
If n [r ,s ), let the distribution of (x ,y ) be given by a cou-
k k t(n+1) t(n+1)
pling w∈hich minimizes the Ed(x ,y ). Hence choosing δ = (1
t(n+1) t(n+1)
−
α )/(2C) and α ((1+α )/2,1) as in the paragraph below (8), Monge–
1 1
∈
Kantorovich duality gives that
(10) E(d(x ,y )(x ,y )) αδ−1 x y
t(n+1) t(n+1) tn tn tn tn
| ≤ k − k
provided n [r ,s ). Given a random variable X and events A and B,
k k
for notation∈al convenience, we define E(X;A) = E(X1 ) and P(A;B) =
A
E(1 1 ), where 1 is the indicator function on the event A. Observe that
A B A
if x y (1 α )δ, then
0 0 1
k − k≤ −
(11) E( x y ;s >n+1(x ,y )) α x y 1 ,
k t(n+1)− t(n+1)k 1 | tn tn ≤ k tn− tnk s1>n
which implies that
E( x y ;s >n+1) αn+1 x y .
t(n+1) t(n+1) 1 0 0
k − k ≤ k − k
From this we see that as long as x and y stay in a δ ball of each other, they
will converge toward each other exponentially in expectation.
Observe furthermore that
δP(s =n)=δP( x y >δ;s >n 1)
1 tn tn 1
k − k −
δE(d(x ,y );s >n 1)
tn tn 1
≤ −
=δE(E(d(x ,y ) (x ,y ));s >n 1)
tn tn t(n−1) t(n−1) 1
| −
αE( x y ;s >n 1) αn x y ,
t(n−1) t(n−1) 1 0 0
≤ k − k − ≤ k − k