A nonparametric copula density estimator incorporating information on bivariate marginals

Yu-Hsiang Cheng
e-mail: [email protected]
and
Tzee-Ming Huang∗
e-mail: [email protected]

arXiv:1602.00109v1 [stat.ME] 30 Jan 2016

1. Introduction
Consider the problem of estimating a copula density when the bivariate marginals are known. Let $c$ be a copula density. Let $W_h$ be the space of tensor product linear B-splines on $[0,1]^d$ with equally spaced knots, where $h=(h_1,\ldots,h_d)$ and $h_i$ is the distance between two adjacent knots for the $i$-th dimension. Let $B_1,\ldots,B_k$ denote the tensor product B-spline basis functions for $W_h$. Then we approximate $c$ by
\[
\alpha_1 B_1+\cdots+\alpha_k B_k,
\]
where $\alpha_1,\ldots,\alpha_k$ are coefficients. First, we state some notation and definitions as follows.
• $W = L^2([0,1]^d)$.
• $S = \{(f_1,\ldots,f_d) : f_1,\ldots,f_d \in L^2([0,1]) \text{ and } \int_0^1 f_1(x)\,dx = \cdots = \int_0^1 f_d(x)\,dx\}$.
• $A : W \to S$ is a linear operator such that for $f$ in $W$, $Af = (f_1,\ldots,f_d)$, where
\[
f_i(x_i) = \int_0^1\cdots\int_0^1 f(x_1,\ldots,x_d)\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_d.
\]
• $H_{ij} : L^2([0,1]^d) \to L^2([0,1]^2)$ is the linear mapping such that for $f \in W$ and for $1 \le i,j \le d$, $(H_{ij}f)(x_i,x_j)$ is given by
\[
\int_0^1\cdots\int_0^1 f(x_1,\ldots,x_d)\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_{j-1}\,dx_{j+1}\cdots dx_d.
\]
• $M = \left(\int_{[0,1]^d} c(u)B_1(u)\,du,\ \ldots,\ \int_{[0,1]^d} c(u)B_k(u)\,du\right)^T$.
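As a rough illustration of the objects just defined, the following Python sketch evaluates linear B-splines on equally spaced knots and applies the marginalization operator $A$ to a function in $W_h$ through its coefficients, for $d=2$. The function names (`hat_basis`, `basis_integrals`, `marginal_coefficients`), the use of NumPy, and the choice of $m = 1/h = 4$ intervals per dimension are assumptions made for this example and do not come from the original text.

```python
import numpy as np

def hat_basis(u, m):
    """Evaluate the m+1 linear B-splines (hat functions) with knots j/m at points u."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    knots = np.arange(m + 1) / m
    h = 1.0 / m
    # Row i holds (phi_0(u_i), ..., phi_m(u_i)).
    return np.clip(1.0 - np.abs(u[:, None] - knots[None, :]) / h, 0.0, None)

def basis_integrals(m):
    """Integral of each hat function over [0, 1]: h inside, h/2 at the two ends."""
    w = np.full(m + 1, 1.0 / m)
    w[[0, -1]] = 0.5 / m
    return w

def marginal_coefficients(alpha, m):
    """Hat-basis coefficients of the two marginals of
    sum_{a,b} alpha[a, b] * phi_a(x1) * phi_b(x2)."""
    w = basis_integrals(m)
    return alpha @ w, alpha.T @ w   # integrate out x2, then integrate out x1

m = 4
alpha = np.ones((m + 1, m + 1))     # c(u) = 1, the independence copula density
f1, f2 = marginal_coefficients(alpha, m)
partition = hat_basis(np.linspace(0, 1, 11), m).sum(axis=1)
print(f1, f2, partition)
```

Because each hat function equals 1 at its own knot and 0 at the other knots, the returned vectors are the values of the marginals at the knots. Uniform marginals correspond to vectors of ones, which is a linear condition on the coefficients.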
∗Corresponding author.
imsart-generic ver. 2009/08/13 file: tr01302016.tex date: February 2, 2016
Cheng and Huang/Nonparametric copula density estimator 2
2. Methodology and main results
Suppose that we observe data $X_1,\ldots,X_n$ with copula density $c$, where $X_i=(X_{i,1},\ldots,X_{i,d})$. For $i=1,\ldots,k$, let
\[
\hat M_i=\frac{1}{n}\sum_{j=1}^{n}B_i\big(\hat F_1(X_{j,1}),\ldots,\hat F_d(X_{j,d})\big)
\]
be a moment estimator for $M_i$, where $\hat F_j(x)=\frac{1}{n}\sum_{i=1}^{n}I(X_{i,j}\le x)$ is the empirical CDF of data $X_{1,j},\ldots,X_{n,j}$. Let
\[
P=\begin{pmatrix}
\int_{[0,1]^d}B_1(u)B_1(u)\,du & \cdots & \int_{[0,1]^d}B_1(u)B_k(u)\,du\\
\vdots & \ddots & \vdots\\
\int_{[0,1]^d}B_k(u)B_1(u)\,du & \cdots & \int_{[0,1]^d}B_k(u)B_k(u)\,du
\end{pmatrix}
\]
and
\[
\mathrm{Pen}(c,\alpha)=\sum_{1\le i,j\le d}\int_0^1\!\!\int_0^1\Big|(H_{ij}c)(u_i,u_j)-\Big(H_{ij}\sum_{t=1}^{k}\alpha_tB_t\Big)(u_i,u_j)\Big|^2\,du_i\,du_j,
\]
where $\alpha=(\alpha_1,\ldots,\alpha_k)^T$ is the vector of B-spline coefficients. We estimate the copula density $c$ using $\hat c=\hat\alpha_1B_1+\cdots+\hat\alpha_kB_k$, where $\hat\alpha=(\hat\alpha_1,\ldots,\hat\alpha_k)^T$ is the minimizer of
\[
\beta(P\alpha-\hat M)^T(P\alpha-\hat M)+\lambda\,\mathrm{Pen}(c,\alpha)
\]
under the constraints that $\alpha^TB$ is nonnegative and the marginals of $\alpha^TB$ are uniform densities on $[0,1]$, where $B=(B_1,\ldots,B_k)^T$ and $\beta=1/\big(\prod_{i=1}^{d}h_i\big)$. Now, we show that $\hat c$ is consistent for $c$ under some mild conditions. Theorem 1 gives the approximation error of B-splines under a linear constraint and a nonnegativity constraint.
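To make the ingredients of the criterion concrete, here is a minimal Python sketch for $d=2$ that computes the moment vector $\hat M$ from rank-based empirical CDF values and the Gram matrix $P$ as a Kronecker product of one-dimensional Gram matrices. It then solves the simplified problem with $\lambda=0$ and no constraints, so it is not the estimator $\hat c$ itself. The helper names, the simulated toy data, and the choice $m=1/h=5$ are assumptions made for illustration.

```python
import numpy as np

def hat_basis(u, m):
    """Rows hold the m+1 hat functions with knots j/m evaluated at each point of u."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    knots = np.arange(m + 1) / m
    return np.clip(1.0 - np.abs(u[:, None] - knots[None, :]) * m, 0.0, None)

def gram_1d(m):
    """Gram matrix of the 1-D hat functions: integrals of phi_a * phi_b over [0, 1]."""
    h = 1.0 / m
    G = np.zeros((m + 1, m + 1))
    idx = np.arange(m + 1)
    G[idx, idx] = 2 * h / 3            # interior diagonal
    G[0, 0] = G[m, m] = h / 3          # boundary hat functions are half hats
    G[idx[:-1], idx[1:]] = h / 6       # overlap of neighbouring hats
    G[idx[1:], idx[:-1]] = h / 6
    return G

def pseudo_observations(X):
    """Empirical CDF of each column evaluated at its own data: rank / n."""
    n = X.shape[0]
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1
    return ranks / n

def moment_estimator(X, m):
    """M_hat for d = 2, with basis index (a, b) flattened in row-major order."""
    U = pseudo_observations(X)
    B1, B2 = hat_basis(U[:, 0], m), hat_basis(U[:, 1], m)
    return (B1[:, :, None] * B2[:, None, :]).mean(axis=0).ravel()

rng = np.random.default_rng(0)
z = rng.standard_normal((500, 2))
X = np.column_stack([z[:, 0], 0.6 * z[:, 0] + 0.8 * z[:, 1]])  # toy dependent sample
m = 5
M_hat = moment_estimator(X, m)
P = np.kron(gram_1d(m), gram_1d(m))
alpha_unc = np.linalg.solve(P, M_hat)      # unconstrained minimizer when lambda = 0
total_mass = P.sum(axis=1) @ alpha_unc     # integral of alpha_unc^T B over [0,1]^2
print(M_hat.sum(), total_mass)
```

In this simplified setting the minimizer solves $P\alpha=\hat M$. Its total mass equals the sum of the entries of $\hat M$, which is 1 because the basis functions sum to one; nonnegativity and uniform marginals are not enforced here.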
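Theorem 1. Suppose that $\eta\in S$ and $\{f\in W_h : Af=\eta\}\neq\emptyset$. Let $V$ denote the set $\{f\in W : f\ge 0\}$ and $V^{(0)}$ be the interior of $V$. Suppose that $g\in\{f\in V^{(0)} : Af=\eta\}$ and $\varepsilon$ is a positive number such that $B(g,\varepsilon)\overset{\mathrm{def}}{=}\{f\in W : \|f-g\|<\varepsilon\}\subset V^{(0)}$ and
\[
2d\,\|\bar g_w-g\|<\frac{\varepsilon}{2}
\]
for some $\bar g_w\in W_h$. Then for $f\in V\cap\{f\in W : Af=\eta\}$ and $\bar f_w\in W_h$, there exists $f_w\in V\cap\{f\in W_h : Af=\eta\}$ such that
\[
\|f_w-f\|\le 2d\Big(1+\frac{2}{\varepsilon}\big(\|f\|+\|g\|+\varepsilon\big)\Big)\|\bar f_w-f\|.
\]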
Using Theorem 1, we can establish the consistency of $\hat c$, and the result is given in Theorem 2.
Theorem 2. Let $k=\prod_{i=1}^{d}\big((1/h_i)+1\big)$ denote the number of tensor basis functions. Suppose that $\lim_{n\to\infty}\max(h_1,\ldots,h_d)=0$ and $\lim_{n\to\infty}(1/h_{\min})^{2+d}\,k/n=0$, where $h_{\min}=\min(h_1,\ldots,h_d)$. Then $\|\hat c-c\|\to 0$ in probability.
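For intuition about the rate condition in Theorem 2, the following Python sketch evaluates $k$ and $(1/h_{\min})^{2+d}k/n$ along a hypothetical sequence $h_i=n^{-a}$ with $d=2$ and $a=1/8$. The function names and the choice of sequence are assumptions for this example, and the requirement that $1/h_i$ be an integer is ignored.

```python
import math

def num_basis(h):
    """k = prod_i (1/h_i + 1) for knot spacings h = (h_1, ..., h_d)."""
    return math.prod(1.0 / hi + 1.0 for hi in h)

def rate_term(n, h):
    """The quantity (1/h_min)^(2+d) * k / n that Theorem 2 requires to vanish."""
    d = len(h)
    return (1.0 / min(h)) ** (2 + d) * num_basis(h) / n

d = 2
exponent = 1.0 / 8.0                     # hypothetical choice h_i = n^(-1/8)
for n in (10**3, 10**5, 10**7, 10**9):
    h = (n ** -exponent,) * d
    print(n, round(num_basis(h), 1), rate_term(n, h))
```

With $h_i=n^{-a}$ the term behaves like $n^{(2+2d)a-1}$ for large $n$, so in this simplified setting any $0<a<1/(2+2d)$ satisfies both conditions of the theorem.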
3. Proofs
We will provide the proofs of Theorems 1 and 2 in this section.
3.1. Proof of Theorem 1
The proof of Theorem 1 is based on Lemma 1, which is stated and proved below.
Lemma 1. Suppose that $\eta\in S$ and $\{f\in W_h : Af=\eta\}\neq\emptyset$. Then for $f\in\{f\in W : Af=\eta\}$ and $\bar f_w\in W_h$, there exists $f_w\in\{f\in W_h : Af=\eta\}$ such that
\[
\|f_w-f\|\le 2d\,\|\bar f_w-f\|,
\]
where $\|\cdot\|$ denotes the $L^2$ norm.
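Proof. First, we will prove Lemma 1 when $\eta=(0,\ldots,0)$. For any $\bar f_w\in W_h$, let $(f_1,\ldots,f_d)=A\bar f_w$ and $\mu=\int_0^1 f_1(x)\,dx$. Let $g(x_1,\ldots,x_d)=\sum_{i=1}^{d}f_i(x_i)$ and $f^*=\bar f_w-g$, then we have
\[
Af^*=A\bar f_w-Ag=-\mu(d-1)(1,\ldots,1).
\]
Let $e_1$ denote the constant function 1 on $[0,1]^d$, then $e_1\in W_h$. Take $f_w=f^*+\mu(d-1)e_1$, then $Af_w=(0,\ldots,0)$ and
\begin{align*}
\|f_w-f\| &\le \|f_w-\bar f_w\|+\|\bar f_w-f\|\\
&\le \|\mu(d-1)e_1-g\|+\|\bar f_w-f\|\\
&\le |\mu|(d-1)\|e_1\|+d\,\|\bar f_w-f\|+\|\bar f_w-f\|\\
&\le 2d\,\|\bar f_w-f\|.
\end{align*}
Here we have used the fact that $\mu^2\le\|f_i\|^2\le\|\bar f_w-f\|^2$ for $i=1,\ldots,d$. The second inequality holds because $Af=(0,\ldots,0)$, so $f_i$ is the $i$-th marginal of $\bar f_w-f$, and integrating out coordinates does not increase the $L^2$ norm.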
Next, we will prove Lemma 1 for a general $\eta$. From the assumption that $\{f\in W_h : Af=\eta\}\neq\emptyset$, there exists a function $\tilde f$ in $W_h$ such that $A\tilde f=\eta$. Suppose that $f\in\{f\in W : Af=\eta\}$ and $\bar f_w\in W_h$. Then $A(f-\tilde f)=(0,\ldots,0)$. Apply Lemma 1 with $\eta$, $f$ and $\bar f_w$ replaced by $(0,\ldots,0)$, $f-\tilde f$ and $\bar f_w-\tilde f$ respectively, then there exists $g_w\in W_h$ such that $Ag_w=(0,\ldots,0)$ and
\[
\|g_w-(f-\tilde f)\|\le 2d\,\|(\bar f_w-\tilde f)-(f-\tilde f)\|.
\]
Take $f_w=g_w+\tilde f$, then $f_w\in W_h$, $Af_w=\eta$ and the above inequality becomes
\[
\|f_w-f\|\le 2d\,\|\bar f_w-f\|.
\]
The proof of Lemma 1 is complete.
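The construction in the first part of the proof can be tried numerically. The sketch below is a discrete analogue for $d=2$ and $\eta=(0,0)$: functions are replaced by arrays on an $m\times m$ grid, integrals by averages, and $W_h$ by the set of all grid functions. These simplifications and all names are assumptions for illustration; this is not a computation with the actual spline space.

```python
import numpy as np

def marginals(F):
    """Discrete analogue of A for d = 2: average out the other coordinate."""
    return F.mean(axis=1), F.mean(axis=0)

def l2_norm(F):
    """Grid analogue of the L2 norm on [0,1]^2 (root mean square)."""
    return np.sqrt(np.mean(F ** 2))

def project_to_zero_marginals(F_bar):
    """Follow the proof: f* = f_bar - g, then f_w = f* + mu * (d - 1) with d = 2."""
    f1, f2 = marginals(F_bar)
    mu = f1.mean()
    g = f1[:, None] + f2[None, :]
    return F_bar - g + mu * (2 - 1)

rng = np.random.default_rng(1)
m = 30
raw = rng.standard_normal((m, m))
# A target f with zero marginals, obtained by double-centring a random grid function.
f = raw - raw.mean(axis=1, keepdims=True) - raw.mean(axis=0, keepdims=True) + raw.mean()
f_bar = f + 0.3 * rng.standard_normal((m, m))    # an arbitrary approximation of f
f_w = project_to_zero_marginals(f_bar)

print(np.abs(marginals(f_w)[0]).max(), np.abs(marginals(f_w)[1]).max())
print(l2_norm(f_w - f), 2 * 2 * l2_norm(f_bar - f))   # left side vs. the 2d bound
```

In this grid setting the corrected function has zero marginals up to rounding error, and its distance to $f$ stays within the $2d\,\|\bar f_w-f\|$ bound of the lemma.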
Now, we will prove Theorem 1.
Proof. The proof for Theorem 1 is adapted from the proof for Lemma 2.4 in Wong [1]. Suppose that the assumptions in Theorem 1 hold, $f\in V\cap\{f\in W : Af=\eta\}$ and $\bar f_w\in W_h$. Then by Lemma 1, there exist $\tilde g_w,\tilde f_w\in\{f\in W_h : Af=\eta\}$ such that
\[
\|\tilde g_w-g\|\le 2d\,\|\bar g_w-g\|<\frac{\varepsilon}{2}
\]
and
\[
\|\tilde f_w-f\|\le 2d\,\|\bar f_w-f\|.
\]
Note that $\tilde g_w\in\{f\in W_h : Af=\eta\}\cap V^{(0)}$. Let $f_\tau=\tau\tilde f_w+(1-\tau)\tilde g_w$ and $\tau^*=\sup\{\tau\in[0,1] : f_\tau\in V\}$, then we will show that Theorem 1 holds with $f_w=f_{\tau^*}$.
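Take $\varepsilon_1=\varepsilon/2$, then $B(\tilde g_w,\varepsilon_1)\subset B(g,\varepsilon)\subset V^{(0)}$. For
\[
0\le\tau\le\frac{\varepsilon_1}{\varepsilon_1+\|\tilde f_w-f\|},
\]
we have $\|\tau(\tilde f_w-f)/(1-\tau)\|\le\varepsilon_1$, so
\[
\frac{\tau}{1-\tau}(\tilde f_w-f)+\tilde g_w\in B(\tilde g_w,\varepsilon_1)\subset V,
\]
which gives
\begin{align*}
f_\tau &= \tau\tilde f_w+(1-\tau)\tilde g_w\\
&= \tau f+(1-\tau)\Big[\frac{\tau}{1-\tau}(\tilde f_w-f)+\tilde g_w\Big]\in V.
\end{align*}
Therefore, we have
\[
1-\tau^*\le 1-\frac{\varepsilon_1}{\varepsilon_1+\|\tilde f_w-f\|}=\frac{\|\tilde f_w-f\|}{\varepsilon_1+\|\tilde f_w-f\|}\le\frac{\|\tilde f_w-f\|}{\varepsilon_1}
\]
and
\begin{align*}
\|f_w-f\| &= \|\tau^*\tilde f_w+(1-\tau^*)\tilde g_w-f\|\\
&\le \tau^*\|\tilde f_w-f\|+(1-\tau^*)\|\tilde g_w-f\|\\
&\le \|\tilde f_w-f\|\Big(1+\frac{2}{\varepsilon}\big(\|f\|+\|g\|+\varepsilon\big)\Big)\\
&\le 2d\Big(1+\frac{2}{\varepsilon}\big(\|f\|+\|g\|+\varepsilon\big)\Big)\|\bar f_w-f\|.
\end{align*}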
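The convex-combination step of this proof has a simple computational counterpart. The Python sketch below computes $\tau^*$ for functions represented by their values at grid nodes, assuming that $\tilde g_w$ is strictly positive at every node. The function name `tau_star` and the toy vectors are hypothetical and only illustrate the idea.

```python
import numpy as np

def tau_star(f_tilde, g_tilde):
    """Largest tau in [0, 1] with tau * f_tilde + (1 - tau) * g_tilde >= 0 pointwise.

    Assumes g_tilde > 0 everywhere (it plays the role of an interior point of V).
    """
    f_tilde = np.asarray(f_tilde, dtype=float)
    g_tilde = np.asarray(g_tilde, dtype=float)
    neg = f_tilde < 0
    if not neg.any():
        return 1.0                      # f_tilde is already nonnegative
    # At a node with f_tilde < 0 the combination reaches zero at tau = g / (g - f).
    return float(np.min(g_tilde[neg] / (g_tilde[neg] - f_tilde[neg])))

f_tilde = np.array([1.2, 0.4, -0.2, 1.6])   # violates nonnegativity at one node
g_tilde = np.array([1.0, 1.0, 1.0, 1.0])    # strictly positive
t = tau_star(f_tilde, g_tilde)
f_w = t * f_tilde + (1 - t) * g_tilde
print(t, f_w)
```

The linear constraint $Af=\eta$ is preserved automatically, because $f_\tau$ is a convex combination of two functions that both satisfy it.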
3.2. Proof of Theorem 2
Before we provide the proof of Theorem 2, Lemma 2 and its proof are stated as follows.
Lemma 2. For $t=1,\ldots,k$, let $\hat M_t=\frac{1}{n}\sum_{i=1}^{n}B_t\big(\hat F_1(X_{i,1}),\ldots,\hat F_d(X_{i,d})\big)$. Let $h_{\min}=\min(h_1,\ldots,h_d)$. Then
\[
E\big(|\hat M_t-M_t|^2\big)=O\Big(\frac{1}{n}\Big)\Big(1+\Big(\frac{d}{h_{\min}}\Big)^2\Big).
\]
Proof. For simplicity, we first define some notation. For $1\le\ell\le d$ and $1\le i\le n$, let $\zeta_{i,\ell}=\hat F_\ell(X_{i,\ell})-F_\ell(X_{i,\ell})$, where $F_\ell$ is the CDF for $X_{i,\ell}$, and
\begin{align*}
\varphi_1 &= B_t\big(F_1(X_{i,1})+\zeta_{i,1},\ldots,F_d(X_{i,d})+\zeta_{i,d}\big)\\
&\ \,\vdots\\
\varphi_\ell &= B_t\big(F_1(X_{i,1}),\ldots,F_{\ell-1}(X_{i,\ell-1}),F_\ell(X_{i,\ell})+\zeta_{i,\ell},\ldots,F_d(X_{i,d})+\zeta_{i,d}\big)\\
&\ \,\vdots\\
\varphi_d &= B_t\big(F_1(X_{i,1}),\ldots,F_{d-1}(X_{i,d-1}),F_d(X_{i,d})+\zeta_{i,d}\big)\\
\varphi_{d+1} &= B_t\big(F_1(X_{i,1}),\ldots,F_d(X_{i,d})\big).
\end{align*}
Since $B_t(u_1,\ldots,u_d)=\phi_{t,1}(u_1)\cdots\phi_{t,d}(u_d)$, where for $1\le j\le d$, $|\phi_{t,j}|\le 1$ and
\[
|\phi_{t,j}(x_1)-\phi_{t,j}(x_2)|\le\frac{|x_1-x_2|}{h_{\min}}\quad\text{for } x_1,x_2\in[0,1],
\]
we have
\[
\varphi_\ell-\varphi_{\ell+1}=\Big[\phi_{t,\ell}\big(F_\ell(X_{i,\ell})+\zeta_{i,\ell}\big)-\phi_{t,\ell}\big(F_\ell(X_{i,\ell})\big)\Big]\prod_{s=1}^{\ell-1}\phi_{t,s}\big(F_s(X_{i,s})\big)\prod_{s=\ell+1}^{d}\phi_{t,s}\big(F_s(X_{i,s})+\zeta_{i,s}\big),
\]
and
\[
|\varphi_\ell-\varphi_{\ell+1}|\le\big|\phi_{t,\ell}\big(F_\ell(X_{i,\ell})+\zeta_{i,\ell}\big)-\phi_{t,\ell}\big(F_\ell(X_{i,\ell})\big)\big|\le\frac{1}{h_{\min}}|\zeta_{i,\ell}|.
\]
Therefore,
\begin{align*}
&\big|B_t\big(\hat F_1(X_{i,1}),\ldots,\hat F_d(X_{i,d})\big)-B_t\big(F_1(X_{i,1}),\ldots,F_d(X_{i,d})\big)\big|\\
&\quad= |(\varphi_1-\varphi_2)+(\varphi_2-\varphi_3)+\cdots+(\varphi_d-\varphi_{d+1})|\\
&\quad\le \frac{1}{h_{\min}}\sum_{\ell=1}^{d}|\zeta_{i,\ell}|.
\end{align*}
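The telescoping bound can be checked numerically for a single tensor product basis function. In the sketch below, the knot spacings, the centre of the basis function, and the size of the perturbations are hypothetical values chosen for the example; the code only verifies that the ratio of the two sides never exceeds 1 on random inputs.

```python
import numpy as np

def hat(u, knot, h):
    """Linear B-spline centred at `knot` with knot spacing h."""
    return np.maximum(0.0, 1.0 - np.abs(u - knot) / h)

def tensor_basis(u, knots, h):
    """B_t(u) = prod_j phi_{t,j}(u_j) for one basis function with centres `knots`."""
    return np.prod([hat(u[j], knots[j], h[j]) for j in range(len(h))])

rng = np.random.default_rng(2)
h = np.array([0.25, 0.2, 0.1])            # hypothetical knot spacings, d = 3
knots = np.array([0.5, 0.4, 0.3])         # centre of one basis function
worst_ratio = 0.0
for _ in range(2000):
    u = rng.random(3)
    zeta = rng.uniform(-0.05, 0.05, size=3)
    lhs = abs(tensor_basis(u + zeta, knots, h) - tensor_basis(u, knots, h))
    rhs = np.abs(zeta).sum() / h.min()    # (1 / h_min) * sum of |zeta_l|
    worst_ratio = max(worst_ratio, lhs / rhs)
print(worst_ratio)
```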
In addition,
\begin{align*}
|M_t-\hat M_t| &= \Big|\int_{[0,1]^d}c(u)B_t(u)\,du-\frac{1}{n}\sum_{i=1}^{n}B_t\big(\hat F_1(X_{i,1}),\ldots,\hat F_d(X_{i,d})\big)\Big|\\
&\le \underbrace{\Big|\frac{1}{n}\sum_{i=1}^{n}\Big[B_t\big(\hat F_1(X_{i,1}),\ldots,\hat F_d(X_{i,d})\big)-B_t\big(F_1(X_{i,1}),\ldots,F_d(X_{i,d})\big)\Big]\Big|}_{I}\\
&\quad+\underbrace{\Big|\frac{1}{n}\sum_{i=1}^{n}B_t\big(F_1(X_{i,1}),\ldots,F_d(X_{i,d})\big)-E\,B_t\big(F_1(X_{i,1}),\ldots,F_d(X_{i,d})\big)\Big|}_{II}.
\end{align*}
Since $|I|\le\frac{1}{nh_{\min}}\sum_{i=1}^{n}\sum_{\ell=1}^{d}|\zeta_{i,\ell}|$, we have
\[
E\,I^2\le\Big(\frac{1}{nd}\Big)\Big(\frac{d}{h_{\min}}\Big)^2\sum_{i=1}^{n}\sum_{\ell=1}^{d}E\zeta_{i,\ell}^2=\Big(\frac{1}{nd}\Big)\Big(\frac{d}{h_{\min}}\Big)^2\sum_{j=1}^{n}\sum_{\ell=1}^{d}\frac{2n+2}{12n^2}=\Big(\frac{d}{h_{\min}}\Big)^2O\Big(\frac{1}{n}\Big).
\]
In addition,
\[
E\,II^2\le\frac{1}{n}=O\Big(\frac{1}{n}\Big),
\]
so
\[
E\big(|M_t-\hat M_t|^2\big)=O\Big(\frac{1}{n}\Big)\Big(1+\Big(\frac{d}{h_{\min}}\Big)^2\Big).
\]
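The value $(2n+2)/(12n^2)$ used for $E\zeta_{i,\ell}^2$ in this proof can be compared with a simulation. The following Monte Carlo sketch assumes continuous data, so that $F_\ell(X_{i,\ell})$ is uniform on $[0,1]$ and uniform samples can be used directly. The function name, sample size, and number of replications are arbitrary choices for illustration.

```python
import numpy as np

def zeta_second_moment_mc(n, reps, seed=0):
    """Monte Carlo estimate of E[(F_hat(X_i) - F(X_i))^2] for uniform X on [0, 1]."""
    rng = np.random.default_rng(seed)
    U = rng.random((reps, n))                       # F(X_i) = X_i for uniform data
    ranks = np.argsort(np.argsort(U, axis=1), axis=1) + 1
    zeta = ranks / n - U                            # F_hat(X_i) - F(X_i)
    return np.mean(zeta ** 2)

n = 20
estimate = zeta_second_moment_mc(n, reps=20000)
exact = (2 * n + 2) / (12 * n ** 2)
print(estimate, exact)
```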
Now, we will prove Theorem 2.
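Proof. Suppose that there exists an $\alpha^*$ such that $\|(\alpha^*)^TB-c\|\le\Delta_1$, where $(\alpha^*)^TB$ satisfies the constraints for a copula density. Then,
\begin{align*}
\|\hat\alpha^TB-c\|^2 &\le 2\|(\hat\alpha-\alpha^*)^TB\|^2+2\|(\alpha^*)^TB-c\|^2\\
&\le 2(\hat\alpha-\alpha^*)^TP(\hat\alpha-\alpha^*)+2\Delta_1^2\\
&\le 2(\hat\alpha-\alpha^*)^TP^TP^{-1}P(\hat\alpha-\alpha^*)+2\Delta_1^2\\
&\le \frac{2}{\min(\mathrm{eigen}(P))}(\hat\alpha-\alpha^*)^TP^TP(\hat\alpha-\alpha^*)+2\Delta_1^2\\
&\le \Big(\frac{2}{\beta\min(\mathrm{eigen}(P))}\Big)\Big(\beta(\hat\alpha-\alpha^*)^TP^TP(\hat\alpha-\alpha^*)\Big)+2\Delta_1^2. \tag{1}
\end{align*}
Here $\mathrm{eigen}(A)$ denotes the eigenvalues of a matrix $A$. Let
\begin{align*}
I_0 &= \beta(P\hat\alpha-\hat M)^T(P\hat\alpha-\hat M)+\lambda\,\mathrm{Pen}(c,\hat\alpha)\\
&\quad-\big[\beta(P\alpha^*-\hat M)^T(P\alpha^*-\hat M)+\lambda\,\mathrm{Pen}(c,\alpha^*)\big]\\
&\quad-\big[\beta(P\hat\alpha-M)^T(P\hat\alpha-M)+\lambda\,\mathrm{Pen}(c,\hat\alpha)\big]\\
&\quad+\beta(P\alpha^*-M)^T(P\alpha^*-M)+\lambda\,\mathrm{Pen}(c,\alpha^*).
\end{align*}
Since
\[
\beta(P\hat\alpha-\hat M)^T(P\hat\alpha-\hat M)+\lambda\,\mathrm{Pen}(c,\hat\alpha)\le\beta(P\alpha^*-\hat M)^T(P\alpha^*-\hat M)+\lambda\,\mathrm{Pen}(c,\alpha^*),
\]
we have
\[
I_0+\beta(P\hat\alpha-M)^T(P\hat\alpha-M)+\lambda\,\mathrm{Pen}(c,\hat\alpha)\le\beta(P\alpha^*-M)^T(P\alpha^*-M)+\lambda\,\mathrm{Pen}(c,\alpha^*).
\]
Therefore,
\begin{align*}
\beta(\hat\alpha-\alpha^*)^TP^TP(\hat\alpha-\alpha^*) &\le 2\beta(P\hat\alpha-M)^T(P\hat\alpha-M)+2\beta(P\alpha^*-M)^T(P\alpha^*-M)\\
&\le 4\beta(P\alpha^*-M)^T(P\alpha^*-M)+2\lambda\,\mathrm{Pen}(c,\alpha^*)-2I_0. \tag{2}
\end{align*}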
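Note that
\[
I_0=2\beta(\hat M-M)^TP\alpha^*-2\beta(\hat M-M)^TP\hat\alpha,
\]
so
\begin{align*}
-2I_0 &\le 4\big|\beta(\hat M-M)^TP(\hat\alpha-\alpha^*)\big|\\
&\le 4\sqrt{\beta(\hat M-M)^T(\hat M-M)}\,\sqrt{\beta\big(P(\hat\alpha-\alpha^*)\big)^T\big(P(\hat\alpha-\alpha^*)\big)}. \tag{3}
\end{align*}
Let
\[
\varepsilon_1=4\beta(P\alpha^*-M)^T(P\alpha^*-M)+2\lambda\,\mathrm{Pen}(c,\alpha^*),
\]
\[
\varepsilon_2=\sqrt{\beta(\hat M-M)^T(\hat M-M)},
\]
and
\[
U=\sqrt{\beta\big(P(\hat\alpha-\alpha^*)\big)^T\big(P(\hat\alpha-\alpha^*)\big)},
\]
then it follows from (2) and (3) that
\[
U^2\le\varepsilon_1+4\varepsilon_2U,
\]
so
\[
|U|\le 2\varepsilon_2+\sqrt{\varepsilon_1+4\varepsilon_2^2}. \tag{4}
\]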
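The passage from the quadratic inequality $U^2\le\varepsilon_1+4\varepsilon_2U$ to a bound on $U$ is a completing-the-square argument. A short derivation, written out here for convenience and using only the quantities $U\ge 0$, $\varepsilon_1\ge 0$ and $\varepsilon_2\ge 0$ defined in the proof:

```latex
\begin{align*}
U^2 - 4\varepsilon_2 U \le \varepsilon_1
  &\iff (U - 2\varepsilon_2)^2 \le \varepsilon_1 + 4\varepsilon_2^2 \\
  &\implies U - 2\varepsilon_2 \le \sqrt{\varepsilon_1 + 4\varepsilon_2^2} \\
  &\implies U \le 2\varepsilon_2 + \sqrt{\varepsilon_1 + 4\varepsilon_2^2}.
\end{align*}
```

In particular, $U$ tends to 0 whenever both $\varepsilon_1$ and $\varepsilon_2$ do.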
To control $\varepsilon_1$, let $c_i=\int_{[0,1]^d}B_i(u)\,du$ for $1\le i\le k$, then
\begin{align*}
(P\alpha^*-M)^T(P\alpha^*-M) &\le \max_{1\le i\le k}c_i\int_{[0,1]^d}\sum_{1\le i\le k}B_i(u)\big((\alpha^*)^TB(u)-c(u)\big)^2\,du\\
&= \Delta_1^2\,O(1)/\beta,
\end{align*}
which, together with the fact that $\mathrm{Pen}(c,\alpha^*)=(d(d-1)/2)\Delta_1^2$, implies that $\varepsilon_1=\Delta_1^2\,O(1)\to 0$ as $n\to\infty$.
To control $\varepsilon_2$, note that from Lemma 2, we have
\begin{align*}
E\varepsilon_2^2 &\le O\Big(\frac{k}{n}\Big)\Big(1+\Big(\frac{d}{h_{\min}}\Big)^2\Big)\Big(\frac{1}{h_{\min}}\Big)^d\\
&= O\Big(\frac{k}{n}\Big)\Big(\frac{1}{h_{\min}}\Big)^{2+d},
\end{align*}
so $\varepsilon_2$ converges to 0 in probability.
From the above discussion for $\varepsilon_1$ and $\varepsilon_2$, it follows from (4) that $U$ converges to 0 in probability. From (1),
\[
\|\hat\alpha^TB-c\|^2\le\Big(\frac{2}{\beta\min(\mathrm{eigen}(P))}\Big)U^2+2\Delta_1^2=O(1)U^2+2\Delta_1^2,
\]
so $\|\hat c-c\|=\|\hat\alpha^TB-c\|$ converges to 0 in probability.
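The last step relies on $\beta\min(\mathrm{eigen}(P))$ staying bounded away from zero. The Python sketch below checks this numerically under the simplifying assumption of equal knot spacing $h=1/m$ in every dimension, in which case $P$ is the $d$-fold Kronecker product of the one-dimensional Gram matrix of hat functions. The function names and the values of $m$ are assumptions made for this example.

```python
import numpy as np

def gram_1d(m):
    """Gram matrix of the m+1 hat functions with knots j/m on [0, 1]."""
    h = 1.0 / m
    main = np.full(m + 1, 2 * h / 3)
    main[[0, -1]] = h / 3                 # boundary hat functions are half hats
    off = np.full(m, h / 6)               # overlap of neighbouring hats
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

def scaled_min_eig(m, d):
    """beta * min eigenvalue of P for spacing h = 1/m in each of d dimensions."""
    eig_1d = np.linalg.eigvalsh(gram_1d(m)).min()
    # Eigenvalues of a Kronecker product are products of the factors' eigenvalues,
    # so the smallest one is eig_1d ** d; beta = 1 / h^d = m ** d.
    return (m ** d) * eig_1d ** d

values = {m: scaled_min_eig(m, d=2) for m in (4, 8, 16, 32)}
print(values)
```

By Gershgorin's circle theorem every eigenvalue of the one-dimensional Gram matrix is at least $h/6$, so in this equal-spacing setting $\beta\min(\mathrm{eigen}(P))\ge 6^{-d}$ for every $m$.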
References
[1] W. H. Wong. On constrained multivariate splines and their approximations. Numerische Mathematik, 43:141–152, 1984.