Adaptive global thresholding on the sphere
Claudio Durastanti$^{a,1}$
$^{a}$Fakultät für Mathematik, Ruhr Universität, Bochum
Abstract
This work is concerned with the study of the adaptivity properties of
nonparametric regression estimators over the $d$-dimensional sphere within the
global thresholding framework. The estimators are constructed by means of a form
of spherical wavelets, the so-called needlets, which enjoy strong concentration
properties in both harmonic and real domains. The author establishes the
convergence rates of the $L^p$-risks of these estimators, focussing on their
minimax properties and proving their optimality over a scale of nonparametric
regularity function spaces, namely, the Besov spaces.
Keywords: Global thresholding, needlets, spherical data, nonparametric
regression, U-statistics, Besov spaces, adaptivity.
2010 MSC: 62G08, 62G20, 65T60
1. Introduction
The purpose of this paper is to establish adaptivity for the $L^p$-risk of
regression function estimators in the nonparametric setting over the
$d$-dimensional sphere $\mathbb{S}^d$. The optimality of the $L^p$-risk is
established by means of global thresholding techniques and spherical wavelets
known as needlets.
Let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be independent pairs of random variables such
that, for each $i \in \{1,\ldots,n\}$, $X_i \in \mathbb{S}^d$ and
$Y_i \in \mathbb{R}$. The random variables $X_1,\ldots,X_n$ are assumed to be
mutually independent and uniformly distributed locations on the sphere. It is
further assumed that, for each $i \in \{1,\ldots,n\}$,
\[ Y_i = f(X_i) + \varepsilon_i, \tag{1} \]
where $f : \mathbb{S}^d \to \mathbb{R}$ is an unknown bounded function, i.e.,
there exists $M > 0$ such that
\[ \sup_{x \in \mathbb{S}^d} |f(x)| \le M < \infty. \tag{2} \]
Moreover, the random variables $\varepsilon_1,\ldots,\varepsilon_n$ in Eq. (1)
are assumed to be mutually independent and identically distributed with zero
mean. Roughly speaking, they can be viewed as the observational errors and, in
what follows, they will be assumed to be sub-Gaussian.
Email address: [email protected] (Claudio Durastanti)
$^{1}$The author is supported by Deutsche Forschungsgemeinschaft (DFG) - GRK 2131,
“High-dimensional Phenomena in Probability – Fluctuations and Discontinuity”.
Preprint submitted to Journal of Multivariate Analysis, July 27, 2016
In this paper, we study the properties of nonlinear global hard thresholding
estimators, in order to establish the optimal rates of convergence of
$L^p$-risks for functions belonging to the so-called Besov spaces.
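The regression model in Eq. (1) can be simulated along the following lines. This is only a minimal sketch: the function name, the Gaussian choice of errors (one particular zero-mean sub-Gaussian law), the noise level and the test function are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_sphere_regression(n, d, f, noise_sd=0.1, seed=0):
    """Draw (X_i, Y_i), i = 1..n, with X_i uniform on S^d and Y_i = f(X_i) + eps_i."""
    rng = np.random.default_rng(seed)
    # Normalizing standard Gaussian vectors in R^{d+1} yields uniform points on S^d.
    g = rng.standard_normal((n, d + 1))
    X = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Assumption: Gaussian errors, one instance of zero-mean sub-Gaussian noise.
    eps = noise_sd * rng.standard_normal(n)
    Y = f(X) + eps
    return X, Y

# Hypothetical bounded regression function on S^2 (|f| <= 1, so M = 1 works in Eq. (2)).
X, Y = simulate_sphere_regression(500, 2, lambda x: x[:, 2] ** 2)
```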
1.1. An overview of the literature
In recent years, the issue of minimax estimation in nonparametric settings
has received considerable attention in the statistical inference literature. The
seminal contribution in this area is due to Donoho et al. [7]. In this paper, the
authors provide nonlinear wavelet estimators for density functions on R, lying
over a wide nonparametric regularity function class, which attain optimal rates
of convergence up to a logarithmic factor. Following this work, the interaction
between wavelet systems and nonparametric function estimation has led to a
considerable amount of developments, mainly in the standard Euclidean frame-
work; see, e.g., [3, 5, 24, 26, 27, 28, 30] and the textbooks [22, 44] for further
details and discussions.
More recently, thresholding methods have been applied to broader settings.
In particular, nonparametric estimation results have been achieved on
$\mathbb{S}^d$ by using a second-generation wavelet system, namely, the
spherical needlets. Needlets
were introduced by Narcowich et al. [39, 40], while their stochastic properties
dealing with various applications to spherical random fields were examined in
[2, 6, 34, 35, 36]. Needlet-like constructions were also established over more
general manifolds by Geller and Mayeli [18, 19, 20, 21], Kerkyacharian et al.
[25] and Pesenson [41] among others, and over spin fiber bundles by Geller and
Marinucci [16, 17].
In the nonparametric setting, needlets have found various applications in
directional statistics. Baldi et al. [1] established minimax rates of convergence
for the $L^p$-risk of nonlinear needlet density estimators within the hard local
thresholding paradigm, while analogous results concerning regression function
estimation were established by Monnier [38]. The block thresholding framework
was investigated in Durastanti [9]. Furthermore, the adaptivity of nonparametric
regression estimators of spin functions was studied in Durastanti et al. [10].
In this case, the regression function takes as its values algebraic curves lying
on the tangent plane at each point of $\mathbb{S}^2$, and the wavelets used are
the so-called spin (pure and mixed) needlets; see Geller and Marinucci [16, 17].
The asymptotic properties of other estimators for spherical data, not con-
cerning the needlet framework, were investigated by Kim and Koo [31, 32, 33],
while needlet-like nearly-tight frames were used in Durastanti [8] to establish
the asymptotic properties of density function estimators on the circle. Finally,
in Gautier and Le Pennec [15], the adaptive estimation by needlet thresholding
was introduced in the nonparametric random coefficients binary choice model.
Regarding the applications of these methods in practical scenarios, see, e.g.,
[13, 14, 23], where they were fruitfully applied to some astrophysical problems,
concerning, for instance, high-energy cosmic rays and Gamma rays.
1.2. Main results
Consider the regression model given in Eq. (1) and let
$\{\psi_{j,k} : j \ge 0, k = 1,\ldots,K_j\}$ be the set of $d$-dimensional
spherical needlets. Roughly speaking, $j$ and $K_j$ denote the resolution level
and the cardinality of needlets at the resolution level $j$, respectively. The
regression function $f$ can be rewritten in terms of its needlet expansion.
Namely, for all $x \in \mathbb{S}^d$, one has
\[ f(x) = \sum_{j \ge 0} \sum_{k=1}^{K_j} \beta_{j,k} \psi_{j,k}(x), \]
where $\{\beta_{j,k} : j \ge 0, k = 1,\ldots,K_j\}$ is the set of needlet
coefficients.
j,k j
For each $j \ge 0$ and $k \in \{1,\ldots,K_j\}$, a natural unbiased estimator
for $\beta_{j,k}$ is given by the corresponding empirical needlet coefficient,
viz.
\[ \widehat{\beta}_{j,k} = \frac{1}{n} \sum_{i=1}^{n} Y_i \psi_{j,k}(X_i); \tag{3} \]
see, e.g., Baldi et al. [1] and Härdle et al. [22]. Therefore, the global
thresholding needlet estimator of $f$ is given, for each $x \in \mathbb{S}^d$, by
\[ \hat{f}_n(x) = \sum_{j=0}^{J_n} \tau_j \sum_{k=1}^{K_j} \widehat{\beta}_{j,k} \psi_{j,k}(x), \tag{4} \]
where $\tau_j$ is a nonlinear threshold function comparing the given
$j$-dependent statistic $\widehat{\Theta}_j(p)$, built on a subsample of $p < n$
observations, to a threshold based on the observational sample size. If
$\widehat{\Theta}_j(p)$ is above the threshold, the whole $j$-level is kept;
otherwise it is discarded.
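The empirical coefficients in Eq. (3) are plain sample averages, which the following minimal sketch makes explicit. It assumes the needlet values have already been evaluated at the sample locations and stored in a hypothetical array `Psi` with `Psi[i, k]` equal to $\psi_{j,k}(X_i)$; the function name is likewise illustrative.

```python
import numpy as np

def empirical_needlet_coefficients(Y, Psi):
    """Return beta_hat[k] = (1/n) * sum_i Y[i] * Psi[i, k] for one resolution level j.

    Y   : array of shape (n,), the observed responses.
    Psi : array of shape (n, K_j), Psi[i, k] = psi_{j,k}(X_i) (assumed precomputed).
    """
    Y = np.asarray(Y, dtype=float)
    Psi = np.asarray(Psi, dtype=float)
    return Psi.T @ Y / Y.shape[0]
```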
Loosely speaking, this procedure allows one to delete the coefficients
corresponding to a resolution level $j$ whose contribution to the reconstruction
of the regression function $f$ is not clearly distinguishable from the noise.
Following Kerkyacharian et al. [30], we consider the so-called hard thresholding
framework, defined as
\[ \tau_j = \tau_j(p) = \mathbf{1}\{\widehat{\Theta}_j(p) \ge B^{dj} n^{-p/2}\}, \]
where $p \in \mathbb{N}$ is even. Further details regarding the statistic
$\widehat{\Theta}_j(p)$ will be discussed in Section 3.4, where the choice of the
threshold $B^{dj} n^{-p/2}$ will also be motivated.
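The hard thresholding rule above amounts to one comparison per resolution level. A minimal sketch, assuming the statistics $\widehat{\Theta}_j(p)$ for $j = 0,\ldots,J_n$ are already available in a hypothetical array `theta_hat` (their construction is deferred to Section 3.4 and is not reproduced here):

```python
import numpy as np

def global_hard_threshold(theta_hat, B, d, n, p):
    """Return tau_j = 1{theta_hat[j] >= B^(d*j) * n^(-p/2)} for j = 0, ..., len(theta_hat) - 1."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    j = np.arange(theta_hat.shape[0])
    threshold = float(B) ** (d * j) * float(n) ** (-p / 2)
    # True keeps the whole level j; False discards every coefficient at that level.
    return theta_hat >= threshold
```

With the illustrative values `B=2, d=2, n=100, p=2`, the thresholds at levels 0 and 1 are 0.01 and 0.04, so `theta_hat = [0.1, 0.001]` keeps level 0 and drops level 1.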
For the rest of this section, we consider $\widehat{\Theta}_j(p)$ as an unbiased
statistic of $|\beta_{j,1}|^p + \cdots + |\beta_{j,K_j}|^p$. The so-called
truncation bandwidth $J_n$, on the other hand, is the highest resolution level
at which the empirical coefficients
$\widehat{\beta}_{j,1},\ldots,\widehat{\beta}_{j,K_j}$ are computed. The optimal
choice of the truncation level is $J_n = \ln_B (n^{1/d})$; for details, see
Section 3. This choice controls the error incurred by approximating $f$, which
is an infinite sum with respect to $j$, by a finite sum such as the estimator
$\hat{f}_n$.
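Since $\ln_B(n^{1/d}) = \ln n / (d \ln B)$ is generally not an integer, an implementation has to round it. The sketch below takes the integer part; this rounding convention and the function name are assumptions for illustration, not something stated in the text.

```python
import math

def truncation_level(n, d, B):
    """J_n = log_B(n^(1/d)) = ln(n) / (d * ln(B)), rounded down (rounding is an assumption)."""
    return int(math.floor(math.log(n) / (d * math.log(B))))
```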
Our objective is to estimate the global error measure for the regression
estimator $\hat{f}_n$. For this reason, we study the worst possible performance
of the $L^p$-risk over a so-called nonparametric regularity class
$\{\mathcal{F}_\alpha : \alpha \in A\}$ of function spaces, i.e.,
\[ R_n\big(\hat{f}_n; \mathcal{F}_\alpha\big) = \sup_{f \in \mathcal{F}_\alpha} \mathbb{E}\Big( \|\hat{f}_n - f\|^p_{L^p(\mathbb{S}^d)} \Big). \]
Recall that an estimator $\hat{f}_n$ is said to be adaptive for the $L^p$-risk
and for the scale of classes $\{\mathcal{F}_\alpha : \alpha \in A\}$ if, for
every $\alpha \in A$, there exists a constant $c_\alpha > 0$ such that
\[ \mathbb{E}\Big( \|\hat{f}_n - f\|^p_{L^p(\mathbb{S}^d)} \Big) \le c_\alpha R_n\big(\hat{f}_n; \mathcal{F}_\alpha\big); \]
see, e.g., [1, 22, 30].
For $r > 0$ and for $p \in [1,r]$, we will establish that the regression
estimator $\hat{f}_n$ is adaptive for the class of Besov spaces
$\mathcal{B}^s_{p,q}$, where $1 \le q \le \infty$ and $d/p \le s < r+1$. Finally,
let $R \in (0,\infty)$ be the radius of the Besov ball on which $f$ is defined.
The proper choice of $r$ will be motivated in Section 2.1. Our main result is
described by the following theorem.
Theorem 1.1. Given $r \in (1,\infty)$, let $p \in [1,r]$. Also, let $\hat{f}_n$
be given by Eq. (4), with $J_n = \ln_B n^{1/d}$. Then, for $1 \le q \le \infty$,
$d/p \le s < r+1$ and $0 < R < \infty$, there exists $C > 0$ such that
\[ \sup_{f \in \mathcal{B}^s_{r,q}(R)} \mathbb{E}\Big( \|\hat{f}_n - f\|^p_{L^p(\mathbb{S}^d)} \Big) \le C n^{-\frac{sp}{2s+d}}. \]
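To get a feeling for the rate in Theorem 1.1, consider the purely illustrative values $d = 2$, $s = 2$ and $p = r = 2$ (chosen only for this example; they satisfy $d/p \le s < r+1$). Then
\[ n^{-\frac{sp}{2s+d}} = n^{-\frac{4}{6}} = n^{-2/3}, \]
so that, up to the constant $C$, a sample of size $n = 10^6$ gives a bound of order $10^{-4}$ on the $L^2$-risk.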
The behavior of the $L^\infty$-risk function will be studied separately in
Section 3 and the analogous result is described in Theorem 3.2. Moreover, the
details concerning the choice of $r$ will be presented in Remark 3.1 and other
properties of $L^p$-risk functions, such as optimality, will be discussed in
Remark 3.3.
1.3. Comparison with other results
The bound given in Eq. (12) is consistent with the results of Kerkyacharian
et al. [30], where global thresholding techniques were introduced on R. As far
as nonparametric inference over spherical datasets is concerned, our results can
be viewed as an alternative proposal to the existing nonparametric regression
methods (see, e.g., [1, 9, 10, 38]), related to the local and block thresholding
procedures.
Recall that in the local thresholding paradigm, each empirical estimator
$\widehat{\beta}_{j,k}$ is compared to a threshold $\tau_{j,k}$ and it is,
therefore, kept or discarded if its absolute value is above or below
$\tau_{j,k}$, respectively, i.e., the threshold function is given by
$\mathbf{1}\{|\widehat{\beta}_{j,k}| \ge \tau_{j,k}\}$. Typically, the threshold
is chosen such that $\tau_{j,k} = \kappa(\ln n / n)$, where $\kappa$ depends
explicitly on two parameters, namely, the radius $R$ of the Besov ball on which
the function $f$ is defined and its supremum $M$; see, e.g., Baldi et al. [1].
An alternative and partially data-driven choice for $\kappa$ is proposed by
Monnier [38], i.e., here
\[ \kappa = \frac{\kappa_0}{n} \sum_{i=1}^{n} \psi_{j,k}(X_i)^2. \]
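For contrast with the level-wise rule of Section 1.2, local thresholding acts on each coefficient separately. The sketch below takes the threshold value `tau` as an input rather than fixing its dependence on $n$, and also evaluates the data-driven quantity $\kappa$ displayed above; the function and argument names are hypothetical.

```python
import numpy as np

def local_hard_threshold(beta_hat, tau):
    """Keep beta_hat[k] if |beta_hat[k]| >= tau, otherwise set it to zero."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    return np.where(np.abs(beta_hat) >= tau, beta_hat, 0.0)

def data_driven_kappa(psi_at_samples, kappa0):
    """kappa = (kappa0 / n) * sum_i psi_{j,k}(X_i)^2 for one fixed needlet psi_{j,k}."""
    psi_at_samples = np.asarray(psi_at_samples, dtype=float)
    return kappa0 / psi_at_samples.shape[0] * np.sum(psi_at_samples ** 2)
```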
Even if this stochastic approach is proved to outperform the deterministic one,
the threshold still depends on both $R$ and $M$, which control $\kappa_0$. Also
according to the results established on $\mathbb{R}$ (see Härdle et al. [22]),
local techniques entail nearly optimal rates for the $L^p$-risks over a wide
variety of regularity function spaces. In this case, the regression function $f$
belongs to $\mathcal{B}^s_{p,q}(R)$, where $s \ge d/r$, $p \in \{1,\infty\}$,
$q \in \{1,\infty\}$ and $0 < R < \infty$ (cf. [1, 10, 22]). However, these
adaptive rates of convergence are achieved at the expense of having an extra
logarithmic term and of requiring explicit knowledge of the radius of the Besov
balls on which $f$ is defined, in order to establish an optimal threshold.
As far as block thresholding is concerned, for any fixed resolution level this
procedure collects the coefficients
$\widehat{\beta}_{j,1},\ldots,\widehat{\beta}_{j,K_j}$ into $\ell = \ell(n)$
blocks, denoted $B_{j,1},\ldots,B_{j,\ell}$, of dimension depending on the sample
size. Each block is then compared to a threshold and is retained or discarded
accordingly. This method has an exact convergence rate (i.e., without the extra
logarithmic term), although it requires explicit knowledge of the Besov radius
$R$. Furthermore, the estimator is adaptive only over a narrower subset of the
scale of Besov spaces, the so-called regular zone; see Härdle et al. [22]. The
construction of blocks on $\mathbb{S}^d$ can also be a difficult procedure, as it
requires a precise knowledge of the pixelization of the sphere, namely, the
structure of the subregions into which the sphere is partitioned, in order to
build spherical wavelets.
On the other hand, the global techniques presented in this paper do not
require any knowledge regarding the radius of the Besov ball and have exact
optimal convergence rates even over the narrowest scale of regularity function
spaces.
1.4. Plan of the paper
This paper is organized as follows. Section 2 presents some preliminary
results, such as the construction of spherical needlet frames on the sphere,
Besov spaces and their properties. In Section 3, we describe the statistical
methods we apply within the global thresholding paradigm. This section also
includes an introduction to the properties of the sub-Gaussian random variables
and of the U-statistic $\widehat{\Theta}_j(p)$, which are key for establishing
the thresholding procedure. Section 4 provides some numerical evidence. Finally,
the proofs of all of our results are collected in Section 5.
2. Preliminaries
This section presents details concerning the construction of needlet frames,
the definition of spherical Besov spaces and their properties. In what follows,
the main bibliographical references are [1, 2, 7, 21, 22, 24, 37, 39, 40].
2.1. Harmonic analysis on Sd and spherical needlets
Consider the simplified notation $L^2(\mathbb{S}^d) = L^2(\mathbb{S}^d, dx)$,
where $dx$ is the uniform Lebesgue measure over $\mathbb{S}^d$. Also, let
$\mathcal{H}_\ell$ be the restriction to $\mathbb{S}^d$ of the harmonic
homogeneous polynomials of degree $\ell$; see, e.g., Stein and Weiss [43]. Thus,
the following decomposition holds:
\[ L^2(\mathbb{S}^d) = \bigoplus_{\ell=0}^{\infty} \mathcal{H}_\ell. \]
An orthonormal basis for $\mathcal{H}_\ell$ is provided by the set of spherical
harmonics $\{Y_{\ell,m} : m = 1,\ldots,g_{\ell,d}\}$, where the dimension
$g_{\ell,d}$ of $\mathcal{H}_\ell$ is given by
\[ g_{\ell,d} = \frac{\ell + \eta_d}{\eta_d} \binom{\ell + 2\eta_d - 1}{\ell}, \qquad \eta_d = \frac{d-1}{2}. \]
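The dimension formula can be checked against familiar cases. Since $2\eta_d = d-1$, it can be rewritten as $(2\ell + d - 1)\binom{\ell + d - 2}{\ell}/(d-1)$, which involves only integers. The following sketch (function name hypothetical; $d \ge 2$ is assumed so that $\eta_d > 0$) evaluates it:

```python
from math import comb

def harmonic_space_dimension(ell, d):
    """Dimension g_{ell,d} of the space of degree-ell spherical harmonics on S^d, d >= 2."""
    if d < 2:
        raise ValueError("this sketch assumes d >= 2, so that eta_d = (d - 1) / 2 > 0")
    # (ell + eta)/eta * C(ell + 2*eta - 1, ell) with eta = (d - 1)/2, in integer arithmetic.
    return (2 * ell + d - 1) * comb(ell + d - 2, ell) // (d - 1)
```

For $d = 2$ this returns the usual $2\ell + 1$, and for $d = 3$ it returns $(\ell+1)^2$.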
For any function $f \in L^2(\mathbb{S}^d)$, we define the Fourier coefficients as
\[ a_{\ell,m} := \int_{\mathbb{S}^d} \overline{Y}_{\ell,m}(x) f(x)\, dx, \]
such that the kernel operator denoting the orthogonal projection over
$\mathcal{H}_\ell$ is given, for all $x \in \mathbb{S}^d$, by
\[ P_{\ell,d} f(x) = \sum_{m=1}^{g_{\ell,d}} a_{\ell,m} Y_{\ell,m}(x). \]
Also, let the measure of the surface of $\mathbb{S}^d$ be given by
\[ \omega_d = 2\pi^{(d+1)/2} \Big/ \Gamma\Big(\frac{d+1}{2}\Big). \]
The kernel associated to the projector $P_{\ell,d}$ links spherical harmonics to
the Gegenbauer polynomial of parameter $\eta_d$ and order $\ell$, labelled by
$C_\ell^{(\eta_d)}$. Indeed, the following summation formula holds:
\[ P_{\ell,d}(x_1,x_2) = \sum_{m=1}^{g_{\ell,d}} Y_{\ell,m}(x_1) \overline{Y}_{\ell,m}(x_2) = \frac{\ell + \eta_d}{\eta_d\, \omega_d}\, C_\ell^{(\eta_d)}(\langle x_1, x_2 \rangle), \]
where $\langle \cdot,\cdot \rangle$ is the standard scalar product on
$\mathbb{R}^{d+1}$; see, e.g., Marinucci and Peccati [37].
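The summation formula gives a direct way to evaluate the projection kernel without computing any spherical harmonic. The sketch below uses the standard three-term recurrence for Gegenbauer polynomials; all function names are hypothetical and $d \ge 2$ is assumed.

```python
import math

def gegenbauer(ell, alpha, t):
    """C_ell^{(alpha)}(t) via n*C_n = 2t(n+alpha-1)*C_{n-1} - (n+2alpha-2)*C_{n-2}."""
    if ell == 0:
        return 1.0
    c_prev, c_curr = 1.0, 2.0 * alpha * t
    for n in range(2, ell + 1):
        c_next = (2.0 * t * (n + alpha - 1.0) * c_curr
                  - (n + 2.0 * alpha - 2.0) * c_prev) / n
        c_prev, c_curr = c_curr, c_next
    return c_curr

def sphere_area(d):
    """omega_d = 2 * pi^((d+1)/2) / Gamma((d+1)/2), the surface measure of S^d."""
    return 2.0 * math.pi ** ((d + 1) / 2) / math.gamma((d + 1) / 2)

def projection_kernel(ell, d, x1, x2):
    """P_{ell,d}(x1, x2) for unit vectors x1, x2 in R^{d+1} (d >= 2 assumed)."""
    eta = (d - 1) / 2
    t = sum(a * b for a, b in zip(x1, x2))  # scalar product <x1, x2>
    return (ell + eta) / (eta * sphere_area(d)) * gegenbauer(ell, eta, t)
```

On $\mathbb{S}^2$ the kernel reduces to $(2\ell+1)/(4\pi)$ times the Legendre polynomial of degree $\ell$, so on the diagonal it equals $(2\ell+1)/(4\pi)$.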
Following Narcowich et al. [40], $\mathcal{K}_\ell = \bigoplus_{i=0}^{\ell} \mathcal{H}_i$
is the linear space of homogeneous polynomials on $\mathbb{S}^d$ of degree
smaller than or equal to $\ell$; see also [1, 37, 39]. Thus, there exist a set of
cubature points $\mathcal{Q}_\ell \subset \mathbb{S}^d$ and a set of positive
cubature weights $\{\lambda_\xi\}$, indexed by $\xi \in \mathcal{Q}_\ell$, such
that, for any $f \in \mathcal{K}_\ell$,
\[ \int_{\mathbb{S}^d} f(x)\, dx = \sum_{\xi \in \mathcal{Q}_\ell} \lambda_\xi f(\xi). \]
In the following, the notation $a \approx b$ denotes that there exist
$c_1, c_2 > 0$ such that $c_1 b \le a \le c_2 b$. For a fixed resolution level
$j$ and a scale parameter $B$, let
$K_j = \mathrm{card}\big(\mathcal{Q}_{[2B^{j+1}]}\big)$. Therefore,
$\{\xi_{j,k} : k = 1,\ldots,K_j\}$ is the set of cubature points associated to
the resolution level $j$, while $\{\lambda_{j,k} : k = 1,\ldots,K_j\}$ contains
the corresponding cubature weights. These are typically chosen such that
\[ K_j \approx B^{dj} \quad \text{and} \quad \lambda_{j,k} \approx B^{-dj} \ \text{ for all } k \in \{1,\ldots,K_j\}. \]
Define the real-valued weight (or window) function $b$ on $(0,\infty)$ so that
(i) $b$ has compact support in $[B^{-1}, B]$;
(ii) the partition of unity property holds, namely,
$\sum_{j \ge 0} b^2(\ell/B^j) = 1$, for $\ell \ge B$;
(iii) $b \in C^\rho(0,\infty)$ for some $\rho \ge 1$.
Remark 2.1. Note that ρ can be either a positive integer or equal to ∞. In
the first case, the function b(·) can be built by means of a standard B-spline ap-
proach, using linear combinations of the so-called Bernstein polynomials, while
in the other case, it is constructed by means of integration of scaled exponen-
tial functions (see also Section 4). Further details can be found in the textbook
Marinucci and Peccati [37].
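For the case $\rho = \infty$, one common way of building a window by integrating a scaled exponential bump can be sketched as follows. This is an illustration under stated assumptions, not necessarily the exact recipe used in Section 4: the helper names, the grid size and the default $B = 2$ are all hypothetical choices. The bump $\phi_1(t) = \exp\{-1/(1-t^2)\}$ on $(-1,1)$ is integrated and normalized to obtain a smooth step $\phi_2$, which is rescaled into a function $\phi_3$ equal to 1 on $[0, 1/B]$ and 0 on $[1,\infty)$; finally $b^2(x) = \phi_3(x/B) - \phi_3(x)$.

```python
import numpy as np

def _phi2_table(num=4001):
    """Tabulate phi2(u) = int_{-1}^{u} phi1 / int_{-1}^{1} phi1 on a grid of [-1, 1]."""
    t = np.linspace(-1.0, 1.0, num)
    phi1 = np.zeros_like(t)
    inner = np.abs(t) < 1.0
    phi1[inner] = np.exp(-1.0 / (1.0 - t[inner] ** 2))
    # Cumulative trapezoidal rule, normalized so that phi2(1) = 1.
    steps = 0.5 * (phi1[1:] + phi1[:-1]) * np.diff(t)
    cum = np.concatenate(([0.0], np.cumsum(steps)))
    return t, cum / cum[-1]

_T, _PHI2 = _phi2_table()

def phi3(t, B):
    """Smooth step: 1 for t <= 1/B, 0 for t >= 1, smooth and decreasing in between."""
    t = np.asarray(t, dtype=float)
    u = 1.0 - 2.0 * B / (B - 1.0) * (t - 1.0 / B)
    # np.interp clamps outside [-1, 1], which gives the constant parts 1 and 0.
    return np.interp(u, _T, _PHI2)

def window_b(x, B=2.0):
    """Window with support in [1/B, B] and b^2(x) = phi3(x/B) - phi3(x)."""
    x = np.asarray(x, dtype=float)
    b2 = phi3(x / B, B) - phi3(x, B)
    return np.sqrt(np.clip(b2, 0.0, None))
```

Because $b^2$ is a difference of the same step function at two consecutive scales, the sum over $j$ telescopes, which is what yields property (ii).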
For any $j \ge 0$ and $k \in \{1,\ldots,K_j\}$, spherical needlets are defined as
\[ \psi_{j,k}(x) = \sqrt{\lambda_{j,k}} \sum_{\ell \ge 0} b\Big(\frac{\ell}{B^j}\Big) P_{\ell,d}(x, \xi_{j,k}). \]
Spherical needlets feature some important properties deriving from the structure
of the window function $b$. Since $b$ is compactly supported in the frequency
domain, it follows that $\psi_{j,k}$ is different from zero only on a finite set
of frequencies $\ell$, so that we can rewrite the spherical needlets as
\[ \psi_{j,k}(x) = \sqrt{\lambda_{j,k}} \sum_{\ell \in \Lambda_j} b\Big(\frac{\ell}{B^j}\Big) P_{\ell,d}(x, \xi_{j,k}), \]
where $\Lambda_j = \big\{u : u \in \big([B^{j-1}], [B^{j+1}]\big)\big\}$ and
$[u]$, $u \in \mathbb{R}$, denotes the integer part of $u$. From the partition of
unity property, the spherical needlets form a tight frame over $\mathbb{S}^d$
with unitary tightness constant. For $f \in L^2(\mathbb{S}^d)$,
\[ \|f\|^2_{L^2(\mathbb{S}^d)} = \sum_{j \ge 0} \sum_{k=1}^{K_j} |\beta_{j,k}|^2, \]
where
\[ \beta_{j,k} = \int_{\mathbb{S}^d} f(x) \psi_{j,k}(x)\, dx \tag{5} \]
are the so-called needlet coefficients. Therefore, we can define the following
reconstruction formula (holding in the $L^2$-sense): for all $x \in \mathbb{S}^d$,
\[ f(x) = \sum_{j \ge 0} \sum_{k=1}^{K_j} \beta_{j,k} \psi_{j,k}(x). \]
From the differentiability of $b$, we obtain the following quasi-exponential
localization property: for $x \in \mathbb{S}^d$ and any $\eta \in \mathbb{N}$
such that $\eta \le \rho$, there exists $c_\eta > 0$ such that
\[ |\psi_{j,k}(x)| \le \frac{c_\eta B^{jd/2}}{\{1 + B^{jd/2}\, d(x, \xi_{j,k})\}^{\eta}}, \tag{6} \]
where $d(\cdot,\cdot)$ denotes the geodesic distance over $\mathbb{S}^d$.
Roughly speaking, $|\psi_{j,k}(x)| \approx B^{jd/2}$ if $x$ belongs to the pixel
of area $B^{-dj}$ surrounding the cubature point $\xi_{j,k}$; otherwise, it is
almost negligible. The localization result yields a similar boundedness property
for the $L^p$-norm, which is crucial for our purposes. In particular, for any
$p \in [1,\infty)$ there exist two constants $c_p, C_p > 0$ such that
\[ c_p B^{jd(\frac{1}{2} - \frac{1}{p})} \le \|\psi_{j,k}\|_{L^p(\mathbb{S}^d)} \le C_p B^{jd(\frac{1}{2} - \frac{1}{p})}, \tag{7} \]
and there exist two constants $c_\infty, C_\infty > 0$ such that
\[ c_\infty B^{jd/2} \le \|\psi_{j,k}\|_{L^\infty(\mathbb{S}^d)} \le C_\infty B^{jd/2}. \]
According to Lemma 2 in Baldi et al. [1], the following two inequalities hold.
For every $0 < p \le \infty$,
\[ \Big\| \sum_{k=1}^{K_j} \beta_{j,k} \psi_{j,k} \Big\|_{L^p(\mathbb{S}^d)} \le c\, B^{jd(\frac{1}{2} - \frac{1}{p})} \|\beta_{j,k}\|_{\ell_p}, \tag{8} \]
and for every $1 \le p \le \infty$,
\[ \|\beta_{j,k}\|_{\ell_p} B^{jd(\frac{1}{2} - \frac{1}{p})} \le c\, \|f\|_{L^p(\mathbb{S}^d)}, \]
where $\ell_p$ denotes the space of $p$-summable sequences. The generalization
for the case $p = \infty$ is trivial.
The following lemma presents a result based on the localization property.
Lemma 2.1. For $x \in \mathbb{S}^d$, let $\psi_{j,k}(x)$ be given by Eq. (2.1).
Then, for $q \ge 2$, $k_{i_1} \ne k_{i_2}$ for $i_1 \ne i_2 = 1,\ldots,q$, and
for any $\eta \ge 2$, there exists $C_\eta > 0$ such that
\[ \int_{\mathbb{S}^d} \prod_{i=1}^{q} \psi_{j,k_i}(x)\, dx \le \frac{C_\eta B^{dj(q-1)}}{(1 + B^{dj} \Delta)^{\eta(q-1)}}, \]
where
\[ \Delta = \min_{i_1, i_2 \in \{1,\ldots,q\},\, i_1 \ne i_2} d(\xi_{j,k_{i_1}}, \xi_{j,k_{i_2}}). \]
Remark 2.2. As discussed in Geller and Pesenson [21] and Kerkyacharian et
al. [25], needlet-like wavelets can be built over more general spaces, namely,
over compact manifolds. In particular, let $\{\mathcal{M}, g\}$ be a smooth
compact homogeneous manifold of dimension $d$, with no boundaries. For the sake
of simplicity, we assume that there exists a Laplace–Beltrami operator on
$\mathcal{M}$ with respect to the action $g$, labelled by $\Delta_{\mathcal{M}}$.
The set $\{\gamma_q : q \ge 0\}$ contains the eigenvalues of
$\Delta_{\mathcal{M}}$ associated to the eigenfunctions $\{u_q : q \ge 0\}$,
which are orthonormal with respect to the Lebesgue measure over $\mathcal{M}$
and form an orthonormal basis in $L^2(\mathcal{M})$; see [20, 21]. Every function
$f \in L^2(\mathcal{M})$ can be described in terms of its harmonic coefficients,
given by $a_q = \langle f, u_q \rangle_{L^2(\mathcal{M})}$, so that, for all
$x \in \mathcal{M}$,
\[ f(x) = \sum_{q \ge 1} a_q u_q(x). \]
Therefore, it is possible to define a wavelet system over $\{\mathcal{M}, g\}$
describing a tight frame over $\mathcal{M}$ along the same lines as in Narcowich
et al. [40] for $\mathbb{S}^d$; see also [21, 25, 41] and the references therein,
such as Geller and Mayeli [19, 20]. Here we just provide the definition of the
needlet (scaling) function on $\mathcal{M}$, given by
\[ \psi_{j,k}(x) = \sqrt{\lambda_{j,k}} \sum_{q = B^{j-1}}^{B^{j+1}} b\Big(\frac{\sqrt{-\gamma_q}}{B^j}\Big) u_q(x)\, \overline{u}_q(\xi_{j,k}), \]
where in this case the set $\{\xi_{j,k}, \lambda_{j,k}\}$ characterizes a
suitable partition of $\mathcal{M}$, given by an $\varepsilon$-lattice on
$\mathcal{M}$, with $\varepsilon = \sqrt{\lambda_{j,k}}$. Further details and
technicalities concerning $\varepsilon$-lattices can be found in Pesenson [41].
Analogously to the spherical case, for $f \in L^2(\mathcal{M})$ and arbitrary
$j \ge 0$ and $k \in \{1,\ldots,K_j\}$, the needlet coefficient corresponding to
$\psi_{j,k}$ is given by
\[ \beta_{j,k} = \langle f, \psi_{j,k} \rangle_{L^2(\mathcal{M})} = \sqrt{\lambda_{j,k}} \sum_{q = B^{j-1}}^{B^{j+1}} b\Big(\frac{\sqrt{-\gamma_q}}{B^j}\Big) a_q u_q(\xi_{j,k}). \]
These wavelets preserve all the properties featured by needlets on the sphere.
Consequently, as shown in the following sections, the main results presented
here do not depend strictly on the underlying manifold (namely, the sphere);
rather, they can be easily extended to more general frameworks such as compact
manifolds, where the concentration properties of the wavelets and the smooth
approximation properties of Besov spaces still hold.
2.2. Besov space on the sphere
Here we will recall the definition of spherical Besov spaces and their main
approximation properties for wavelet coefficients. We refer to [1, 10, 22, 39] for
more details and further technicalities.
Suppose that one has a scale of functional classes $\mathcal{G}_t$, depending on
the $q$-dimensional set of parameters $t \in T \subseteq \mathbb{R}^q$. The
approximation error $G_t(f;p)$ concerning the replacement of $f$ by an element
$g \in \mathcal{G}_t$ is given by
\[ G_t(f;p) = \inf_{g \in \mathcal{G}_t} \|f - g\|_{L^p(\mathbb{S}^d)}. \]
Therefore, the Besov space $\mathcal{B}^s_{p,q}$ is the space of functions such
that $f \in L^p(\mathbb{S}^d)$ and
\[ \sum_{t \ge 0} \frac{1}{t} \{t^s G_t(f;p)\}^q < \infty, \]
which is equivalent to
\[ \sum_{j \ge 0} B^{j} \{G_{B^j}(f;p)\}^q < \infty. \]
The function $f$ belongs to the Besov space $\mathcal{B}^s_{p,q}$ if and only if
\[ \Big( \sum_{k=1}^{K_j} \{|\beta_{j,k}|\, \|\psi_{j,k}\|_{L^p(\mathbb{S}^d)}\}^p \Big)^{1/p} = B^{-js} w_j, \tag{9} \]
where $w_j \in \ell_q$, the standard space of $q$-power summable infinite
sequences.
Loosely speaking, the parameters $s \ge 0$, $1 \le p \le \infty$ and
$1 \le q \le \infty$ of the Besov space $\mathcal{B}^s_{p,q}$ can be viewed as
follows: given $B > 1$, the parameter $p$ denotes the $p$-norm of the wavelet
coefficients taken at a fixed resolution $j$, the parameter $q$ describes the
weighted $q$-norm taken across the scale $j$, and the parameter $s$ controls the
smoothness of the rate of decay across the scale $j$. In view of Eq. (7), the
Besov norm is defined as
\[ \|f\|_{\mathcal{B}^s_{p,q}} = \|f\|_{L^p(\mathbb{S}^d)} + \Big[ \sum_{j \ge 0} B^{jq\{s + d(1/2 - 1/p)\}} \Big( \sum_{k=1}^{K_j} |\beta_{j,k}|^p \Big)^{q/p} \Big]^{1/q} \]
\[ \phantom{\|f\|_{\mathcal{B}^s_{p,q}}} = \|f\|_{L^p(\mathbb{S}^d)} + \Big\| B^{j\{s + d(1/2 - 1/p)\}} \|\beta_{j,k}\|_{\ell_p} \Big\|_{\ell_q}, \]
for $q \ge 1$. The extension to the case $q = \infty$ is trivial.
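The sequence part of the Besov norm (the second summand above) is straightforward to evaluate once the needlet coefficients are grouped by resolution level. A minimal sketch for finite $p, q \ge 1$, where the function name and the list-of-arrays layout `coeffs[j][k]` for $\beta_{j,k}$ are assumptions for illustration:

```python
import numpy as np

def besov_sequence_norm(coeffs, B, d, s, p, q):
    """|| B^{j(s + d(1/2 - 1/p))} * ||beta_{j,.}||_{l_p} ||_{l_q} for finite p, q >= 1.

    coeffs : list whose j-th entry is the array of beta_{j,k}, k = 1..K_j.
    """
    terms = []
    for j, beta in enumerate(coeffs):
        beta = np.abs(np.asarray(beta, dtype=float))
        lp_norm = np.sum(beta ** p) ** (1.0 / p)        # l_p norm within level j
        weight = float(B) ** (j * (s + d * (0.5 - 1.0 / p)))
        terms.append(weight * lp_norm)
    return float(np.sum(np.array(terms) ** q) ** (1.0 / q))  # l_q norm across levels
```

The full Besov norm is obtained by adding $\|f\|_{L^p(\mathbb{S}^d)}$, which this sketch does not compute.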
We conclude this section by introducing the Besov embedding, discussed in
[1, 29, 30] among others. For $p < r$, one has
\[ \mathcal{B}^s_{r,q} \subset \mathcal{B}^s_{p,q} \quad \text{and} \quad \mathcal{B}^s_{p,q} \subset \mathcal{B}^{s - d(1/p - 1/r)}_{r,q}, \]
or, equivalently,
\[ \sum_{k=1}^{K_j} |\beta_{j,k}|^p \le \sum_{k=1}^{K_j} |\beta_{j,k}|^r K_j^{1 - p/r}; \tag{10} \]
\[ \sum_{k=1}^{K_j} |\beta_{j,k}|^r \le \sum_{k=1}^{K_j} |\beta_{j,k}|^p. \tag{11} \]
Proofs and further details can be found, for instance, in [1, 10].
3. Global thresholding with spherical needlets
This section provides a detailed description of the global thresholding tech-
nique applied to the nonparametric regression problem on the d-dimensional
sphere. We refer to [12, 22, 30] for an extensive description of global threshold-
ing methods and to [1, 10] for further details on nonparametric estimation in
the spherical framework.