Geometry of the Central Limit Theorem in
the Nonextensive Case
C. Vignat¹ and A. Plastino²
¹ I.G.M., Université de Marne la Vallée, Marne la Vallée, France
² Exact Sci. Fac., National University La Plata and IFLP-CCT-CONICET
C.C. 727, 1900 La Plata, Argentina
Abstract
We uncover geometric aspects that underlie the sum of two independent stochastic
variables when both are governed by q-Gaussian probability distributions. The
pertinent discussion is given in terms of random vectors uniformly distributed on a
p-sphere.
1 Introduction
Nonextensive statistical physics provides a rich framework for the interpretation of complex systems' behavior whenever classical statistical physics fails [1]. The basic tool for this approach is the extension of the classical Boltzmann entropy to the wider class of Tsallis entropies. In this context, the usual Gaussian distribution is extended to the q-Gaussian distributions, to be defined below. The study of the properties of these distributions is an interesting problem, being the subject of a number of recent publications [1]. Of special interest is the extension of the usual stability result that holds in the Gaussian case, namely, that if $X_1 \in \mathbb{R}$ and $X_2 \in \mathbb{R}$ are independent Gaussian random variables with unit variance, then the linear combination
\[ Z = a_1 X_1 + a_2 X_2 \]
is again Gaussian and
\[ Z \sim \sqrt{a_1^2 + a_2^2}\, X, \qquad (1) \]
where $X$ is Gaussian with unit variance, and $\sim$ denotes equality in distribution.
Email address: [email protected], [email protected] (C. Vignat and A. Plastino).
Preprint submitted to Elsevier January 29, 2009
This stability property is at the core of the central limit theorem (CLT), which describes the behavior of systems that result from the additive superposition of many independent phenomena. The CLT can be ranked among the most important results in probability theory and statistics, and plays an essential role in several disciplines, notably in statistical mechanics. Pioneers like A. de Moivre, P.S. de Laplace, S.D. Poisson, and C.F. Gauss have shown that the Gaussian distribution is the attractor of the superposition process of independent systems with a finite second moment. Distinguished authors like Chebyshev, Markov, Liapounov, Feller, Lindeberg, and Lévy have also made essential contributions to the development of the CLT. As far as physics is concerned, one can state that, starting from any system with any finite variance distribution function (for some measurable quantity x), and combining additively a sufficiently large number of such independent systems, the resultant distribution function of x, suitably centered and rescaled, approaches a Gaussian.
A natural question is thus the extension of the stability result (1) to the nonextensive case, that is, for q-Gaussian distributions. This interesting problem is currently the subject of several publications (see for example [2]) in which possible extensions of the CLT to the nonextensive context are studied. The aim of this communication is to give some geometric insight into the behavior of q-Gaussian distributions for the case q < 1.
2 Definitions and notations
In nonextensive statistics, the usual Shannon entropy of a probability density $f_X$, namely
\[ H_1(X) = -\int f_X \log f_X, \]
is replaced by its Tsallis version
\[ H_q(X) = \frac{1}{q-1}\left(1 - \int f_X^q\right), \]
where the nonextensivity index q is a real parameter, usually taken to be positive. The prefactor is written here as $1/(q-1)$ so that the limit below holds with the correct sign. It can be checked by applying L'Hospital's rule that Shannon's entropy coincides with the limit case
\[ \lim_{q \to 1} H_q(X) = H_1(X). \]
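The limit above can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional standard Gaussian density, for which $\int f^q = (2\pi)^{(1-q)/2} q^{-1/2}$ in closed form, and the sign convention $H_q = (1 - \int f^q)/(q-1)$, which is the one for which the limit equals the Shannon entropy. The function names are illustrative only.

```python
import math

def tsallis_entropy_std_gaussian(q):
    """Tsallis entropy (1 - int f^q) / (q - 1) of the standard 1-D Gaussian.

    Uses the closed form int f^q dx = (2*pi)**((1 - q)/2) / sqrt(q), q != 1.
    """
    integral_f_q = (2.0 * math.pi) ** ((1.0 - q) / 2.0) / math.sqrt(q)
    return (1.0 - integral_f_q) / (q - 1.0)

def shannon_entropy_std_gaussian():
    """Shannon entropy -int f log f of the standard 1-D Gaussian."""
    return 0.5 * math.log(2.0 * math.pi * math.e)

# The Tsallis values approach the Shannon value as q tends to 1.
for q in (0.5, 0.9, 0.99, 0.999999):
    print(q, tsallis_entropy_std_gaussian(q))
print("Shannon:", shannon_entropy_std_gaussian())
```

The closed form avoids numerical integration, so the only approximation is the choice of q close to 1.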
It is a well-known result that the distribution that maximizes the Shannon entropy under a covariance matrix constraint $EXX^T = K$ (where $K$ is a symmetric positive definite matrix) is the Gaussian distribution
\[ f_X(X) = \frac{1}{|2\pi K|^{1/2}} \exp\left(-\tfrac{1}{2} X^T K^{-1} X\right). \]
Its nonextensive counterpart, called a q-Gaussian, is defined as follows.
Definition 1 The n-variate distribution with zero mean and given covariance matrix $EXX^T = K$ having maximum Tsallis entropy is denoted as $G_q(K)$ and defined as follows for $0 < q < 1$:
\[ f_X(X) = A_q \left(1 - X^T \Sigma^{-1} X\right)_+^{\frac{1}{1-q}}, \qquad (2) \]
with matrix $\Sigma = pK$, parameter $p$ defined as $p = 2\,\frac{2-q}{1-q} + n$ and notation $(x)_+ = \max(x, 0)$. Moreover, the partition function is
\[ A_q = \frac{\Gamma\left(\frac{2-q}{1-q} + \frac{n}{2}\right)}{\Gamma\left(\frac{2-q}{1-q}\right) |\pi\Sigma|^{1/2}}. \]
We note that this distribution has bounded support; namely, $f_X(X) \neq 0$ only when $X$ belongs to the ellipsoid
\[ \mathcal{E}_\Sigma = \left\{ Z \in \mathbb{R}^n ;\ Z^T \Sigma^{-1} Z \leq 1 \right\}. \]
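As a sanity check of definition (2), the density can be evaluated in the simplest setting. The sketch below assumes $n = 1$ and $K = 1$, so that $\Sigma = p$ and the support is the interval $[-\sqrt{p}, \sqrt{p}]$; it then estimates the total mass and the variance with a midpoint rule. The helper names and the number of integration steps are illustrative choices, not part of the original text.

```python
import math

def q_gaussian_pdf_1d(x, q):
    """Density (2) for n = 1 and K = 1, assuming 0 < q < 1."""
    m = (2.0 - q) / (1.0 - q)            # (2 - q) / (1 - q)
    p = 2.0 * m + 1.0                    # p = 2 (2 - q)/(1 - q) + n, with n = 1
    sigma = p                            # Sigma = p K, with K = 1
    base = max(1.0 - x * x / sigma, 0.0) # the (.)_+ truncation
    if base == 0.0:
        return 0.0
    # log of the normalization constant A_q, via lgamma for stability
    log_a = math.lgamma(m + 0.5) - math.lgamma(m) - 0.5 * math.log(math.pi * sigma)
    return math.exp(log_a) * base ** (1.0 / (1.0 - q))

def mass_and_variance(q, steps=200_000):
    """Midpoint-rule estimates of int f and int x^2 f over the support."""
    p = 2.0 * (2.0 - q) / (1.0 - q) + 1.0
    half_width = math.sqrt(p)
    h = 2.0 * half_width / steps
    mass = 0.0
    second_moment = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        fx = q_gaussian_pdf_1d(x, q)
        mass += fx * h
        second_moment += x * x * fx * h
    return mass, second_moment

print(mass_and_variance(0.5))
```

Both estimates should be close to 1, which is consistent with $A_q$ being the normalization constant and with the choice $\Sigma = pK$ enforcing the covariance constraint.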
We also need the notion of a spherical vector, defined as follows:
Definition 2 A random vector $X \in \mathbb{R}^n$ is spherical if its density $f_X$ is a function of the norm $|X|$ of $X$ only, namely
\[ f_X(X) = g(|X|) \]
for some function $g : \mathbb{R}^+ \to \mathbb{R}^+$.
An alternative characterization of a spherical vector is as follows [3]:
Proposition 3 A random vector $X \in \mathbb{R}^n$ is spherical if
\[ X \sim AX \]
for any orthogonal matrix $A$, where the sign $\sim$ denotes equality in distribution.
This property highlights the importance of spherical vectors in physics since
they describe systems that are invariant under orthogonal transformations.
A fundamental property of a spherical vector is the following:
Proposition 4 [3] If $X \in \mathbb{R}^n$ is a spherical random vector, then it has the stochastic representation
\[ X \sim rU \]
where $U$ is a uniform vector on the sphere $\mathcal{S}_n = \{X \in \mathbb{R}^n ;\ X^T X = 1\}$ and $r$ is a positive scalar random variable independent of $U$. Moreover, $r$ has stochastic representation
\[ r \sim |X|. \qquad (3) \]
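The stochastic representation $X \sim rU$ can be sketched in code. The example below assumes the standard construction of a uniform point on the unit sphere by normalizing a standard Gaussian vector (which is itself spherical); the function names are hypothetical and chosen for illustration.

```python
import math
import random

def uniform_on_sphere(n, rng):
    """Draw a point uniformly on the unit sphere of R^n by normalizing
    a vector of independent standard Gaussian components."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def radial_decomposition(x):
    """Split x into (r, u) with r = |x| and u = x / |x|, so that x = r * u."""
    r = math.sqrt(sum(c * c for c in x))
    return r, [c / r for c in x]

rng = random.Random(0)
u = uniform_on_sphere(4, rng)
print(sum(c * c for c in u))            # squared norm, equal to 1 up to rounding
print(radial_decomposition([3.0, 4.0])) # radius 5 and direction (0.6, 0.8)
```

For a spherical vector, the radius and the direction returned by `radial_decomposition` play the roles of $r$ and $U$ in Proposition 4.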
3 A heuristic approach
We start with a heuristic approach to the stability problem, namely the behavior of the random variable $Z = a_1 X_1 + a_2 X_2$ when $X_1$ and $X_2$ are two unit variance, q-Gaussian independent random vectors in $\mathbb{R}^n$ with nonextensivity parameter $q < 1$; let us assume that the following hypothesis, called hypothesis (H),
\[ n + \frac{2}{1-q} \in \mathbb{N}, \qquad (4) \]
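holds so that $\frac{1}{1-q} = \frac{p-n}{2} - 1$ where $p > n$ is an integer; a classical result is that $X_1$ (resp. $X_2$) can then be considered, up to the deterministic scaling factor $\sqrt{p}$ that we omit in this heuristic discussion, as the n-dimensional marginal vector of a random vector $U_1$ (resp. $U_2$) that is uniformly distributed on the unit sphere $\mathcal{S}_p$ in $\mathbb{R}^p$. Thus, there exist random vectors $\tilde{X}_1$ and $\tilde{X}_2$ in $\mathbb{R}^{p-n}$ such that
\[ U_1 = \begin{bmatrix} X_1 \\ \tilde{X}_1 \end{bmatrix} \quad \text{and} \quad U_2 = \begin{bmatrix} X_2 \\ \tilde{X}_2 \end{bmatrix} \]
are two p-dimensional independent vectors uniformly distributed on $\mathcal{S}_p$. Then, the linear combination $a_1 U_1 + a_2 U_2$ is a spherical vector and has stochastic representation
\[ a_1 U_1 + a_2 U_2 \sim rU \]
where $U$ is uniform on $\mathcal{S}_p$. Now, by equation (3), the random variable $r$ is distributed as
\[ r \sim |a_1 U_1 + a_2 U_2| = \sqrt{a_1^2 + a_2^2 + 2\lambda a_1 a_2} \]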
where $\lambda = U_1^T U_2$: this can be easily deduced from
\[ |a_1 U_1 + a_2 U_2| = \sqrt{a_1^2 U_1^T U_1 + a_2^2 U_2^T U_2 + 2 a_1 a_2 U_1^T U_2} \]
remarking that $U_1^T U_1 = U_2^T U_2 = 1$. But $\lambda$ is a random variable with q-Gaussian distribution! We prove this result by noticing that, conditioned on $U_2 = u_2$, the random variable $\lambda$ is the cosine of the angle between $U_1$ and the fixed direction $u_2$. Since $U_1$ is spherical, we may restrict our attention to the angle between $U_1$ and the first vector of the canonical basis in $\mathbb{R}^p$, so that we look for the distribution of the first component of $U_1$, which follows a q-Gaussian distribution with parameter $q_\lambda$ such that
\[ \frac{1}{1 - q_\lambda} = \frac{p-1}{2} - 1. \]
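The claim that $\lambda = U_1^T U_2$ behaves like the first component of a uniform vector on the sphere of $\mathbb{R}^p$ can be probed by simulation. The sketch below is a Monte Carlo check under stated assumptions: it uses the elementary fact that a coordinate of a uniform vector on the unit sphere of $\mathbb{R}^p$ has second moment $1/p$, and it samples uniform vectors by normalizing Gaussian vectors. The function names, the seed and the sample size are illustrative.

```python
import math
import random

def uniform_on_sphere(p, rng):
    """Uniform point on the unit sphere of R^p (normalized Gaussian vector)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def lambda_second_moment(p, n_samples, seed=0):
    """Monte Carlo estimate of E[lambda^2] with lambda = U1 . U2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        u1 = uniform_on_sphere(p, rng)
        u2 = uniform_on_sphere(p, rng)
        lam = sum(a * b for a, b in zip(u1, u2))
        total += lam * lam
    return total / n_samples

def q_lambda_from_p(p):
    """Solve 1 / (1 - q_lambda) = (p - 1)/2 - 1 for q_lambda (needs p > 3)."""
    return 1.0 - 2.0 / (p - 3.0)

p = 7
print(lambda_second_moment(p, 20_000), 1.0 / p)  # estimate versus 1/p
print(q_lambda_from_p(p))                        # 0.5 for p = 7
```

Agreement of the second moment is only a necessary condition; it does not by itself establish the full q-Gaussian shape of the distribution of $\lambda$.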
Since this distribution does not depend on our initial choice $U_2 = u_2$, random variable $\lambda$ follows unconditionally the above cited distribution. We conclude that the n-dimensional marginal $Z = a_1 X_1 + a_2 X_2$ of vector $a_1 U_1 + a_2 U_2$ is distributed as
\[ a_1 X_1 + a_2 X_2 \sim rX \]
where $X$ is the n-dimensional marginal vector of $U$ so that $X$ is again q-Gaussian with parameter $q$. Moreover, this result extends to the case where $X_1$ and $X_2$ both have¹ a covariance matrix $K \neq I$ by multiplying vectors $X_1$ and $X_2$ by matrix $K^{1/2}$. Consequently, we have deduced the following
Theorem 5 If $X_1$ and $X_2$ are two q-Gaussian independent random vectors in $\mathbb{R}^n$ with covariance matrix $K$ and nonextensivity parameter $q < 1$ and if hypothesis (H) holds then
\[ a_1 X_1 + a_2 X_2 \sim (a_1 \circ a_2) X \]
where $X$ is again q-Gaussian with same covariance matrix $K$ and same nonextensive parameter $q$ as $X_1$, and where
\[ a_1 \circ a_2 = \sqrt{a_1^2 + a_2^2 + 2\lambda a_1 a_2}, \qquad (5) \]
the random variable $\lambda$ being independent of $X$ and again q-Gaussian distributed with nonextensive parameter $q_\lambda$ defined by
\[ q_\lambda = \frac{(n-1) - (n-3)q}{(n+1) - (n-1)q}. \qquad (6) \]
Two remarks are of interest at this point:
• the univariate framework $n = 1$ is the only case for which random variable $\lambda$ has the same nonextensivity parameter, $q_\lambda = q$, as $X_1$ and $X_2$;
• however, we note that
\[ \lim_{n \to +\infty} q_\lambda = 1. \]
This means that for large dimensional systems, the random variable $\lambda$ converges to the constant 0 and we recover the deterministic convolution; this is coherent with the fact that large dimensional q-Gaussian vectors are "close" to Gaussian vectors by the de Finetti inequality.
¹ The case where $X_1$ and $X_2$ have distinct covariance matrices is more difficult and left to further study.
Figure 1. Nonextensivity parameter $q_\lambda$ (vertical axis, from 0 to 1) as a function of $q$ (horizontal axis, from 0 to 1) for dimensions n = 1, 2, 3, 5, 10 and 100 (bottom to top).
The curves in Figure 1 show the nonextensive parameter $q_\lambda$ as a function of $q$ for several values of dimension n.
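The dependence of $q_\lambda$ on $q$ and $n$ given by (6) is easy to tabulate. The following short sketch evaluates the formula for the dimensions listed in the caption of Figure 1; the function name and the chosen grid of $q$ values are illustrative.

```python
def q_lambda(n, q):
    """Nonextensivity parameter of lambda, formula (6), for q < 1."""
    return ((n - 1) - (n - 3) * q) / ((n + 1) - (n - 1) * q)

# One row per dimension, columns q = 0, 0.25, 0.5, 0.75.
for n in (1, 2, 3, 5, 10, 100):
    print(n, [round(q_lambda(n, q), 3) for q in (0.0, 0.25, 0.5, 0.75)])
```

The row for $n = 1$ reproduces $q$ itself, and the rows move toward 1 as $n$ grows, in line with the two remarks above.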
More can be said about the algebra $a_1 \circ a_2$:
Theorem 6 The algebra $a_1 \circ a_2$ defined as in (5) is associative and for any $n \geq 2$,
\[ a_1 \circ a_2 \circ \dots \circ a_n = \sqrt{\sum_{i=1}^{n} a_i^2 + 2 \sum_{i<j} \lambda_{ij} a_i a_j} \]
where random variables $\lambda_{ij} = U_i^T U_j$ are q-Gaussian.
As an example,
\[ a_1 \circ a_2 \circ a_3 = \sqrt{a_1^2 + a_2^2 + a_3^2 + 2\lambda_{12} a_1 a_2 + 2\lambda_{13} a_1 a_3 + 2\lambda_{23} a_2 a_3}. \]
PROOF. By definition,
\[ a_1 \circ a_2 \circ \dots \circ a_n = \left| \sum_{i=1}^{n} a_i U_i \right| = \sqrt{\sum_{i=1}^{n} a_i^2 U_i^T U_i + 2 \sum_{i<j} a_i a_j U_i^T U_j}. \]
Since $|U_i| = 1$, we deduce, by denoting $U_i^T U_j = \lambda_{ij}$, that
\[ a_1 \circ a_2 \circ \dots \circ a_n = \sqrt{\sum_{i=1}^{n} a_i^2 + 2 \sum_{i<j} \lambda_{ij} a_i a_j}. \]
By the same proof as above, we deduce that each $\lambda_{ij}$ is q-Gaussian distributed with parameter $q_\lambda$. We remark that the random variables $\lambda_{ij}$ are pairwise independent but are obviously not mutually independent.
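The algebraic identity used in this proof can be checked on a concrete draw. The sketch below assumes three unit vectors in $\mathbb{R}^6$ obtained by normalizing Gaussian vectors, and compares the norm of $\sum_i a_i U_i$ computed directly with the expression in terms of the $\lambda_{ij}$. The coefficients, dimension and seed are arbitrary illustrative values.

```python
import math
import random

def unit_vector(p, rng):
    """Unit vector of R^p obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def composed_scale(a, lambdas):
    """sqrt(sum_i a_i^2 + 2 sum_{i<j} lambda_ij a_i a_j)."""
    total = sum(ai * ai for ai in a)
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            total += 2.0 * lambdas[i][j] * a[i] * a[j]
    return math.sqrt(total)

rng = random.Random(1)
p = 6
a = [0.5, -1.2, 2.0]
U = [unit_vector(p, rng) for _ in a]
lambdas = [[dot(ui, uj) for uj in U] for ui in U]   # lambda_ij = U_i . U_j

# Direct computation of | sum_i a_i U_i |
combo = [sum(ai * ui[k] for ai, ui in zip(a, U)) for k in range(p)]
direct = math.sqrt(dot(combo, combo))

print(direct, composed_scale(a, lambdas))
```

The two printed values agree up to rounding for any draw, since the identity is purely algebraic; the probabilistic content of the theorem lies in the distribution of the $\lambda_{ij}$.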
4 Generalization
The preceding result was derived under the hypothesis (H) as expressed by
(4), that is, for specific values of q < 1 only; we show in this section that this
result holds in fact without this hypothesis - for all values of q < 1 - but the
proof requires more elaborate analytic tools. Our main result is
Theorem 7 Theorem 5 holds for all values of q such that 0 < q < 1.
PROOF. The characteristic function associated to the q-Gaussian distribution (2) is
\[ \varphi_X(u) = E \exp\left(i u^T X\right) = 2^{\frac{p}{2}-1}\, \Gamma\left(\frac{p}{2}\right) \frac{J_{\frac{p}{2}-1}\left(\sqrt{u^T K u}\right)}{\left(\sqrt{u^T K u}\right)^{\frac{p}{2}-1}} \]
where $J_{\frac{p}{2}-1}$ is the Bessel function of the first kind with parameter $\frac{p}{2} - 1$, and where
\[ p = 2\,\frac{2-q}{1-q} + n. \]
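According to Gegenbauer [4, p. 367, eq. 16],
\[ 2^{\nu}\, \Gamma\left(\nu + \frac{1}{2}\right) \Gamma\left(\frac{1}{2}\right) \frac{J_\nu(Z)}{Z^\nu} \frac{J_\nu(z)}{z^\nu} = \int_0^\pi \frac{J_\nu\left(\sqrt{Z^2 + z^2 - 2Zz\cos\phi}\right)}{\left(Z^2 + z^2 - 2Zz\cos\phi\right)^{\nu/2}} \sin^{2\nu}\phi \, d\phi. \]
Choosing $Z = a_1 \sqrt{u^T K u}$, $z = a_2 \sqrt{u^T K u}$, $\lambda = -\cos\phi$ and $\nu = \frac{p}{2} - 1$, this equality can be rewritten as
\[ \varphi_{a_1 X_1}(u)\, \varphi_{a_2 X_2}(u) = \varphi_{\sqrt{a_1^2 + a_2^2 + 2\lambda a_1 a_2}\, X}(u) \]
where $\lambda$ is distributed according to
\[ f_\lambda(\lambda) = \frac{\Gamma(\nu + 1)}{\Gamma\left(\nu + \frac{1}{2}\right) \Gamma\left(\frac{1}{2}\right)} \left(1 - \lambda^2\right)^{\nu - \frac{1}{2}}. \]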
Since $q_\lambda$ is defined by
\[ \frac{1}{1 - q_\lambda} = \nu - \frac{1}{2} = \frac{p-1}{2} - 1, \]
we deduce (6).
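Gegenbauer's formula can be checked numerically in a special case where no Bessel routine is needed. The sketch below assumes $\nu = 1/2$ (that is, $p = 3$), for which $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$ is elementary, and compares both sides of the identity using a midpoint rule for the integral. The values of $Z$ and $z$ and the number of steps are arbitrary illustrative choices.

```python
import math

NU = 0.5  # special case nu = 1/2, where J_nu is elementary

def j_nu_over_power(x):
    """J_{1/2}(x) / x**(1/2) = sqrt(2/pi) * sin(x) / x, for x > 0."""
    return math.sqrt(2.0 / math.pi) * math.sin(x) / x

def gegenbauer_lhs(Z, z):
    """2^nu Gamma(nu + 1/2) Gamma(1/2) [J_nu(Z)/Z^nu] [J_nu(z)/z^nu]."""
    return (2.0 ** NU * math.gamma(NU + 0.5) * math.gamma(0.5)
            * j_nu_over_power(Z) * j_nu_over_power(z))

def gegenbauer_rhs(Z, z, steps=20_000):
    """Midpoint-rule value of the integral over phi in (0, pi)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * h
        w = math.sqrt(Z * Z + z * z - 2.0 * Z * z * math.cos(phi))
        total += j_nu_over_power(w) * math.sin(phi) ** (2.0 * NU) * h
    return total

print(gegenbauer_lhs(1.3, 0.7), gegenbauer_rhs(1.3, 0.7))
```

For other values of $\nu$ a general Bessel function implementation would be required; this special case only illustrates the structure of the identity used in the proof.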
Let us recall the scaling behavior of Gaussian vectors
\[ a_1 X_1 + a_2 X_2 \sim \sqrt{a_1^2 + a_2^2}\, X \]
which can be probabilistically interpreted in the context of α-stable distributions: a distribution $f_\alpha$ is α-stable if, for $X_1$ and $X_2$ independent with distribution $f_\alpha$, the linear combination
\[ a_1 X_1 + a_2 X_2 \sim \left(|a_1|^\alpha + |a_2|^\alpha\right)^{\frac{1}{\alpha}} X, \]
where $X$ follows again distribution $f_\alpha$. Thus, a Gaussian distribution is α-stable with α = 2. The result of Theorem 5 can be viewed as follows: q-Gaussians are not α-stable (unless q = 1, which corresponds to the Gaussian case α = 2); however, their scaling behavior is close to the Gaussian α = 2 case, except for the fact that the scaling variable $a_1 \circ a_2$ includes an additional random term $\lambda$.
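The contrast between the two scaling laws can be made concrete with a few evaluations. The sketch below compares the deterministic α-stable scale factor with the q-Gaussian scale (5) for a given value of $\lambda$; the function names are illustrative and $\lambda$ is passed as a plain number rather than sampled.

```python
import math

def stable_scale(a1, a2, alpha):
    """Deterministic scale factor (|a1|^alpha + |a2|^alpha)^(1/alpha)."""
    return (abs(a1) ** alpha + abs(a2) ** alpha) ** (1.0 / alpha)

def q_gaussian_scale(a1, a2, lam):
    """Scale factor (5), sqrt(a1^2 + a2^2 + 2 lam a1 a2), for -1 <= lam <= 1."""
    return math.sqrt(max(a1 * a1 + a2 * a2 + 2.0 * lam * a1 * a2, 0.0))

print(stable_scale(3.0, 4.0, 2.0))        # Gaussian case, alpha = 2
print(q_gaussian_scale(3.0, 4.0, 0.0))    # same value when lam = 0
print(q_gaussian_scale(3.0, 4.0, 0.3))    # larger when lam > 0
print(q_gaussian_scale(3.0, 4.0, -0.3))   # smaller when lam < 0
```

With $\lambda = 0$ the two factors coincide, which matches the statement that the q-Gaussian scaling differs from the α = 2 case only through the random term $\lambda$.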
5 Geometric interpretation
Geometrically, the Gaussian scaling factor $\sqrt{a_1^2 + a_2^2}$ can be interpreted, according to Pythagoras' theorem, as the length of the hypotenuse of a right triangle with sides of lengths $|a_1|$ and $|a_2|$. The q-Gaussian case corresponds to a triangle for which the angle between the sides $|a_1|$ and $|a_2|$, let us call it $\phi$, fluctuates around rectangularity.
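The distribution of the angle $\phi$, where $\lambda = -\cos\phi$, is given by
\[ f_\phi(\phi) = \frac{\Gamma(\nu + 1)}{\Gamma\left(\nu + \frac{1}{2}\right) \Gamma\left(\frac{1}{2}\right)} \sin^{2\nu}\phi, \quad 0 \leq \phi \leq \pi, \quad \nu = \frac{1}{1 - q_\lambda} + \frac{1}{2}. \]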
This distribution is shown in Figure 3 for values of the parameter $q_\lambda$ = 0.99, 0.9, 0.5 and 0.1 (top to bottom).
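The density of the angle $\phi$ can be evaluated directly from its formula. The following sketch computes $f_\phi$ through logarithms of Gamma functions (to stay numerically stable when $q_\lambda$ is close to 1), and estimates its total mass with a midpoint rule; the function names and step count are illustrative.

```python
import math

def angle_pdf(phi, q_lambda):
    """Density of phi on [0, pi], with nu = 1/(1 - q_lambda) + 1/2."""
    nu = 1.0 / (1.0 - q_lambda) + 0.5
    s = math.sin(phi)
    if s <= 0.0:
        return 0.0
    log_c = math.lgamma(nu + 1.0) - math.lgamma(nu + 0.5) - math.lgamma(0.5)
    return math.exp(log_c + 2.0 * nu * math.log(s))

def total_mass(q_lambda, steps=20_000):
    """Midpoint-rule estimate of the integral of the density over [0, pi]."""
    h = math.pi / steps
    return sum(angle_pdf((k + 0.5) * h, q_lambda) for k in range(steps)) * h

for q_lam in (0.99, 0.9, 0.5, 0.1):
    print(q_lam, total_mass(q_lam), angle_pdf(math.pi / 2.0, q_lam))
```

The printed peak values at $\phi = \pi/2$ decrease as $q_\lambda$ decreases, which is consistent with the ordering "top to bottom" of the curves described for Figure 3.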
Figure 2. The geometric interpretation of $a_1 \circ a_2$ in the Gaussian case ($q = 1$, left); in the q-Gaussian case (right), $a_1 \circ a_2$ is randomly chosen as one of the hypotenuses represented, the angle $\phi$ between sides $a_1$ and $a_2$ being distributed as shown in Figure 3.
Figure 3. The distribution $f_\phi$ of the angle $\phi$ for values of the parameter $q_\lambda$ = 0.99, 0.9, 0.5 and 0.1 (top to bottom).
We remark that this distribution is symmetric around the angle $\phi = \frac{\pi}{2}$ and that, as $q \to 1$, the angle $\phi$ becomes deterministic and equal to $\frac{\pi}{2}$. Further, the usual scaling law for Gaussian distributions (1) is recovered.
5.1 An optical analogy
We remark that formula (5) closely resembles the interference formula for the amplitude of the superposition of two optical beams. Interferometric optical testing is based on these phenomena of interference. Two-beam interference is the superposition of two waves, such as the disturbance of the surface of a pond by a small rock encountering a similar pattern from a second rock. When two wave crests reach the same point simultaneously, the wave height is the sum of the two individual waves. Conversely, a wave trough and a wave crest reaching a point simultaneously will cancel each other out. Water, sound, and light waves all exhibit interference. A light wave can be described by its frequency, amplitude, and phase, and the resulting interference pattern between two waves depends on these properties, among others. Our present interest lies in the two-beam interference equation. It gives the irradiance $I$ [6] for monochromatic waves of irradiance $I_1$ and $I_2$ in terms of the phase difference $\Delta_{1,2}$ expressed as $\cos\phi = \cos(\phi_1 - \phi_2)$. We have
\[ I = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\phi, \]
and, in terms of the amplitudes $A$ (with $I = A^2$),
\[ A^2 = A_1^2 + A_2^2 + 2 A_1 A_2 \cos\phi. \]
If the emission of the two beams could be so arranged that the phase difference becomes random [7,8,9], this physical analogy would be exact.
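The two-beam amplitude formula can be evaluated for a few reference phase differences. This is a minimal sketch with an illustrative function name; note that here $\cos\phi$ plays the role of $\lambda$ in (5), whereas the triangle interpretation of Section 5 uses $\lambda = -\cos\phi$.

```python
import math

def superposed_amplitude(a1, a2, phi):
    """Amplitude A with A^2 = a1^2 + a2^2 + 2 a1 a2 cos(phi)."""
    squared = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * math.cos(phi)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding

print(superposed_amplitude(3.0, 1.0, 0.0))          # constructive: a1 + a2
print(superposed_amplitude(3.0, 1.0, math.pi))      # destructive: |a1 - a2|
print(superposed_amplitude(3.0, 4.0, math.pi / 2))  # quadrature: sqrt(a1^2 + a2^2)
```

The three cases correspond to $\cos\phi = 1$, $-1$ and $0$; a random phase difference would interpolate between the first two extremes.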
5.2 Study of the composition law $\circ$
The composition law
\[ a_1 \circ a_2 = \sqrt{a_1^2 + a_2^2 + 2\lambda a_1 a_2} \]
has been studied in [5], in the more general case where $a_1$ and $a_2$ are independent, positive random variables. The associativity result is as follows.
Theorem 8 [5, p. 18, thm. 1] The composition law $\circ$ is associative if and only if either
\[ a_1 \circ a_2 = \sqrt{a_1^2 + a_2^2} \]
or
\[ a_1 \circ a_2 = |a_1| + |a_2| \]
or
\[ a_1 \circ a_2 = \sqrt{a_1^2 + a_2^2 + 2\lambda a_1 a_2} \qquad (7) \]
where $\lambda \sim G_q(0, 1)$ for some $q \geq 0$.