Table Of ContentStatisticalScience
2007,Vol.22,No.3,407–413
DOI:10.1214/0883423060000000097
(cid:13)c InstituteofMathematicalStatistics,2007
Majorization: Here, There and
Everywhere
Barry C. Arnold
8
0 Dedication
0 This article is written for Ingram Olkin on the occasion of his 80th
2
birthday. Ingram has provided inspiration for me over the last 40 years
n and continues to inspire. I am indebted to him for his encouragement
a
and support throughout my career. I am contributing this humbly in
J
8 the sure knowledge that he could have written it better than I.
2
]
E
Abstract. The appearance of Marshall and Olkin’s 1979 book on in-
M
equalities with special emphasis on majorization generated a surge of
. interest in potential applications of majorization and Schur convexity
t
a
in a broad spectrum of fields. After 25 years this continues to be the
t
s case. The present article presents a sampling of the diverse areas in
[
which majorization has been found to be useful in the past 25 years.
1
v Key words and phrases: Inequalities, Schur convex, covering, waiting
1 time, paired comparisons, phase type, catchability, disease transmis-
2
sion, apportionment, statistical mechanics, random graph.
2
4
.
1
1. INTRODUCTION that. They heroically had sifted the literature and
0
8 endeavored to arrange ideas in order, often provid-
Prior to the appearance of the celebrated volume
0 ing references to multiple proofs and multiple view-
: Inequalities: Theory of Majorization and Its Appli-
v points on key results, with reference to a variety of
cations (MarshallandOlkin,1979)manyresearchers
i applied fields. Many of the key ideas relating to ma-
X
wereunawareoftherichbodyofliteraturerelatedto
jorization were already discussed in the (also justly
r majorizationthatwasscatteredinjournalsinawide
a celebrated) volume entitled Inequalities by Hardy,
varietyoffields.Indeed,manymajorizationconcepts
Littlewood and Po´lya (1934). Indeed, this slim vol-
had been reinvented and often rechristened in dif-
ume still merits occasional revisits since there re-
ferent research areas (e.g., as Lorenz or dominance
main in it many “seedlings for further research” (to
ordering in economics), complicating the difficulties
borrow Kingman’s apt descriptive phase). Of course
for the researcher when trying to relate current re-
the Hardy, Littlewood and Po´lya volume, though
search to the extant corpus. Of course, the appear-
slim and printed on small pages, was all meat and
ance of the Marshall and Olkin volume changed all
no gravy: more like a series of insightful telegrams.
Only a relatively small number of researchers were
Barry C. Arnold is Professor, Department of Statistics, inspired by it to work on questions relating to ma-
University of California, Riverside, California 92521, jorization.
USA e-mail: [email protected]. Butthings weredifferentafter 1979. Marshall and
Olkinsoldtheproductmuchmoreeffectively.When-
This is an electronic reprint of the original article
published by the Institute of Mathematical Statistics in ever asituation was encountered inwhich a solution
Statistical Science, 2007, Vol. 22, No. 3, 407–413. This or an extreme case involved a discrete uniform dis-
reprint differs from the original in pagination and tribution,thepossibilityofamajorizationproofwas
typographic detail. now apparent if not to all, certainly to many, and
1
2
B. C. ARNOLD
certainly in many different areas of research. More- The extremal case under the majorization order
over, if a uniform allocation or distribution was in corresponds to the choice x =( n x )/n. In par-
i j=1 j
a sense optimal, then the concept of majorization ticular then, a Schur convex fuPnction will take on
frequently could be used to order competing alloca- a larger value when there is some variability in x
tions or distributions. than it does when there is no variability [i.e., when
Naturally extensions of the majorization concept x =x¯=( n x )/n,i=1,2,...,n].
i j=1 j
were possible and indeed many have been fruitfully Many exPamples of Schur convex functions can of
introduced. The focus of the present article is, how- course be found in the literature. Perhaps the sim-
ever, on classical majorization. The goal is to pro- plest example is what is called a separable convex
vide a hint (via selected examples from the post- function. It is of the form
1979literature)ofthevastarrayofsettingsinwhich
n
majorization provides a useful and interpretable or- g(x)= h(x ),
i
dering. In no sense can such a survey be complete. I Xi=1
apologize, in advance, to researchers who, quite le-
where h is a convex function.
gitimately, can point to papers of their own which
We now begin our tour of examples in the liter-
they feel would be even better illustrations of the
ature in which majorization makes cameo and/or
theme: Majorization, here, there and everywhere.
starring appearances.
Nevertheless it is my hope that the examples se-
Onecanevenconsideravariation ofthechildren’s
lected will be found to be interesting, to be suffi-
game “Where’s Waldo?”. In that game a very com-
ciently diverse in order to illustrate the potential
plicatedpictureisprovidedinwhich,hiddenaway,is
ubiquityof dispersionordering(a.k.a. majorization)
a picture of the hero Waldo. He is always there, but
concepts and,perhaps,toinspireresearchers to seek
he is often hard to find. Similarly we can view vari-
evenmoreresearchnichesinwhichmajorizationand
ous areas of statistical research and/or applications
Schur convexity will play a useful role.
asbeingrathercomplicated scenesinwhichperhaps
Waldo,a.k.a.majorization,maywellbelurking.The
2. SOME NEEDED DEFINITIONS
search begins.
We will say that a vector x Rn majorizes an-
other vector y Rn and write∈x y if for each 3. COVERING A CIRCLE WITH RANDOMLY
∈ ≻
k=1,2,...,n 1 we have PLACED ARCS
−
k k
Suppose that n arcs of lengths ℓ ,ℓ ,...,ℓ are
x y 1 2 n
i:n≤ i:n placed independently and uniformly on the unit cir-
Xi=1 Xi=1
cle (a circle with unit circumference). Let P(ℓ) de-
and
note the probability that the unit circle is com-
n n
pletely covered by these arcs. The problem is only
x = y .
i:n i:n
interesting when the total length of the arcs L =
Xi=1 Xi=1
n ℓ exceeds 1, the circumference of the circle.
In the above we denote the ordered coordinates of a i=1 i
WPe therefore assume that L>1. In the special case
veActofrunxc∈tiRonngb:yRxn1:n≤Rx2is:nsa≤id··t·o≤bxen:Snc.hur con- in which the arcs are of equal lengths (say ℓ¯=L/n),
→ the required probability was provided by Stevens
vex if x y implies g(x) g(y). For additional de-
≻ ≥ (1939). Specifically we have
tails and alternative characterizations of majoriza-
tion and Schur convexity, we naturally refer to Mar- n n
(3.1) P(ℓ¯1)= ( 1)k (1 kℓ¯)n−1.
shall and Olkin (1979). − (cid:18)k(cid:19) − +
In short, the vector x majorizes y if the coordi- kX=0
nates of x are more dispersed than are the coordi- At the other extreme, if one arc is of length L and
nates of y, subject to the constraint that the sum of the others of length 0, coverage is certain. It would
the coordinates of x and of y is the same. appear then that, in this situation, increasing the
ASchurconvexfunctionthenisonethatincreases variability among the ℓ ’s subject to the sum be-
i
as dispersionincreases (wheretheconceptof disper- ing equal to L, might well be associated with an
sion used is specifically linked to the majorization increase in the coverage probability. Proschan con-
order). jectured that P(ℓ) is a Schur convex function. It is
3
MAJORIZATION:HERE, THERE ANDEVERYWHERE
indeedSchurconvex butit isnot thateasy to verify. each of the k teams in the league will have played
Details were provided by Huffer and Shepp (1987). each other team agiven number,say p,oftimes.For
Not surprisingly, the argument is based on study- simplicity, we ignore such factors as home field ad-
ing the effect on P(ℓ) of making a small change in vantage and we assume that the rules of the league
two unequal ℓ ’s (to make them more alike) holding excludethepossibilityofties.Similaranalysismight
i
the other lengths fixed. Waldo is here, but he is not wellbeappliedtotaste-testingexperimentsandother
easily unmasked. paired comparison scenarios, but we will follow Joe
(1988) and focus on the sports setting.
4. WAITING FOR A PATTERN In modeling this scenario, it is convenient to con-
sider a k k matrix P = (p ) in which, for i =
If we seat a monkey at a keyboard and have him × ij 6
j,p denotes the probability that team i will beat
type letters, spaces and punctuation marks at ran- ij
team j inaparticulargame. Ofcoursewehave p +
dom,itiscommonknowledgethateventually hewill ij
p = 1, recalling our assumption that ties do not
producea perfectly typed version of the Gettysburg ji
occur. We leave the diagonal elements of P empty
Address and, for that matter, the entire contents of
so that P has n(n 1) nonnegative elements. The
the2004 edition of the Encyclopedia Brittanica. But
−
strength of a particular team, say team i, is to some
wewould have to wait arather longtime tosee this.
extent measured by its correspondingrow total p =
The mathematical formulation of the monkey’s i
p . For a given vector p of team strengths, we
activities involves observing a sequence X ,X ,... j6=i ij
1 2
cPan consider the class P(p) of all probability matri-
of independent identically distributed random vari-
ces P with only off-diagonal elements defined and
ables with possible values 1,2,...,k and associated
with row totals given by p.
positiveprobabilitiesp ,p ,...,p .LetN denotethe
1 2 k
It is reasonable to assume that if team i is better
waiting time until a particular consecutive string of
than team j (i.e., if p 0.5) and if team j is better
outcomes is observed, or one of a particular set of ij
≥
thanteamk,thenteamishouldbebetterthanteam
outcomestringsisobserved.Ifwearewaitingforthe
k.
string t ,t ,...,t where each t is a number cho-
1 2 ℓ j
Joe calls the matrix P weakly transitive if p
sen from the set 1,2,...,k, there are several ways in ij
≥
0.5 and p 0.5 imply p 0.5. A stronger con-
whichvariability canaffectthewaitingtimerandom jk ik
≥ ≥
dition is also plausible. He defines P to be strongly
variable N. The random variable will be affected by
transitive if p 0.5 and p 0.5 imply p
variability among the pi’s, the probabilities of the ij ≥ jk ≥ ik ≥
individual possible values of the X’s. It will be also max(pij,pjk).
affected by the variability among the t ’s appearing Where does majorization come into this picture?
j
in the string whose appearance we are awaiting. For Each matrix P in (p) can be rearranged as an
P
example, we might expect to have to wait longer for n (n 1)-dimensional row vector denoted by P∗.
× −
a string of ℓ consecutive like outcomes than for a We will write P Q iff P∗ Q∗ in the usual sense
≺ ≺
string of ℓ distinct outcomes. Possibilities for a role of majorization. A matrix P (p) is said to be
∈ P
for majorization abound here. minimal if Q P implies Q∗=P∗ up to rearrange-
≺
In particular, Ross (1999) considers the waiting ment. Joe (1988) verifies that any strong transitive
time N until we observe a run of k observed values P is minimal. Variations in which ties and home
of the X ’s that includes all k of the possible values field advantage are considered are also discussed in
i
of the X ’s, as a function of p=(p ,...,p ). Here Joe (1988).
i 1 k
indeeditispossibletoverifythatforeveryn,P(N >
n) is aSchur convex function of p,and consequently 6. PHASE TYPE DISTRIBUTIONS
that E(N) is also Schur convex as a function of p.
In a continuous-time Markov chain with (n+1)
The shortest waiting time is thus associated with
states, of which n states (1,2,...,n) are transient
the case in which the p ’s are all equal to 1/k.
j and state n+1 is absorbing, the time T until ab-
sorption in state n+1 is said to have a phase type
5. PAIRED COMPARISONS
distribution (Neuts, 1975). Such distributions are
The theory of paired comparisons has found con- parameterized by an initial distribution vector for
siderable application in the study of professional the chain, α=(α ,α ,...,α ) (we assume that the
1 2 n
sporting contests. At the end of a typical season probability of beginning in state n+1 is 0), and
4
B. C. ARNOLD
a matrix of intensities of transitions among the n representedamongthecapturedbutterflies.Wemay
transient states Q. The elements of Q satisfy q < well use r (and n) to help us estimate ν.
ii
0,i=1,2,...,n, and q 0,j =i. In such a setting Atypicalstochasticmodelforthisproblemisbased
ij
≥ 6
T is said to have a phase type distribution with ontheassumptionthatbutterfliesfromspeciesj,j=
parameters α and Q and we write T PH(α,Q). 1,2,...,ν,enterthetrapaccordingtoaPoisson (λ )
∼ j
A very simple example is the one in which α = process and that these Poisson processes are inde-
α∗=(1,0,0,0,...,0) and Q=Q∗ where q∗ = δ, i pendent. Define p = λ / ν λ . The probability
ii − ∀ j j i=1 i
and qi∗j =δ for j =i+1 while qij =0 otherwise. In that a particular butterflyPtrapped is from species
this situation the chain begins in state 1, and then j is then given by p ,j = 1,2,...,ν. The p ’s can
j j
successively moves through states 2,3,...,n, spend- be interpreted as measures of “catchability” of the
ing an exponential (δ) time in each state. Conse- various species. The simplest model is that of equal
quently the time to absorption, say T∗, will be a catchability (i.e., p =1/ν,j =1,2,...,ν). If we as-
j
sum of n i.i.d. exponential random variables and so sume that ν n, then, under the equal catchability
T∗ gamma(n,δ) (in queueing contexts this is of- model, a min≤imum variance unbiased estimate of ν,
∼
ten called the Erlang distribution rather than the
based on r, exists. It is given by
gamma distribution).
Wesaythataphasetypedistributionisofordern (7.1) νˆ=S(n+1,r)/S(n,r)
if n is thesmallest integer suchthat thedistribution
where S(n,x) is a Stirling number of the second
canbeidentified withtheabsorptiontimeof achain
kind.What happenswhenthespecies vary incatch-
with n transient states and one absorbing state. It
ability? In an extreme case in which one partic-
appears that, in some sense, T∗ exhibits the most
ular species is easily trapped and the others are
regular behavior of any phase type distribution of
extremely difficult to trap, we will usually observe
order n. This can be made precise in terms of what
r=1 and will consequently badly underestimate ν.
is called the Lorenz order, a natural extension of
Indeed as Nayak and Christman (1992) observe, the
majorization.
random number R of species captured has a distri-
Let denote the class of nonnegative random
L bution which is a Schur convex function of p. Thus
variableswithfinitepositiveexpectations. (Thiscan
the estimate (7.1) and other estimates which are
beextendedtoallowtherandomvariablestoassume
sensible under equal catchability will be negatively
negative values, but for our present purposes this is
biased with the bias increasing as the catchability
not needed.) For X and Y in , we will write X
L
L ≤ becomes more variable.
Y iff E(g(X/E(X))) E(g(Y/E(Y)) for every con-
≤
tinuousconvexfunctiong.Majorizationcanbeiden-
8. DISEASE TRANSMISSION
tified as a special case here by choosing X and Y to
each have n equally likely values x ,x ,...,x and
1 2 n Tong (1997) identifies an interesting majorization
y ,y ,...,y ,respectively,withE(X)=E(Y).More
1 2 n feature of a disease transmission model due to
detailed discussion of the Lorenz order on may be
Eisenberg (1991). Consider a closed population of
L
found in Arnold (1987). Aldous and Shepp (1987)
n+1 individuals. One individual (number n+1) is
showed that T∗ [with its gamma(n,δ) distribution]
susceptible to the disease but as yet is uninfected.
hasthesmallestcoefficientofvariationamongphase
The other n individuals are carriers of the disease.
type distribution of order n, that is, it minimizes
If individual n+1 has a single contact with individ-
E(( T )2). More generally, O’Cinneide (1991) ver-
E(T) uali,wedenotetheprobabilityof avoidinginfection
ified that T∗ T for any variable T that is phase
L by p ,i=1,2,...,n.
type of order≤n, thus confirming the fact that T∗ i
Itisassumedthatindividualn+1makesatotalof
exhibits the least “variability” (as measured by the
J contacts with individuals in the population in ac-
Lorenz order).
cordancewithapreferencevectorα=(α ,α ,α ,...,
1 1 2
α ), where α >0,i=1,2,...,n, and n α =1.
7. CATCHABILITY n i i=1 i
In addition, individual n+1 has a lifePstyle vector
An island community contains an unknown num- k =(k ,k ,...,k ) where the k ’s are nonnegative
1 2 J i
berν ofspeciesofbutterflies.Butterfliesaresequen- integers summing to J. For given vectors α and k,
tially trapped until n individuals have been cap- theindividualn+1proceedsasfollows. He/shefirst
tured. Denote by r, the number of distinct species picks a partnerfrom among the n carriers according
5
MAJORIZATION:HERE, THERE ANDEVERYWHERE
to the preference vector α. Thus he/she will select used to arrive at an assignment of integer-valued
individual 1 with probability α , individual 2 with numbers of seats to every party in a manner essen-
1
probability α , and so on. He/she then makes k tially reflecting proportional representation? This is
2 1
contacts with this partner. Then he/she selects a not a new problem. Several very well-known Amer-
second partner (it could be the same one) accord- ican politicians have proposed methods of round-
ing to the preference vector α and has k contacts ing for use in this situation. Balinski and Young
2
with this partner. The process terminates after all (2001)provideagoodsurveyofthemethodsusually
J = J k contacts have been made. Denote the considered. Marshall, Olkin and Pukelsheim (2002)
i=1 i
probPability of escaping infection by H(k,α,p), de- highlight the role of majorization in comparing the
pending as it does on lifestyle (k), preference (α) various candidate rounding methods. John Quincy
and variable nontransmission probabilities (p). Adams proposed a method that was kind to small
There are several possible roles for majorization parties (rounding up their representation), while at
here. Variability among the coordinates of k,α theother extreme Thomas Jefferson urgedrounding
and/or p can be expected to affect H(k,α,p). Tong down, which favors large parties. Other popular in-
(1997)focusesonthelifestylevectork.Twoextreme termediate strategies are associated with the names
lifestyles are readily identified. The first one corre- Dean, Hill and Webster.
sponds to k =(J,0,0,...,0) which could be called It is easiest to describe all of these apportion-
a monogamous style. Here a partner is randomly ment methods in terms of a sequence of signposts
chosen according to the preference vector α and all which determine rounding decisions. The signposts
contacts are made with this individual. The sec- s(k) are numbers in the interval [k,k+1] such that
ond extreme lifestyle has k=(1,1,1,...,1). In this s(k) is a strictly increasing function of k. The cor-
case each contact is made with a randomly cho- responding rounding rule is that a number in the
sen individual. Theprobability of escaping infection interval [k,k+1] is rounded down if it is less than
with k=(J,0,...,0) is clearly n α pJ while the s(k) and is rounded up if it is greater than s(k). If
i=1 i i
probability of escaping infectioPn using the lifestyle the number is exactly equal to s(k), then we may
(1,1,1,...,1) is ( n α p )J. It follows via Jensen’s round up or down. So-called power-mean signpost
i=1 i i
inequality that oPne has a larger probability of es- sequences have been popular. They are of the form
caping infection with the “monogamous” lifestyle
kp (k+1)p 1/p
(J,0,...,0) thanwith the“random” lifestyle (1,1,1, s (k)= + ,
p
...,1). This holds for every α and every p. But of (9.1) (cid:18) 2 2 (cid:19)
coursethesetwo lifestyles areextremecases withre- p .
−∞≤ ≤∞
gard to majorization. It is then quite plausible that
The five most popular apportionment methods can
the probability of escaping infection is a Schur con-
all be viewed as having been based on a particu-
vex function of the lifestyle vector k. Indeed, Tong
lar power-mean signpost sequence. The Adams rule
(1997) confirms this conjecture. He also is able to
(rounding up) corresponds to p = , the Dean
get some results when the number J of contacts is a −∞
rulecorrespondstop= 1,theHillrulecorresponds
random variable. Several interesting aspects of this −
to p=0, the Webster rule to p=1 and finally the
problem remain open.
Jefferson rule (rounding down) corresponds to p=
. Marshall, Olkin and Pukelsheim (2002) show
9. APPORTIONMENT IN PROPORTIONAL ∞
that the seating vector produced by a power-mean
REPRESENTATION
rounding rule of order p will always be majorized
Theidealofoneman–onevoteisoftenapproached by the seating vector produced by a power-mean
by the device of proportional representation. Thus rounding rule of order p′ if and only if p p′. Con-
≤
if there are N seats available and if a political party sequently, among the five popular apportionment
received 100q% of the votes, then ideally that party rules,thechange whenmoving fromthe Adamsrule
should be assigned Nq seats. But fractional seats toward the Jefferson rule is a change in favor of
cannot be assigned (or better yet are not assigned, largepartiesinamajorizationsense.Themovefrom
since there seems to be no reason why they could an Adams apportionment toward a Jefferson appor-
not be assigned, except perhaps for aesthetic con- tionmentcanactually beaccomplishedbyaseriesof
siderations). Which method of rounding should be single seat reassignments from a poorer party (with
6
B. C. ARNOLD
fewervotes)toaricherparty(withmorevotes)[par- X(n) are independent identically distributed ran-
alleling reverse Robin Hood (a.k.a. Pigou–Dalton) dom variables each with possible values 1,2,...,n
income transfers in an economic setting]. and with common distribution defined by
(11.1) P(X(i)=j)=p , j=1,2,...,n,
j
10. MAJORIZATION IN STATISTICAL
MECHANICS where pj ≥0,∀j and nj=1pj =1. We construct the
randomgraphbydrawPingthenrandomarcs(i,X(i)),
The state space of a physical system, S , can be
n i=1,2,...,n.Inthismanner,onearcemanatesfrom
identified with the set of all probability vectors p=
each node. However, of course, several arcs can ter-
(p1,p2,...,pn)′ wherepi≥0 and ni=1pi=1. Ause- minate at the same node. The resulting graph will
ful partial order in this contextPis related to the have a random number of connected components. A
information content of the states. For two states connected component of the graph is a set of nodes
p and q, it is prescribed that p q iff there exists such that any pair of them is linked by an arc in the
≺
a doubly stochastic matrix T with p=Tq. But of graph, and there are no arcs joining any nodes in
course, appealing to the classical result of Hardy, the set with any node outside the set. Let us de-
Littlewood and Po´lya (1929), this is in fact the ma- note the random number of such connected sub-
jorization partial order (and the notation is thus sets by M. The distribution of M will of course
consistent with our usage in earlier sections of this be influenced by the probability vector p, appear-
paper). In this context separable concave functions ing in (11.1), which governs the distribution of the
are called generalized entropies. random arcs X(1),X(2),...,X(n).
A related partial order is defined on k-tuples of Forexample,ifp=(1,0,0,...,0),thenallarcswill
states. For two k-tuples (p ,p ,...,p ) and (q ,q , terminate at node 1 and there will be a single con-
1 2 k 1 2
...,q ) we define nected subsetof nodesin the randomgraph,that is,
k
M =1.
(p ,p ,...,p ) (k)q ,q ,...,q ) Thefollowing expression for the expected value of
1 2 k ≺ 1 2 k
M is provided by Ross:
iff there exists a stochastic matrix T such that p =
i
Tq ,i=1,2,...,k. In particular when k=2, a par- (11.2) E(M)= (S 1)! p
j
i | |−
tial ordering defined with respect to a reference XS jY∈S
state s becomes of interest. The partial order rel- where the summation extends over all nonempty
ative to s is defined by subsetsof 1,2,...,n .Itisthen possible,usingthis
{ }
(10.1) p q iff (p,s) (2)(q,s). expression, to verify that E(M) is a Schur concave
s
≺ ≺ function of p. Consequently the expected number of
It may be noted that if s is chosen to be equal to connected components of the graph is maximized if
e=(1,..., 1), then the corresponding partial order p =1/n,j=1,2,...,n.
n n j
(relative to e) coincides with the usual majorization
order. Thus the partial ordering is a genuine ex- 12. A STOCHASTIC RELATION BETWEEN
s
≺
tension of the classical majorization order. THE SUM OF TWO RANDOM VARIABLES
Dynamic processes in the state space S can be AND THEIR MAXIMUM
n
identified with indexed families of stochastic matri-
Suppose that X = (X ,X ) is a random vector
1 2
ces. Such processes which preserve the s-partial or-
withnonnegativecoordinaterandomvariablesX ,X .
1 2
der have been studied in some detail. A convenient
It is often of interest to compare the tail behavior
introductory reference is Zylka (1985).
of X ,X with that of max(X ,X ). In the context
1 2 1 2
Schurconvexfunctionsandanalogouss-Schurcon- of construction of confidence intervals for the differ-
vex functions turn out to have useful thermody- ence between normal means with unequal variances
namic interpretation in this context. (aBehrens–Fishersetting),Dalal andFortini(1982)
identified a sufficient condition for stochastic order-
11. CONNECTED COMPONENTS ing between X +X and √2max(X ,X ) that in-
1 2 1 2
IN A RANDOM GRAPH volves Schur convexity. Specifically they prove that
a sufficient condition for
Ross (1981) considers a random graph with nodes
numbered 1,2,...,n. Suppose that X(1),X(2),..., P(X +X c) P(√2max(X ,X ) c)
1 2 1 2
≤ ≥ ≤
7
MAJORIZATION:HERE, THERE ANDEVERYWHERE
for any c 0, is that the joint density of (X ,X ), REFERENCES
1 2
≥
say f(x ,x ), is such that f(√x ,√x ) is a Schur
1 2 1 2 Aldous, D. and Shepp, L. (1987). The least variable phase
convex function of x. The proof involves condition-
typedistributionisErlang.Comm.Statist.StochasticMod-
ing on X12+X22 and observing that on any curve els 3 467–473. MR0925936
x21+x22=t, the joint density f(x1,x2) increases as Arnold, B. C. (1987). Majorization and the Lorenz Order:
one moves away from the line x =x . A Brief Introduction. Springer, Berlin. MR0915159
1 2
An important special case in which the hypothe- Balinski, M. L. and Young, H. P. (2001). Fair Represen-
tation Meeting the Idea of One Man, One Vote, 2nd ed.
ses are satisfied is the situation in which (X ,X )=
1 2
Brookings InstitutePress, Washington, DC. MR0649246
(Y , Y ) where Y N(2)(0,σ2 1 ρ ).
| 1| | 2| ∼ ρ 1 Dalal,S.R.andFortini,P.(1982).Aninequalitycompar-
A related n-dimensional result(cid:0)is a(cid:1)lso provided by ing sums and maxima with application to Behrens–Fisher
Dalal and Fortini (1982). They show that if X1,X2, typeproblem. Ann. Statist. 10 297–301. MR0642741
...,X arei.i.d.positiverandomvariableswithcom- Eisenberg, B. (1991). The effect of variable infectivity on
n
mondensityf andiflogf(√x)isconcaveandf(x)/x therisk of HIV infection. Statist. Medicine 9 131–139.
Hardy, G. H., Littlewood, J. E. and Po´lya, G. (1929).
is nonincreasing, then
Some simple inequalities satisfied by convex functions.
n Messenger of Mathematics 58 145–152.
X √nmax(X ,X ,...,X ). Hardy, G. H., Littlewood, J. E. and Po´lya, G. (1934).
i st 1 2 n
≤
Xi=1 Inequalities. Cambridge Univ.Press.
Huffer,F. W.andShepp,L.A.(1987).Ontheprobability
13. FURTHER EXAMPLES of covering the circle by random arcs. J. Appl. Probab. 24
422–429.MR0889806
The list could be continued. Schur convexity and Joe, H. (1988). Majorization, entropy and paired compar-
majorization can be found in many other settings. isons. Ann. Statist. 16 915–925. MR0947585
Marshall,A.W.andOlkin,I.(1979).Inequalities:Theory
To conclude our short survey, we will merely men-
ofMajorizationandItsApplications.AcademicPress,New
tion briefly a few more interesting settings in which
York.MR0552278
Waldo appears:
Marshall, A. W., Olkin, I. and Pukelsheim, F. (2002).
A majorization comparison of apportionment methods in
(i) the study of peakedness of univariate and
proportionalrepresentation.Soc.ChoiceWelf.19885–900.
multivariate distributions,
MR1935010
(ii) admissibility of tests in multivariate analysis Nayak, T. K. and Christman, M. C. (1992). Effect of un-
of variance, equalcatchability on estimates ofthenumberofclasses in
(iii) probabilitycontentofregionsforaSchurcon- a population. Scand. J. Statist. 19 281–287. MR1183202
Neuts, M. F. (1975). Computational uses of the method of
cave joint density,
phases in the theory of queues. Comput. Math. Appl. 1
(iv) the study of diversity in ecological environ-
151–166.MR0386055
ments,
O’Cinneide,C.A.(1991).Phase-typedistributionsandma-
(v) income and wealth inequality measurement jorization. Ann. Appl. Probab. 1 219–227. MR1102318
(with multivariate extensions). Ross, S. (1981). A random graph. J. Appl. Probab. 18 309–
315. MR0598950
As observed in the Introduction, there are many Ross,S.(1999).Themeanwaitingtimeforapattern.Probab.
more examples in the literature and there is no rea- Engrg. Inform. Sci. 13 1–9. MR1666364
sontobelievethatthesearchfornewapplicationsof Stevens,W.L.(1939).Solutiontoageometricalproblemin
probability.Ann. Eugenics 9 315–320. MR0001479
majorization and Schur convexity will falter in the
Tong, Y. L. (1997). Some majorization orderings of hetero-
next 25 years. When the Inequalities volume cele-
geneity in a class of epidemics. J. Appl. Probab. 34 84–93.
bratesitsgoldenjubilee,anevenmoreextensiveand
MR1429057
fascinating array of appearances can be confidently Zylka, C. (1985). A note on the attainability of states by
predicted.ThesearchforWaldowillcontinueapace. equalizing processes. Theor. Chim. Acta 68 363–377.