Table Of ContentEpidemics and percolation in small-world networks
Cristopher Moore1,2 and M. E. J. Newman1
1Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501
2Departments of Computer Science and Physics, University of New Mexico, Albuquerque, New Mexico 87131
0 We study some simple models of disease transmission on small-world networks, in which either
0 the probability of infection by a disease or the probability of its transmission is varied, or both.
0 Theresultingmodelsdisplayepidemicbehaviorwhentheinfectionortransmissionprobabilityrises
2 above the threshold for site or bond percolation on the network, and we give exact solutions for
the position of this threshold in a variety of cases. We confirm our analytic results by numerical
n
a simulation.
J
7
I. INTRODUCTION They solved this model for the one-dimensional small-
]
h world graph, and the solution was later generalized to
c Ithaslongbeenrecognizedthatthestructureofsocial higher dimensions [8] and to finite-sized lattices [9]. In-
e networksplaysanimportantrole in the dynamics of dis- fectionwith100%efficiencyisnotaparticularlyrealistic
m
ease propagation. Networks showing the “small-world” model however, even for spectacularly virulent diseases
- effect [1,2], where the number of “degrees of separation” like Ebola fever, so Newman and Watts also suggested
t
a between any two members of a givenpopulation is small using a site percolation model for disease spreading in
st bycomparisonwiththesizeofthepopulationitself,show which some fraction p of the population are considered
t. much faster disease propagation than, for instance, sim- susceptible to the disease, and an initial outbreak can
a ple diffusion models on regular lattices. spread only as far as the limits of the connected cluster
m
Milgram [3] was one of the first to point out the exis- of susceptible individuals in which it first strikes. An
- tence of small-world effects in real populations. He per- epidemic can occur if the system is at or above its per-
d
n formed experiments which suggested that there are only colation threshold where the size of the largest cluster
o aboutsixintermediateacquaintancesseparatinganytwo becomes comparable with the size of the entire popula-
c peopleontheplanet,whichinturnsuggeststhatahighly tion. Newman and Watts gave an approximate solution
[ infectiousdiseasecouldspreadtoallsixbillionpeopleon for the fraction pc of susceptible individuals at this epi-
2 the planet in only about six incubation periods of the demic point, as a function of the density of shortcuts on
v disease. thelattice. Inthispaperwederiveanexactsolution,and
2 Early models of this phenomenon were based on ran- alsolookatthe caseinwhichtransmissionbetweenindi-
9 dom graphs [4,5]. However, random graphs lack some viduals takes place with less than 100%efficiency, which
4 of the crucial properties of real social networks. In par- can be modeled as a bond percolation process.
1
ticular, social networks show “clustering,” in which the
1
probability of two people knowing one another is greatly
9 II. SITE PERCOLATION
9 increased if they have a common acquaintance [6]. In
/ a random graph, by contrast, the probability of there
t
a being a connection between any two people is uniform, Twosimple parametersofinterestinepidemiologyare
m regardless of which two you choose. susceptibility, the probability that an individual exposed
- Watts and Strogatz [6] have recently suggested a new to a disease will contract it, and transmissibility, the
d “small-world model” which has this clustering property, probability that contact between an infected individual
n
and has only a small averagenumber of degrees of sepa- andahealthybutsusceptibleonewillresultinthe latter
o
rationbetweenanytwoindividuals. Inthis paperweuse contracting the disease. In this paper, we assume that a
c
: a variant of the Watts–Strogatz model [7] to investigate disease begins with a single infected individual. Individ-
v
diseasepropagation. Inthis variant,the populationlives uals are represented by the sites of a small-world model
i
X on a low-dimensional lattice (usually a one-dimensional andthediseasespreadsalongthebonds,whichrepresent
r one) where each site is connected to a small number of contacts between individuals. We denote the sites as be-
a neighboring sites. A low density of “shortcuts” is then ing occupied or not depending on whether an individual
addedbetweenrandomlychosenpairsofsites,producing issusceptibletothedisease,andthebondsasbeingoccu-
much shorter typical separations, while preserving the piedornotdependingonwhetheracontactwilltransmit
clustering property of the regular lattice. thediseasetoasusceptibleindividual. Ifthedistribution
NewmanandWatts[7]gaveasimpledifferentialequa- of occupied sites or bonds is random, then the problem
tion model for disease propagation on an infinite small- ofwhenanepidemic takesplace becomes equivalentto a
worldgraphinwhichcommunicationofthediseasetakes standard percolation problem on the small-world graph:
place with 100% efficiency—all acquaintances of an in- whatfractionpcofsitesorbondsmustbeoccupiedbefore
fected person become infected at the next time-step. a “giantcomponent” ofconnected sites forms whose size
1
canbereachedbytravelingalongasingleshortcut. Then
weaddalllocalclusterswhichcanbereachedfromthose
newonesbytravelingalongasingleshortcut,andsoforth
until the connected cluster is complete. Let us define a
vector v at each step in this process, whose components
v areequaltotheprobabilitythatalocalclusterofsizei
i
hasjustbeen addedto the overallconnectedcluster. We
wish to calculate the value v′ of this vector in terms of
itsvaluevatthepreviousstep. Atorbelowthe percola-
′
tion threshold the v are small and so the v will depend
i i
linearly on the v according to a transition matrix M
i
thus:
′
v = M v , (4)
i ij j
FIG.1. A small-world graph with L=24, k=1, and four Xj
shortcuts. The colored sites represented susceptible individ-
uals. The susceptibility is p= 3 in thisexample. where
4
M =N (1−(1−ψ)ij). (5)
ij i
scales extensively with the total number L of sites [10]?
HereN isthenumberoflocalclustersofsizeiasbefore,
i
We will start with the site percolation case, in which
and 1−(1−ψ)ij is the probability of a shortcut from a
every contact of a healthy but susceptible person with
local cluster of size i to one of size j, since there are ij
an infected person results in transmission, but less than
possible pairs of sites by which these can be connected.
100% of the individuals are susceptible. The fraction p
Note that M is not a stochastic matrix, i.e., it is not
of occupied sites is precisely the susceptibility defined
normalized so that its rows sum to unity.
above. Now consider the largest eigenvalue λ of M. If λ<1,
Consider a one-dimensional small-world graph as in iteratingEq.(4)makesvtendtozero,sothattherateat
Fig. 1. L sites are arrangedon a one-dimensional lattice
whichnewlocalclustersareaddedfallsoffexponentially,
withperiodicboundaryconditionsandbondsconnectall
andthe connectedclustersarefinite withanexponential
pairs of sites which are separated by a distance of k or size distribution. Conversely, if λ > 1, v grows until
less. (For simplicity we have chosen k=1 in the figure.)
the size of the cluster becomes limited by the size of the
Shortcutsarenowaddedbetweenrandomlychosenpairs
whole system. The percolation threshold occurs at the
ofsites. Itisstandardtodefinetheparameterφtobethe
point λ=1.
averagenumber of shortcuts per bond on the underlying
For finite L finding the largest eigenvalue of M is dif-
lattice. The probability that two randomly chosen sites
ficult. However, if φ is held constant, ψ tends to zero as
have a shortcut between them is then L→∞, so for large L we can approximate M as
kφL
2 2kφ
ψ =1− 1− ≃ (1) Mij =ijψNi. (6)
(cid:20) L2(cid:21) L
Thismatrixistheouterproductoftwovectors,withthe
for large L.
result that Eq. (4) can be simplified to
A connected cluster on the small-world graph con-
sists of a number of local clusters—occupied sites which
λv =iψN jv , (7)
i i j
are connected together by the near-neighbor bonds on
Xj
theunderlyingone-dimensionallattice—whicharethem-
selves connected together by shortcuts. For k = 1, the where we have set v′ =λv . Thus the eigenvectors of M
i i
average number of local clusters of length i is have the form v = Cλ−1iψN where C = jv is a
i i j j
N =(1−p)2piL. (2) constant. Eliminating C then gives P
i
λ=ψ j2N . (8)
For general k we have j
Xj
N =(1−p)2kp(1−(1−p)k)i−1L
i For k =1, this gives
=(1−q)2pqi−1L, (3)
1+p 1+p
where q =1−(1−p)k. λ=ψLp1−p =2φp1−p. (9)
Now we build a connected cluster out of these local
clusters as follows. We start with one particular local Setting λ = 1 yields the value of φ at the percolation
cluster, and first add to it all other local clusters which threshold p :
c
2
φ= 2pc1(−1+pcpc), (10) Qi+1,j =(cid:26)p(1(1−−pp)2)(cid:2)QQjji+PkQjk(cid:3) ffoorr ii≥=10, (15)
and solving for p gives and
c
4φ2+12φ+1−2φ−1 Qi+1 =p(2−p)Qi+p(1−p) Qij +p2 Qjj′.
pc = p 4φ . (11) Xj j+Xj′=i
(16)
For general k, we have
If we define generating functions H(z) = Q zi and
i i
λ=ψLp1+q =2kφp2−(1−p)k, (12) H(z,w)= i,jQijziwj, this gives us P
1−q (1−p)k P
H(z,w)=z(1−p)2 H(w)+H(w,1)
or, at the threshold +zp(1(cid:2)−p)H(w,z), (cid:3) (17)
H(z)=zp(2−p)H(z)+zp(1−p)H(z,1)
(1−p )k
c
φ= . (13) +zp2H(z,z). (18)
2kp (2−(1−p )k)
c c
Since any pair of adjacent sites must belong to some
The threshold density p is then a root of a polynomial
c cluster or clusters (possibly of size one), the proba-
of order k+1.
bilities Q and Q must sum to unity according to
i ij
Q + Q =1,orequivalentlyH(1)+H(1,1)=1.
i i i,j ij
III. BOND PERCOLATION PFinally, Pthe density of clusters of size i is equal to the
probability that a randomly chosen site is the rightmost
site of such a cluster, in which case neither of the two
An alternative model of disease transmission is one in
bonds to its right are occupied. Taken together, these
which all individuals are susceptible, but transmission
results imply that the generating function for clusters,
takes place with less than 100%efficiency. This is equiv- G(z)= N zi, must satisfy
alent to bond percolation on a small-world graph—an i i
P
epidemicsetsinwhenasufficientfractionpc ofthebonds G(z)=(1−p)2 H(z)+H(z,1) . (19)
on the graph are occupied to cause the formation of a
(cid:2) (cid:3)
giant component whose size scales extensively with the Solving Eqs. (17) and (18) for H(z) and H(z,1) then
sizeofthegraph. Inthismodelthefractionpofoccupied gives
bonds is the transmissibility of the disease.
G(z)= z(1−p)4 1−2pz+p3(1−z)z+p2z2
Fork =1,thepercolationthresholdp forbondperco-
c
lationisthe sameasforsitepercolationforthefollowing (cid:2) 1−4pz(cid:0)+p5(2−3z)z2−p6(1−z)z(cid:1)2(cid:3)(cid:14)
reason. On the one hand, a local cluster of i sites now (cid:2) +p4z2(1+3z)+p2z(4+3z)
consists of i − 1 occupied bonds with two unoccupied
−p3z 1+5z+z2 , (20)
ones at either end, so that the number of local clusters
of i sites is (cid:0) (cid:1)(cid:3)
the first few terms of which give
Ni =(1−p)2pi−1, (14) N1/L=(1−p)4, (21)
whichhasonelessfactorofpthaninthe sitepercolation N2/L=2p(1−p)6, (22)
case. On the other hand, the probability of a shortcut N3/L=p2(1−p)6(6−8p+3p2). (23)
between two random sites now has an extra factor of p
Again replacing ψ with ψp, Eq. (8) becomes
init—itisequaltoψpinsteadofjustψ. Thetwofactors
of p cancel and we end up with the same expression for 2
d
the eigenvalue of M as before, Eq. (9), and the same λ=ψp i2N =2kφp z G(z) , (24)
i
(cid:20)(cid:18) dz(cid:19) (cid:21)
threshold density, Eq. (11). Xi z=1
Fork >1,calculatingN isconsiderablymoredifficult.
i which,settingk =2,impliesthatthepercolationthresh-
We will solve the case k = 2. Let Q denote the proba-
i old p is given by
bility that a given site n and the site n−1 to its left are c
part of the same local cluster of size i, when only bonds (1−p )3(1−p +p2)
to the left of site n are taken into account. Similarly, let φ= 4p (1+3p2−c3p3−2pc4+5cp5−2p6). (25)
Q be the probability that n and n−1 are part of two c c c c c c
ij
separate local clusters of size i and j respectively, again In theory it is possible to extend the same method to
when only bonds to the left of n are considered. Then, larger values of k, but the calculation rapidly becomes
by considering site n+1 which has possible connections tedious and so we will, for the moment atleast, moveon
to both n and n−1, we can show that to other questions.
3
IV. SITE AND BOND PERCOLATION
1.0
The most general disease propagation model of this
typeisoneinwhichboththesusceptibilityandthetrans- 0.8
missibility take arbitrary values, i.e., the case in which d pc
sites and bonds are occupied with probabilities psite and ol
pbondrespectively. Fork =1,alocalclusterofsizeithen esh 0.6
consists of i susceptible individuals with i−1 occupied hr
bonds between them, so that n t
atio 0.4
Ni =(1−psitepbond)2pisitepib−on1d. (26) col
r
e
p 0.2
Replacing ψ with ψpbond in Eq. (8) gives
1+p
λ=ψpbond j2Nj =2φp1−p, (27) 0.010−4 10−3 10−2 10−1 10−3 10−2 10−1 100
Xj
where p = psitepbond. In other words, the position of shortcut density φ
the percolation transition is given by precisely the same
expression as before except that p is now the product FIG.2. Thepointsarenumericalresultsforthepercolation
of the site and bond probabilities. The critical value threshold as a function of shortcut density φ for systems of
of this product is then given by Eq. (11). The case of sizeL=106. Leftpanel: sitepercolationwithk=1(circles),
k > 1 we leave as an open problem for the interested 2(squares), and5 (triangles). Right panel: bond percolation
(and courageous)reader. with k = 1 (circles) and 2 (squares). Each point is averaged
over1000 runsand theresulting statistical errors are smaller
than the points. The solid lines are the analytic expressions
V. NUMERICAL CALCULATIONS for the same quantities, Eqs. (11), (13), and (25). The slight
systematic tendency of the numerical results to overestimate
thevalueofpc isafinite-sizeeffectwhichdecreasesbothwith
We have performed extensive computer simulations of
increasing system size and with increasing φ [12].
percolation and disease spreading on small-world net-
works, both as a check on our analytic results, and to
investigate further the behavior of the models. First, we epidemicsfirstappearcanalsobe obtainedbynumerical
have measured the position of the percolation threshold simulation. In Fig. 3 we show results for the number of
for both site and bond percolation for comparison with new cases of a disease appearing as a function of time in
ouranalyticresults. Anaivealgorithmfordoingthisfills simulations of the site-percolation model. (Very similar
in either the sites or the bonds of the lattice one by one results are found in simulations of the bond-percolation
insomerandomorderandcalculatesateachstepthesize model.) Inthesesimulationswetookk =5andavalueof
of either the average or the largest cluster of connected φ=0.01fortheshortcutdensity,whichimplies,following
sites. The position of the percolationthreshold can then Eq.(13),thatepidemicsshouldappearifthesusceptibil-
be estimated from the point at which the derivative of itywithinthepopulationexceedsp =0.401. Thecurves
c
this size with respect to the number of occupied sites or in the figure are (from the bottom upwards) p = 0.40,
bonds takes its maximum value. Since there are O(L) 0.45, 0.50, 0.55, and 0.60, and, as we can see, the num-
sites or bonds on the network in total and finding the ber of new cases of the disease for p=0.40 shows only a
clusters takestime O(L), suchanalgorithmrunsin time smallpeakofactivity(barelyvisiblealongthe loweraxis
O(L2). A more efficient way to perform the calculation of the graph) before the disease fizzles out. Once we get
is, rather than recreating all clusters at each step in the above the percolation threshold (the upper four curves)
algorithm,to calculate the new clusters fromthe ones at a large number of cases appear, as expected, indicating
the previous step. By using a tree-like data structure to the onset of epidemic behavior. In the simulations de-
storetheclusters[11],onecaninthiswayreducethetime picted, epidemic disease outbreaks typically affected be-
needed to find the value of pc to O(LlogL). In Fig. 2 tween50%and100%ofthesusceptibleindividuals,com-
we show numericalresults for pc fromcalculations ofthe pared with about 5% in the sub-critical case. There is
largest cluster size using this method for systems of one also a significant tendency for epidemics to spread more
million sites with various values of k, for both bond and quickly (and in the case of self-limiting diseases presum-
site percolation. As the figure shows, the results agree ably also to die out sooner) in populations which have a
well with our analytic expressions for the same quanti- higher susceptibility to the disease. This arises because
ties over several orders of magnitude in φ. inmore susceptible populations there aremorepaths for
Directconfirmationthatthepercolationpointinthese theinfectiontotakefromaninfectedindividualtoanun-
modelsdoesindeedcorrespondtothethresholdatwhich
4
these models, confirming both the position of the perco-
30000
lation threshold and the fact that epidemics take place
d 60 above this threshold only.
g e
ppearin 20000 age infect 40 icnolFpairtniinoanclliypt,hlewregesihvpeoolaidnntoenoxuaatcsttmhreaastlul-tlwthoeformlrdtehgthreaospditheuwsoeirtdhbhoanenrdyepcueanrn--
cases a percent 200 dloecralylicnlgusltaetrtsicaesfoarfwunhcictihonweofcathneciralsciuzela.teIf,thfoerdiennsstiatnycoef,
w 0 50 100 150 200 one could enumerate lattice animals on lattices of two
e
n time t
of 10000 or more dimensions, then the exact percolation thresh-
r oldforthecorrespondingsmall-worldmodelwouldfollow
e
b
m immediately.
u
n
0 ACKNOWLEDGMENTS
0 50 100 150 200
time t
We thank Duncan Watts for helpful conversations.
This work was supported in part by the Santa Fe In-
FIG. 3. The number of new cases of a disease appearing
stitute and DARPA under grant number ONR N00014-
as a function of time in simulations of the site-percolation
95-1-0975.
model with L = 106, k = 5, and φ = 0.01. The top four
curves are for p=0.60, 0.55, 0.50, and 0.45, all of which are
above the predicted percolation threshold of pc = 0.401 and
show evidence of the occurrence of substantial epidemics. A
fifth curve, for p = 0.40, is plotted but is virtually invisible
next to the horizontal axis because even fractionally below
the percolation threshold no epidemic behavior takes place.
[1] M. Kocken, The Small World. Ablex, Norwood, NJ
Eachcurveisaveragedover1000repetitionsofthesimulation.
(1989).
Inset: the total percentage of the population infected as a
[2] D. J. Watts, Small Worlds. Princeton University Press,
function of time in thesame simulations.
Princeton (1999).
[3] S. Milgram, “The small world problem,” Psychology To-
infected one. The amount of time an epidemic takes to day2, 60–67 (1967).
[4] L. Sattenspiel and C. P. Simon, “The spread and per-
spreadthroughoutthepopulationisgivenbytheaverage
sistenceofinfectiousdiseasesinstructuredpopulations,”
radius of (i.e., path length within) connected clusters of
Mathematical Biosciences 90, 367–383 (1988).
susceptible individuals, a quantity which has been stud-
[5] M.KretschmarandM.Morris,“Measuresofconcurrency
ied in Ref. [7].
in networks and the spread of infectious disease,” Math-
In the inset of Fig. 3 we show the overall (i.e., inte-
ematical Biosciences 133, 165–195 (1996).
grated)percentage of the population affected by the dis-
[6] D. J. Watts and S. H. Strogatz, “Collective dynamics of
easeasafunctionoftimeinthesamesimulations. Asthe ‘small-world’ networks,” Nature 393, 440–442 (1998).
figure shows, this quantity takes a sigmoidal form simi-
[7] M. E. J. Newman and D. J. Watts, “Scaling and perco-
lartothatseenalsoinrandom-graphmodels[4,5],simple
lation in the small-world network model,” Phys. Rev. E
small-world disease models [7], and indeed in real-world 60, 7332–7342 (1999).
data. [8] C. Moukarzel, “Spreading and shortest pathsin systems
with sparse long-range connections,” Phys. Rev. E, in
press (2000).
VI. CONCLUSIONS
[9] M. E. J. Newman, C. Moore, and D. J. Watts,
“Mean-field solution of thesmall-world network model,”
We have derived exact analytic expressions for the cond-mat/9909165.
percolation threshold on one-dimensional small-world [10] Technically this point is not a percolation transition,
graphs under both site and bond percolation. These since it is not possible to create a system-spanning
(i.e., percolating) cluster on a small-world graph. To be
results provide simple models for the onset of epidemic
strictly correct, we should refer tothe transition as “the
behavior in diseases for which either the susceptibility
point at which a giant component first forms.” This is
or the transmissibility is less than 100%. We have also
something of a mouthful however, so we will bend the
looked briefly at the case of simultaneous site and bond
rules alittleand continuetotalk about “percolation” in
percolation, in which both susceptibility and transmis-
this paper.
sibility can take arbitrary values. We have performed
[11] G.T.BarkemaandM.E.J.Newman,“NewMonteCarlo
extensive numerical simulations of disease outbreaks in
algorithms for classical spin systems,” in Monte Carlo
5
Methods in Chemical Physics, D. Ferguson, J. I. Siep-
mann,andD.G.Truhlar(eds.),Wiley,NewYork(1999).
[12] We have also experimented with finite-size scaling as a
methodforcalculatingpc.Suchcalculationsarehowever
prone to inaccuracy because, as pointed in Ref. [7], the
correlation length scaling exponentν,which governs the
behaviorofpc assystemsizeisvaried,dependsontheef-
fectivedimension ofthegraph,whichitselfchangeswith
system size. Thustheexponentcan at best only be con-
sideredapproximatelyconstantinafinite-sizescalingcal-
culation.Ratherthanintroduceunknownerrorsintothe
value of pc as a result of this approximation, we have
opted in thepresent work to quote direct measurements
of pc on large systems as ourbest estimate of theperco-
lation threshold.
6