Statistical Science
2010, Vol. 25, No. 3, 275–288
DOI: 10.1214/10-STS335
© Institute of Mathematical Statistics, 2010
Connected Spatial Networks over Random
Points and a Route-Length Statistic
David J. Aldous and Julian Shun
Abstract. We review mathematically tractable models for connected networks on random points in the plane, emphasizing the class of proximity graphs which deserves to be better known to applied probabilists and statisticians. We introduce and motivate a particular statistic R measuring shortness of routes in a network. We illustrate, via Monte Carlo in part, the trade-off between normalized network length and R in a one-parameter family of proximity graphs. How close this family comes to the optimal trade-off over all possible networks remains an intriguing open question.

The paper is a write-up of a talk developed by the first author during 2007–2009.

Key words and phrases: Proximity graph, random graph, spatial network, geometric graph.
David J. Aldous is Professor, Department of Statistics, University of California, 367 Evans Hall #3860, Berkeley, California 94720, USA (e-mail: [email protected]; URL: www.stat.berkeley.edu/users/aldous). Julian Shun is Graduate Student, Machine Learning Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA (e-mail: [email protected]).

1. INTRODUCTION

The topic called random networks or complex networks has attracted huge attention over the last 20 years. Much of this work focuses on examples such as social networks or WWW links, in which edges are not closely constrained by two-dimensional geometry. In contrast, in a spatial network not only are vertices and edges situated in two-dimensional space, but also it is actual distances, rather than number of edges, that are of interest. To be concrete, we visualize idealized intercity road networks, and a feature of interest is the (minimum) route length between two given cities. Because we work only in two dimensions, the word spatial may be misleading, but equally the word planar would be misleading because we do not require networks to be planar graphs (if edges cross, then a junction is created).

Our major purpose is to draw the attention of readers from the applied probability and statistics communities to a particular class of spatial network models. Recall that the most studied network model, the random geometric graph [40] reviewed in Section 2.1, does not permit both connectivity and bounded normalized length in the $n \to \infty$ limit. An attractive alternative is the class of proximity graphs, reviewed in Section 2.3, which in the deterministic case have been studied within computational geometry. These graphs are always connected. Proximity graphs on random points have been studied in only a few papers, but are potentially interesting for many purposes other than the specific "short route lengths" topic of this paper (see Section 6.5). One could also imagine constructions which depend on points having specifically the Poisson point process distribution, and one novel such network, which we name the Hammersley network, is described in Section 2.5.

Visualizing idealized road networks, it is natural to take total network length as the "cost" of a network, but what is the corresponding "benefit"? Primarily we are interested in having short route lengths. Choosing an appropriate statistic to measure the latter turns out to be rather subtle, and the (only) technical innovation of this paper is the introduction (Section 3.2) and motivation of a specific statistic R for measuring the effectiveness of a network in providing short routes.

In the theory of spatial networks over random points, it is a challenge to quantify the trade-off between network length [precisely, the normalized length L defined at (2)] and route length efficiency statistics such
as R. Our particular statistic R is not amenable to explicit calculation even in comparatively tractable models, but in Section 4 we present the results from Monte Carlo simulations. In particular, Figure 7 shows the trade-off for the particular β-skeleton family of proximity graphs.

Given a normalized network length L, for any realization of cities there is some network of normalized length L which minimizes R. As indicated in Section 5, by general abstract mathematical arguments, there must exist a deterministic function $R_{\mathrm{opt}}(L)$ giving (in the "number of cities $\to \infty$" limit under the random model) the minimum value of R over all possible networks of normalized length L. An intriguing open question is as follows:

    how close are the values $R^{\beta\text{-skel}}(L)$ from the β-skeleton proximity graphs to the optimum values $R_{\mathrm{opt}}(L)$?

As discussed in Section 5.3, at first sight it looks easy to design heuristic algorithms for networks which should improve over the β-skeletons, for example, by introducing Steiner points, but in practice we have not succeeded in doing so.

This paper focuses on the random model for city positions because it seems the natural setting for theoretical study. As a complement, in [10] we give empirical data for the values of (L, R) for certain real-world networks (on the 20 largest cities, in each of 10 US States). In [8] we give analytic results and bounds on the trade-off between L and the mathematically more tractable stretch statistic $R_{\max}$ at (4), in both worst-case and random-case settings for city positions. Let us also point out a (perhaps) nonobvious insight discussed in Section 3.3: in designing networks to be efficient in the sense of providing short routes, the main difficulty is providing short routes between city-pairs at a specific distance (2–3 standardized units) apart, rather than between pairs at a large distance apart.

Finally, recall this is a nontechnical account. Our purpose is to elaborate verbally the ideas outlined above; some technical aspects will be pursued elsewhere.

2. MODELS FOR CONNECTED SPATIAL NETWORKS

There are several conceptually different ways of defining networks on random points in the plane. To be concrete, we call the points cities; to be consistent about language, we regard $x_i$ as the position of city i and represent network edges as line segments $(x_i, x_j)$. First (Sections 2.1–2.3) are schemes which use deterministic rules to define edges for an arbitrary deterministic configuration of cities; then one just applies these rules to a random configuration. Second, one can have random rules for edges in a deterministic configuration (e.g., the probability of an edge between cities i and j is a function of Euclidean distance $d(x_i, x_j)$, as in popular small worlds models [39]), and again apply to a random configuration. Third, and more subtly, one can have constructions that depend on the randomness model for city positions; Section 2.5 provides a novel example.

We work throughout with reference to Euclidean distance d(x, y) on the plane, even though many models could be defined with reference to other metrics (or even when the triangle inequality does not hold, for the MST).

2.1 The Geometric Graph

In Sections 2.1–2.3 we have an arbitrary configuration $\mathbf{x} = \{x_i\}$ of city positions, and a deterministic rule for defining the edge-set $\mathcal{E}$. Usually in graph theory one imagines a finite configuration, but note that everything makes sense for locally finite configurations too. Where helpful, we assume "general position," so that intercity distances $d(x_i, x_j)$ are all distinct.

For the geometric graph one fixes $0 < c < \infty$ and defines

    $(x_i, x_j) \in \mathcal{E}$ iff $d(x_i, x_j) \le c$.

For the K-neighbor graph one fixes $K \ge 1$ and defines

    $(x_i, x_j) \in \mathcal{E}$ iff $x_i$ is one of the K closest neighbors of $x_j$, or $x_j$ is one of the K closest neighbors of $x_i$.

A moment's thought shows these graphs are in general not connected, so we turn to models which are "by construction" connected. We remark that the connectivity threshold $c_n$ in the finite n-vertex model of the random geometric graph has been studied in detail; see Chapter 13 of [40].

2.2 A Nested Sequence of Connected Graphs

The material here and in the next section was developed in graph theory with a view toward algorithmic applications in computational geometry and pattern recognition. The 1992 survey [28] gives the history of the subject and 116 citations. But everything we need is immediate from the (careful choice of) definitions. On our arbitrary configuration x we can define four graphs whose edge-sets are nested as follows:

(1)  MST $\subseteq$ relative n'hood $\subseteq$ Gabriel $\subseteq$ Delaunay.
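The first two inclusions in (1) lend themselves to a brute-force numerical check on random points. The following is a minimal Python sketch, assuming the edge criteria stated in the definitions immediately below; the function names, the use of squared distances, and the choice of Prim's algorithm for the MST are illustrative assumptions, not constructions taken from [28].

```python
import itertools
import random

def sq_dist(p, q):
    """Squared Euclidean distance; squaring preserves every comparison used here."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def relative_neighborhood_edges(pts):
    """Edge (i, j) unless some city k is closer to both i and j than they are to each other."""
    edges = set()
    for i, j in itertools.combinations(range(len(pts)), 2):
        dij = sq_dist(pts[i], pts[j])
        if not any(max(sq_dist(pts[i], p), sq_dist(p, pts[j])) < dij for p in pts):
            edges.add((i, j))
    return edges

def gabriel_edges(pts):
    """Edge (i, j) unless some city lies inside the disc with diameter x_i x_j.

    A point k is strictly inside that disc iff |x_i x_k|^2 + |x_k x_j|^2 < |x_i x_j|^2.
    """
    edges = set()
    for i, j in itertools.combinations(range(len(pts)), 2):
        dij = sq_dist(pts[i], pts[j])
        if not any(sq_dist(pts[i], p) + sq_dist(p, pts[j]) < dij for p in pts):
            edges.add((i, j))
    return edges

def mst_edges(pts):
    """Minimum spanning tree of the complete graph, via Prim's algorithm (our choice)."""
    n = len(pts)
    # best[v] = (cheapest squared distance from v to the current tree, tree endpoint)
    best = {v: (sq_dist(pts[0], pts[v]), 0) for v in range(1, n)}
    edges = set()
    while best:
        v = min(best, key=lambda u: best[u][0])
        _, parent = best.pop(v)
        edges.add((min(v, parent), max(v, parent)))
        for u in best:
            c = sq_dist(pts[v], pts[u])
            if c < best[u][0]:
                best[u] = (c, v)
    return edges

random.seed(0)  # hypothetical test configuration: 60 uniform points in the unit square
cities = [(random.random(), random.random()) for _ in range(60)]
mst = mst_edges(cities)
rn = relative_neighborhood_edges(cities)
gabriel = gabriel_edges(cities)
print(len(mst), len(rn), len(gabriel))
print(mst <= rn <= gabriel)  # set inclusion, mirroring the first two inclusions in (1)
```

The cubic-time loops are chosen for transparency rather than speed; the Delaunay inclusion is omitted because a correct brute-force test of its "some empty disc" criterion is longer than this sketch allows.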
Here are the definitions (for MST and Delaunay, it is easy to check these are equivalent to more familiar definitions). In each case, we write the criterion for an edge $(x_i, x_j)$ to be present:

• Minimum spanning tree (MST) [24]. There does not exist a sequence $i = k_0, k_1, \ldots, k_m = j$ of cities such that
    $\max(d(x_{k_0}, x_{k_1}), d(x_{k_1}, x_{k_2}), \ldots, d(x_{k_{m-1}}, x_{k_m})) < d(x_i, x_j)$.
• Relative neighborhood graph. There does not exist a city k such that
    $\max(d(x_i, x_k), d(x_k, x_j)) < d(x_i, x_j)$.
• Gabriel graph. There does not exist a city inside the disc whose diameter is the line segment from $x_i$ to $x_j$.
• Delaunay triangulation [23]. There exists some disc, with $x_i$ and $x_j$ on its boundary, so that no city is inside the disc.

The inclusions (1) are immediate from these definitions. Because the MST (for a finite configuration) is connected, all these graphs are connected.

Figure 1 illustrates the relative neighborhood and Gabriel graphs. Figures for the MST and the Delaunay triangulation can be found online at http://www.spss.com/research/wilkinson/Applets/edges.html.

FIG. 1. The relative neighborhood graph (left) and Gabriel graph (right) on different realizations of 500 random points.

Constructions such as the relative neighborhood and Gabriel graphs have become known loosely as proximity graphs in [28] and subsequent literature, and we next take the opportunity to turn an implicit definition in the literature into an explicit definition.

2.3 Proximity Graphs

Write $v_-$ and $v_+$ for the points $(-\tfrac{1}{2}, 0)$ and $(\tfrac{1}{2}, 0)$. The lune is the intersection of the open discs of radii 1 centered at $v_-$ and $v_+$. So $v_-$ and $v_+$ are not in the lune but are on its boundary. Define a template A to be a subset of $\mathbb{R}^2$ such that:

(i) A is a subset of the lune.
(ii) A contains the open line segment $(v_-, v_+)$.
(iii) A is invariant under the "reflection in the y-axis" map $\mathrm{Reflect}_x(x_1, x_2) = (-x_1, x_2)$ and the "reflection in the x-axis" map $\mathrm{Reflect}_y(x_1, x_2) = (x_1, -x_2)$.
(iv) A is open.

For arbitrary points x, y in $\mathbb{R}^2$, define A(x, y) to be the image of A under the natural transformation (translation, rotation and scaling) that takes $(v_-, v_+)$ to (x, y).

DEFINITION. Given a template A and a locally finite set $\mathcal{V}$ of vertices, the associated proximity graph G has edges defined by, for each $x, y \in \mathcal{V}$,

    (x, y) is an edge of G iff A(x, y) contains no vertex of $\mathcal{V}$.

From the definitions:

• if A is the lune, then G is the relative neighborhood graph;
• if A is the disc centered at the origin with radius 1/2, then G is the Gabriel graph.

But the MST and Delaunay triangulation are not instances of proximity graphs.

Note that replacing A by a subset $A'$ can only introduce extra edges. It follows from (1) that the proximity
graph is always connected. The Gabriel graph is planar. But if A is not a superset of the disc centered at the origin with radius 1/2, then G might not be a subgraph of the Delaunay triangulation, and in this case edges may cross, so G is not planar (e.g., if the vertex-set is the four corners of a square, then the diagonals would be edges).

For a given configuration x, there is a collection of proximity graphs indexed by the template A, so by choosing a monotone one-parameter family of templates, one gets a monotone one-parameter family of graphs, analogous to the one-parameter family $\mathcal{G}_c$ of geometric graphs. Here is a popular choice [30] in which $\beta = 1$ gives the Gabriel graph and $\beta = 2$ gives the relative neighborhood graph.

DEFINITION (The β-skeleton family). (i) For $0 < \beta < 1$ let $A_\beta$ be the intersection of the two open discs of radius $(2\beta)^{-1}$ passing through $v_-$ and $v_+$.
(ii) For $1 \le \beta \le 2$ let $A_\beta$ be the intersection of the two open discs of radius $\beta/2$ centered at $(\pm(\beta - 1)/2, 0)$.

2.4 Networks Based on Powers of Edge-Lengths

It is not hard to think of other ways to define one-parameter families of networks. Here is one scheme used in, for example, [38]. Fix $1 \le p < \infty$. Given a configuration x, and a route (sequence of vertices) $x_0, x_1, \ldots, x_k$, say, the cost of the route is the sum of pth powers of the step lengths. Now say that a pair (x, y) is an edge of the network $\mathcal{G}_p$ if the cheapest route from x to y is the one-step route. As p increases from 1 to $\infty$, these networks decrease from the complete graph to the MST. Moreover, for $p \ge 2$ the network $\mathcal{G}_p$ is a subgraph of the Gabriel graph.

2.5 The Hammersley Network

There is a quite separate recent literature in theoretical probability [26, 27] defining structures such as trees and matchings directly on the infinite Poisson point process. In this spirit, we observe that the Hammersley process studied in [6] can be used to define a new network on the infinite Poisson point process, which we name the Hammersley network. This network is designed to have the feature that each vertex has exactly 4 edges, in directions NE (between North and East), NW, SE and SW. The conceptual difference from the networks in the previous section is that there is not such a simple "local" criterion for whether a potential edge $(x_i, x_j)$ is in the network. And edges cross, creating junctions.

For a picturesque description, imagine one-eyed frogs sitting on an infinitely long, thin log, each being able to see only the part of the log to their left before the next frog. At random times and positions (precisely, as a space–time Poisson point process of rate 1) a fly lands on the log, at which instant the (unique) frog which can see it jumps left to the fly's position and eats it. This defines a continuous time Markov process (the Hammersley process) whose states are the configurations of positions of all the frogs. There is a stationary version of the process in which, at each time, the positions of the frogs form a Poisson (rate 1) point process on the line.

Now consider the space–time trajectories of all the frogs, drawn with time increasing upward on the page. See Figure 2. For each frog, the part of the trajectory between the completions of two successive jumps consists of an upward edge (the frog remains in place as time increases) followed by a leftward edge (the frog jumps left).

FIG. 2. Space–time trajectories in Hammersley's process.

Reinterpreting the time axis as a second space axis, and introducing compass directions, that part of the trajectory becomes a North edge followed by a West edge. Now replace these two edges by a single North-West straight edge. Doing this procedure for each frog and each pair of successive jumps, we obtain a collection of NW paths, that is, a network in which each city (the reinterpreted space–time random points) has an edge to the NW and an edge to the SE. Finally, we
repeat the construction with the same realization of the space–time Poisson point process but with frogs jumping rightward instead of leftward. This yields a network on the infinite Poisson point process, which we name the Hammersley network. See Figure 3.

FIG. 3. The Hammersley network on 2500 random points.

REMARKS. (a) To draw the Hammersley network on random points in a finite square, one needs external randomization to give the initial (time 0) frog positions, in fact, two independent randomizations for the leftward and the rightward processes. So to be pedantic, one gets a random network over the given realization of cities. However, one can deduce from the theoretical results in [6] that the external randomization has effect only near the boundary of the square.

(b) The property that each vertex has exactly 4 edges, in directions NE (between North and East), NW, SE and SW, is immediate from the construction. Note, however, that while adjacent NW space–time trajectories in Figure 2 do not cross, the corresponding diagonal roads in the Hammersley network may cross, so it is not a planar graph, though this has only negligible effect on route lengths.

(c) Intuition, confirmed by Figure 7 later, says that the Hammersley network is not very efficient as a road network. It serves to demonstrate that there do exist random networks other than the familiar ones, and provides an instance where imposing deterministic constraints (the four edges, in this case) on a random network makes it much less efficient. How general a phenomenon is this?

2.6 Normalized Length

The notion of normalized network length L is most easily visualized in the setting of an infinite deterministic network which is "regular" in the sense of consisting of a repeated pattern. First choose the unit of length so that cities have an average density of one per unit area. Then define

(2)  $L$ = average network length per unit area,

(3)  $\bar{d}$ = average degree (number of incident edges) of cities.

Figure 4 shows the values of L and $\bar{d}$ for some simple "repeated pattern" networks. Though not directly relevant to our study of the random model, we find Figure 4 helpful for two reasons: as intuition for the interpretation of the different numerical values of L, and because we can make very loose analogies (Section 6.6) between particular networks on random points and particular deterministic networks.

3. NORMALIZED LENGTH AND ROUTE-LENGTH EFFICIENCY

3.1 The Random Model

For the remainder of the paper we work with "the random model" for city positions. The finite model assumes n random vertices (cities) distributed independently and uniformly in a square of area n. The infinite model assumes the Poisson point process of rate 1 (per unit area) in the plane. The quantities L, $\bar{d}$ above and R below that we discuss may be interpreted as exact values in the infinite model or as $n \to \infty$ limits in the finite model; see Section 5. We use the word normalized as a reminder of the "density 1" convention: we choose the normalized unit of distance to make cities have average density 1 per unit area. After this normalization, L is the average network length per unit area.

3.2 The Route-Length Efficiency Statistic R

In designing a network, it is natural to regard total length as a "cost". The corresponding "benefit" is having short routes between cities. Write $\ell(i, j)$ for the route length (length of shortest path) between cities i and j in a given network, and d(i, j) for Euclidean distance between the cities. So $\ell(i, j) \ge d(i, j)$, and we write

    $r(i, j) = \dfrac{\ell(i, j)}{d(i, j)} - 1$
so that "r(i, j) = 0.2" means that route length is 20% longer than straight line distance. With n cities we get $\binom{n}{2}$ such numbers r(i, j); what is a reasonable way to combine these into a single statistic? Two natural possibilities are as follows:

(4)  $R_{\max} := \max_{j \ne i} r(i, j)$,  $R_{\mathrm{ave}} := \mathrm{ave}_{(i,j)}\, r(i, j)$,

where $\mathrm{ave}_{(i,j)}$ denotes average over all distinct pairs (i, j). The statistic $R_{\max}$ has been studied in the context of the design of geometric spanner networks [37] where it is called the stretch. However, being an "extremal" statistic, $R_{\max}$ seems unsatisfactory as a descriptor of real world networks; for instance, it seems unreasonable to characterize the UK rail network as inefficient simply because there is no very direct route between Oxford and Cambridge.

FIG. 4. Variant square, triangular and hexagonal lattices. Drawn so that the density of cities is the same in each diagram, and ordered by value of L.

The statistic $R_{\mathrm{ave}}$ has a more subtle drawback. Consider a network consisting of:

• the minimum-length connected network (Steiner tree) on given cities;
• and a superimposed sparse collection of randomly oriented lines (a Poisson line process [45]).

See Figure 5. By choosing the density of lines to be sufficiently low, one can make the normalized network length be arbitrarily close to the minimum needed for connectivity. But it is easy to show (see [7] for careful analysis and a stronger result) that one can construct
such networks so that $R_{\mathrm{ave}} \to 0$ as $n \to \infty$. Of course no one would build a road network looking like Figure 5 to link cities, because there are many pairs of nearby cities with only very indirect routes between them. The disadvantage of $R_{\mathrm{ave}}$ as a descriptive statistic is that (for large n) most city-pairs are far apart, so the fact that a given network has a small value of $R_{\mathrm{ave}}$ says nothing about route lengths between nearby cities.

FIG. 5. Efficient or inefficient? $R_{\mathrm{ave}}$ would judge this network efficient in the $n \to \infty$ limit.

We propose a statistic R which is intermediate between $R_{\mathrm{ave}}$ and $R_{\max}$. First consider (see discussion below for details)

    $\rho(d)$ := mean value of r(i, j) over city-pairs with d(i, j) = d

and then define

(5)  $R := \max_{0 \le d < \infty} \rho(d)$.

In words, R = 0.2 means that on every scale of distance, route lengths are on average at most 20% longer than straight line distance.

On an intuitive level, R provides a sensible and interpretable way to compare efficiency of different networks in providing short routes. On a technical level, we see two advantages and one disadvantage of using R instead of $R_{\mathrm{ave}}$.

Advantage 1. Using R to measure efficiency, there is a meaningful $n \to \infty$ limit for the network length/efficiency trade-off [the function $R_{\mathrm{opt}}(L)$ discussed in Section 5], and so, in particular, it makes sense to compare the values of R for networks with different n.

Advantage 2. A more realistic model for traffic would posit that volume of traffic between two cities varies as a power-law $d^{-\gamma}$ of distance d, so that in calculating $R_{\mathrm{ave}}$ it would be more realistic to weight by $d^{-\gamma}$. This means that the optimal network, when using $R_{\mathrm{ave}}$ as optimality criterion, would depend on γ. Use of R finesses this issue; the value of γ does not affect R. A related issue is that volume of traffic between two cities should depend on their populations. Intuitively, incorporating random population sizes should make the optimal R smaller because the network designer can create shorter routes between larger cities. We see this effect in data [10]; R calculated via population-weighting is typically slightly smaller. But we have not tried theoretical study.

Disadvantage. The statistic R is tailored to the infinite model, in which it makes sense to consider two cities at exactly distance d apart (then the other city positions form a Poisson point process). For finite n we need to discretize. For the empirical data in [10], where n = 20, we average over intervals of width 1 unit (recall the unit of distance is taken such that the density of cities is 1 per unit area), that is, for d = 1, 2, ..., 5, we calculate

(6)  $\tilde{\rho}(d)$ := mean value of r(i, j) over city-pairs with $d - \tfrac{1}{2} < d(i, j) < d + \tfrac{1}{2}$,
     $\tilde{R} := \max_{1 \le d < \infty} \tilde{\rho}(d)$

and use $\tilde{R}$ as proxy for R. For larger n we can use shorter intervals. Thus, there is, in principle, a certain fuzziness to the notion of R for finite networks, and, in particular, it is not clear how to assign a value of R to regular networks such as those in Figure 4. But in practice, for networks we have studied on real-world data and on random points, this is not a problem, as explained next.

3.3 Characteristic Shape of the Function ρ(d)

For the connected networks on random points (excluding the Hammersley network) we are discussing, the function ρ(d) has a characteristic shape (see Figure 6) attaining its maximum between 2 and 3 and slowly decreasing thereafter. We suspect that "this characteristic shape holds for any reasonable model," but we do not know how to turn that phrase into a precise conjecture. Note that "smoothness near the maximum" implies that any calculated value $\tilde{R}$ at (6) is quite insensitive to the choice of discretization.
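The discretized statistic at (6) is straightforward to compute for a small network. The following is a minimal Python sketch, assuming city coordinates are already expressed in normalized units and that the network is given as a list of index pairs; the function names, the Floyd–Warshall routine and the four-city toy configuration are illustrative assumptions, not the code used for the simulations reported in this paper.

```python
import math

def route_lengths(pts, edges):
    """All-pairs shortest route lengths by Floyd-Warshall; edges are (i, j) index pairs."""
    n = len(pts)
    route = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        route[i][i] = 0.0
    for i, j in edges:
        route[i][j] = route[j][i] = math.dist(pts[i], pts[j])
    for k in range(n):
        for i in range(n):
            rik = route[i][k]
            for j in range(n):
                if rik + route[k][j] < route[i][j]:
                    route[i][j] = rik + route[k][j]
    return route

def discretized_R(pts, edges, d_values=(1, 2, 3, 4, 5), half_width=0.5):
    """Return ({d: rho_tilde(d)}, R_tilde) as in (6).

    Bins containing no city-pair are skipped; at least one bin must be nonempty.
    """
    route = route_lengths(pts, edges)
    bins = {d: [] for d in d_values}
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            dij = math.dist(pts[i], pts[j])
            r = route[i][j] / dij - 1.0          # r(i, j) = route length / distance - 1
            for d in d_values:
                if d - half_width < dij < d + half_width:
                    bins[d].append(r)
    rho = {d: sum(v) / len(v) for d, v in bins.items() if v}
    return rho, max(rho.values())

# Toy configuration (hypothetical): four cities at the corners of a unit square.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
path = [(0, 1), (1, 2), (2, 3)]                               # three sides of the square
complete = [(i, j) for i in range(4) for j in range(i + 1, 4)]

rho_path, R_path = discretized_R(corners, path, d_values=(1,))
rho_complete, R_complete = discretized_R(corners, complete, d_values=(1,))
print(R_path, R_complete)
```

In the toy example all six city-pairs fall in the single bin around d = 1, so the path network is penalized mainly by the pair joined only by the three-side detour, while the complete graph has r(i, j) = 0 for every pair.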
This characteristic shape has a common-sense interpretation. Any efficient network will tend to place roads directly between unusually close city-pairs, implying that ρ(d) should be small for d < 1. For large d the presence of multiple alternate routes helps prevent ρ(d) from growing. At distance 2–3 from a typical city i there will be about $\pi 3^2 - \pi 2^2 \approx 16$ other cities j. For some of these j there will be cities k near the straight line from i to j, so the network designer can create roads from i to k to j. The difficulty arises where there is no such intermediate city k: including a direct road $(x_i, x_j)$ will increase L, but not including it will increase ρ(d) for 2 < d < 3.

Thus, Figure 6 offers a minor insight into spatial network design: that it is city pairs at normalized distance 2–3 specifically that enforce the constraints on efficient network design.

The characteristic shape (at least, the flatness over $2 \le d \le 5$) is also visible in the real-world data [10].

For the Hammersley network, the graph of ρ(d) is quite different; ρ(d) increases to a maximum of 0.35 around d = 0.8 and then decreases more steeply to a value of 0.21 at d = 5. This arises from the particular structure (from each city there is one road in each quadrant) resembling the deterministic "diagonal lattice" of Figure 4, in which the route between some nearby pairs will be via two diagonal roads and a junction.

FIG. 6. The function ρ(d) for three theoretical networks on random cities. Irregularities are Monte Carlo random variation.

4. LENGTH-EFFICIENCY TRADE-OFF FOR TRACTABLE NETWORKS

Recall that our overall theme is the trade-off between network length and route-length efficiency, and that in this paper we focus on $n \to \infty$ limits in the random model and the particular statistics L and R.

The models described in Section 2 are "tractable" in the specific sense that one can find exact analytic formulas for normalized length L. Unfortunately R is not amenable to analytic calculation, and we resort to Monte Carlo simulation to obtain values for R. Table 1 and Figure 7 show the values of (L, R) in the models. We explain below how the values of L are calculated.

TABLE 1
Statistics of tractable networks on random points

Network                  L       $\bar{d}$   R
Minimum spanning tree    0.633   2           ∞
Relative n'hood          1.02    2.56        0.38
Gabriel                  2       4           0.15
Hammersley               3.25    4           0.35
Delaunay                 3.40    6           0.07

Notes: Integer values are exact. Recall L is normalized length (2), $\bar{d}$ is average degree (3) and R is our route-length statistic (5).

Notes on Table 1. (a) Values of R from our simulations with n = 2500.
(b) Value of L for MST from Monte Carlo [19]. In principle, one can calculate arbitrarily close bounds [11], but apparently this has never been carried out. Of course, $\bar{d} = 2$ for any tree.

(c) The Gabriel graph and the relative neighborhood graph fit the assumptions of Lemma 1 with $c = \pi/4$ and $c = \tfrac{2\pi}{3} - \tfrac{\sqrt{3}}{2}$, respectively, and their table entries for L and $\bar{d}$ are obtained from Lemma 1, as are the values for β-skeletons in Figure 7.

(d) For the Hammersley network, every degree equals 4, so L = 2 × (mean edge-length). It follows from theory [6] that a typical edge, say, NE from (x, y), goes to a city at position $(x + \xi_x, y + \xi_y)$, where $\xi_x$ and $\xi_y$ are independent with Exponential(1) distribution. So mean edge-length equals

(7)  $\int_0^\infty \int_0^\infty \sqrt{x^2 + y^2}\; e^{-x-y}\, dx\, dy \approx 1.62$.

(e) For any triangulation, $\bar{d} = 6$ in the infinite model. For the Delaunay triangulation, $L = ES$ where S is the perimeter length of a typical cell, and it is known ([35], page 113) that $ES = \tfrac{32}{3\pi}$. Note [33] that the Delaunay triangulation is in general not the minimum-length triangulation. Our simulation results in Figure 6 for ρ(d) for the Delaunay triangulation are roughly consistent with a simulation result in [13] saying that $\rho(65) \approx 0.05$.

FIG. 7. The normalized network length L and the route-length efficiency statistic R for certain networks on random points. The ◦ show the β-skeleton family, with RN the relative neighborhood graph and G the Gabriel graph. The remaining markers are special models: the Delaunay triangulation, the network $\mathcal{G}_2$ from Section 2.4 and the Hammersley network (♦).

4.1 A Simple Calculation for Proximity Graphs

Let us give an example of an elementary calculation for proximity graphs over random points.

LEMMA 1. For a proximity graph with template A on the Poisson point process,

(8)  $L = \dfrac{\pi^{3/2}}{4 c^{3/2}}$,

(9)  $\bar{d} = \dfrac{\pi}{c}$,

where c = area(A).

PROOF. Take a typical city at position $x_0$. For a city x at distance s, the region $A(x_0, x)$ has area $c s^2$ and must contain no city, so the chance that $(x_0, x)$ is an edge equals $\exp(-c s^2)$ and so

    mean-degree $= \int_0^\infty \exp(-c s^2)\, 2\pi s\, ds$,

    $L = \tfrac{1}{2} \int_0^\infty s \exp(-c s^2)\, 2\pi s\, ds$.

Evaluating the integrals gives (8) and (9). □

One can derive similar integral formulas for other "local" characteristics, for example, mean density of triangles and moments of vertex degree. See [18, 20, 21, 34] for a variety of such generalizations and specializations.

4.2 Other Tractable Networks

We do not know any other ways of defining networks on random points which are both "natural" and are tractable in the sense that one can find exact analytic formulas for L. In particular, we know no tractable way of defining networks with deliberate junctions as in Figure 8. Note also that, while it is easy to make ad hoc
modifications to the geometric graph to ensure connectivity, these destroy tractability. On the other hand, one can construct "unnatural" networks (see, e.g., [8]) designed to permit calculation of L.

FIG. 8. An ad hoc modification of the relative neighborhood graph, introducing junctions.

5. OPTIMAL NETWORKS AND $n \to \infty$ LIMITS

5.1 Tractable Models

As mentioned earlier, the quantities L, $\bar{d}$, R we discuss may be interpreted as exact values in the infinite model or as $n \to \infty$ limits in the finite model. To elaborate briefly, in a realization of the finite model (n cities distributed independently and uniformly in a square of area n), a network in Table 1 has a normalized length $L_n = n^{-1} \times$ (network length) and an average degree $\bar{d}_n$ which are random variables, but there is convergence (in probability and in expectation)

(10)  $L_n \to L$,  $\bar{d}_n \to \bar{d}$  as $n \to \infty$

to limit constants definable in terms of the analogous network on the infinite model (rate 1 Poisson point process on the infinite plane). For the proximity graphs or Delaunay triangulation, the network definition applies directly to the infinite model and proof of (10) is straightforward. For the Hammersley network, (10) is implicit in [6], and for the MST detailed arguments can be found in [9, 43].

5.2 Optimal Networks

We now turn to consideration of optimal networks. Given a configuration x of n cities in the area-n square, and a value of L which is greater than $n^{-1} \times$ (length of Steiner tree), one can define a number

(11)  $R_n(\mathbf{x}, L)$ = min of $\tilde{R}$ over all networks on x with normalized length $\le L$,

where $\tilde{R}$ is the discretized version (6) calculated using intervals of some suitable length $\delta_n$. Applying this to a random configuration X in the finite model gives, for each L, a random variable $R_n(\mathbf{X}, L)$. One intuitively expects convergence to some deterministic limit

(12)  $R_n(\mathbf{X}, L) \to R_{\mathrm{opt}}(L)$, say, as $n \to \infty$.

The analogous result for $R_{\max}$ will be proved carefully in [8], and the same "superadditivity" argument could be used to prove (12). See [43, 44, 47] for general background to such results. The point is that we do not have any explicit description of the optimal [i.e., attaining the minimum in (11)] networks in the finite or infinite models, so it seems very challenging to prove the natural stronger supposition that the finite optimal networks themselves converge (in some appropriate sense) to a unique infinite optimal network for which the value $R = R_{\mathrm{opt}}(L)$ is attained.

5.3 The Curve $R_{\mathrm{opt}}(L)$

Every possible network on the infinite Poisson point process defines a pair (L, R), and the curve $R = R_{\mathrm{opt}}(L)$ can be defined equivalently as the lower boundary of the set of possible values of (L, R). There is no reason to believe that proximity graphs are exactly optimal, and, indeed, Figure 7 shows that the Delaunay triangulation is slightly more efficient than the corresponding β-skeleton. But our attempts to do better by ad hoc constructions (e.g., by introducing degree-3 junctions; see Figure 8 for an example) have been unsuccessful. And, indeed, the fact that the two special models in Figure 7 lie close to the β-skeleton curve lends credence to the idea that this curve is almost optimal. We therefore speculate that the function $R_{\mathrm{opt}}$ looks something like the curve in Figure 9, which we now discuss.

What can we say about $R_{\mathrm{opt}}(L)$? It is a priori nonincreasing. It is known [47] that there exists a Euclidean Steiner tree constant $L_{ST}$ representing the limit normalized Steiner tree length in the random model, and clearly $R_{\mathrm{opt}}(L) = \infty$ for $L < L_{ST}$. The facts

(13)  $R_{\mathrm{opt}}(L) < \infty$ for all $L > L_{ST}$;  $R_{\mathrm{opt}}(L) \to 0$ as $L \to \infty$

are not trivial to prove rigorously, but follow from the corresponding facts for $R_{\max}$ proved in [8]. But we are unable to prove rigorously that $R_{\mathrm{opt}}(L)$ is strictly decreasing or that it is continuous.
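The "lower boundary" description of $R_{\mathrm{opt}}(L)$ can be mimicked on any finite collection of simulated networks. The following is a minimal Python sketch, assuming the (L, R) values from Table 1 as input; the function name `lower_boundary` and the dictionary layout are illustrative assumptions. The result is only an empirical upper bound on $R_{\mathrm{opt}}$ at the listed values of L (and, since $R_{\mathrm{opt}}$ is nonincreasing, at larger L), not the curve itself.

```python
import math

def lower_boundary(points):
    """Keep the (L, R) pairs that no shorter-or-equal network beats on R.

    Scanning in order of increasing L, a point is kept only if its R is strictly
    smaller than every R seen so far, which yields a nonincreasing staircase.
    """
    frontier = []
    best_r = math.inf
    for length, r in sorted(points):
        if r < best_r:
            frontier.append((length, r))
            best_r = r
    return frontier

# (L, R) values copied from Table 1; R for the MST is infinite in the n -> infinity limit.
table1 = {
    "Minimum spanning tree": (0.633, math.inf),
    "Relative n'hood": (1.02, 0.38),
    "Gabriel": (2, 0.15),
    "Hammersley": (3.25, 0.35),
    "Delaunay": (3.40, 0.07),
}

frontier = lower_boundary(table1.values())
print(frontier)
```

On these inputs the Hammersley network drops out because the Gabriel graph is both shorter and more efficient, in line with remark (c) of Section 2.5.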