Networks, 2nd Edition
Mark Newman
Solutions to Exercises
If you find errors in these solutions, please let the author know. Suggestions for improvements are also welcome. Please email Mark Newman, [email protected]. Please do not post these solutions on the Web or elsewhere in electronic form. Copyright © 2018 Mark Newman.
6 Mathematics of networks

Exercise 6.1:
a) Undirected
b) Directed, approximately acyclic
c) Planar, tree, directed or undirected depending on the representation
d) Undirected, approximately planar
e) Directed or undirected depending on the network
f) Citation networks, food webs
g) The web, the network of who says they're friends with whom
h) A river network, a plant or a tree or their root system
i) A road network, the network of adjacencies of countries
j) Any affiliation network, recommender networks, keyword indices
k) A web crawler
l) Draw data from a professionally curated index such as the Science Citation Index or Scopus, or from an automated citation crawler such as Google Scholar
m) A literature search
n) Questionnaires or interviews
o) An appropriate map

Exercise 6.2: The maximum number of edges is $\binom{n}{2}$ because there are $\binom{n}{2}$ distinct places to put an edge and each can have only one edge in a simple network. The minimum is $n-1$ because we require that the network be connected and $n-1$ is the minimum number of edges that will achieve this (see the discussion at the top of page 123).

Exercise 6.3: The matrices are as follows:
a)
\[
\mathbf{A} = \begin{pmatrix}
0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]
b)
\[
\mathbf{B} = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 1
\end{pmatrix}
\]
c)
\[
\mathbf{B}^T\mathbf{B} = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 1 & 0
\end{pmatrix}
\]

Exercise 6.4:
a) $\mathbf{k} = \mathbf{A}\mathbf{1}$
b) $m = \tfrac12 \mathbf{1}^T\mathbf{A}\mathbf{1}$
c) $\mathbf{N} = \mathbf{A}^2$
d) $\tfrac16 \operatorname{Tr}\mathbf{A}^3$

Exercise 6.5:
a) A 3-regular graph has three ends of edges per node, and hence $3n$ ends total. But the total number of ends of edges is also equal to $2m$, which is an even number. Hence $n$ must be even.
b) A tree with $n$ nodes has $m = n-1$ edges. Hence the average degree is $2m/n = 2(n-1)/n < 2$.
c) The connectivity of A and C must be at least $y$, because if there are $y$ paths from B to C and $x > y$ paths from A to B, then there are at least $y$ paths all the way from A to C. On the other hand the connectivity of A and C cannot be greater than $y$ by the same argument: if there were more than $y$ paths from A to C and more than $y$ paths from B to A, then there would be more than $y$ paths from B to C (via A). Hence the connectivity of A and C must be exactly $y$.

Exercise 6.6: Let the eigenvector element at the central node be $x_1$. By symmetry the elements at the peripheral nodes all have the same value. Let us denote this value $x_2$. Then the
eigenvalue equation looks like this:
\[
\begin{pmatrix}
0 & 1 & 1 & 1 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_2 \\ x_2 \\ \vdots \end{pmatrix}
= \lambda
\begin{pmatrix} x_1 \\ x_2 \\ x_2 \\ x_2 \\ \vdots \end{pmatrix},
\]
where $\lambda$ is the leading eigenvalue. This implies that $(n-1)x_2 = \lambda x_1$ and $x_1 = \lambda x_2$. Eliminating $x_1$ and $x_2$ from these equations we find that $\lambda = \sqrt{n-1}$. The equation $x_1 = \lambda x_2$ then implies that $x_1$ and $x_2$ have the same sign, which means that this must be the leading eigenvalue (by the Perron–Frobenius theorem; see the discussion on page 160 and the footnote on page 161).

Exercise 6.7:
a)
\[
\text{Total ingoing edges} = \sum_{i=1}^{r} k_i^{\text{in}}, \qquad
\text{Total outgoing edges} = \sum_{i=1}^{r} k_i^{\text{out}}.
\]
b) The number of edges running to nodes $1\ldots r$ from nodes $r+1\ldots n$ is equal to the total number of edges running to nodes $1\ldots r$ minus the number originating at nodes $1\ldots r$. In other words, it is equal to the difference of the two expressions above:
\[
\text{Number of edges} = \sum_{i=1}^{r} \bigl( k_i^{\text{in}} - k_i^{\text{out}} \bigr).
\]
c) All outgoing edges at node $r+1$ must attach to nodes in the range $1\ldots r$ and hence the number of edges outgoing from node $r+1$ can be no greater than the total number from nodes $r+1\ldots n$ to nodes $1\ldots r$. Thus
\[
k_{r+1}^{\text{out}} \le \sum_{i=1}^{r} \bigl( k_i^{\text{in}} - k_i^{\text{out}} \bigr).
\]
Similarly, all edges ingoing at node $r$ must originate from nodes in the range $r+1\ldots n$ and hence the total number ingoing at node $r$ can be no greater than the total number from nodes $r+1\ldots n$ to nodes $1\ldots r$. Thus
\[
k_r^{\text{in}} \le \sum_{i=1}^{r} \bigl( k_i^{\text{in}} - k_i^{\text{out}} \bigr).
\]
Noting that the total number of edges is $m = \sum_{i=1}^{n} k_i^{\text{in}} = \sum_{i=1}^{n} k_i^{\text{out}}$, this can also be written as
\[
k_r^{\text{in}} \le \sum_{i=r+1}^{n} \bigl( k_i^{\text{out}} - k_i^{\text{in}} \bigr).
\]

Exercise 6.8: The total number of edges attached to nodes of type 1 is $n_1 c_1$. The total number attached to nodes of type 2 is $n_2 c_2$. But each edge is attached to one node of each type and hence these two numbers must be equal: $n_1 c_1 = n_2 c_2$.

Exercise 6.9: The network contains an expansion of UG, and hence is nonplanar by Kuratowski's theorem:

[Figure: the network, showing the expansion of UG]

(The five-fold symmetric appearance of the network might lead one at first to hypothesize that it contains an expansion of $K_5$, but upon reflection we see that this is clearly impossible, since every node has degree 3, whereas every node in $K_5$ has degree 4. Thus if the network is to be nonplanar it must contain an expansion of UG.)

Exercise 6.10: The edge connectivity is two. To prove this we display two edge-independent paths and a cut set of size two thus:

[Figure: two copies of the network with nodes A and B marked, one showing the two edge-independent paths and the other the cut set]

This constitutes a proof because the existence of two independent paths proves that the connectivity must be at least two, while the existence of the cut set proves that the connectivity can be no greater than two.

Exercise 6.11:
a) $n = 1$, $m = 0$, $f = 1$.
b) $n \to n' = n+1$, $m \to m' = m+1$, $f \to f' = f$.
c) $n \to n' = n$, $m \to m' = m+1$, $f \to f' = f+1$.
d) The relation is $f + n - m = 2$. Clearly it is true for case (a), which can be conveniently used as a starting point for induction. The relation is also preserved by (b) and (c). And any connected planar graph can be built up by adding its nodes and edges one by one, i.e., by a combination of moves (b) and (c). Hence by induction $f + n - m = 2$ is true for all such graphs.
e) In a simple graph all faces have at least three edges, except for the "outside" face that extends to infinity, and each edge has two sides. Therefore the number of edges $m$
times two is at least as great as $f-1$ times three, or $f \le 2m/3 + 1$. Substituting this inequality into the relation from part (d) we get $m \le 3n - 3$, and combining this with the relation for the average degree $c = 2m/n$ we get
\[
c \le 6 - \frac{6}{n} < 6.
\]

Exercise 6.12:
a) The number of paths from $s$ to $t$ of length $r$ is $[\mathbf{A}^r]_{st}$ and each has weight $\alpha^r$. Thus the sum of the weights for paths of length $r$ is $[(\alpha\mathbf{A})^r]_{st}$ and the sum for paths of all lengths (including length zero) is
\[
Z_{st} = \sum_{r=0}^{\infty} [(\alpha\mathbf{A})^r]_{st} = [(\mathbf{I} - \alpha\mathbf{A})^{-1}]_{st}.
\]
b) The sum converges if $|\alpha| < 1/\kappa_1$, where $\kappa_1$ is the largest eigenvalue of $\mathbf{A}$ (which is always positive, a consequence of the Perron–Frobenius theorem).
c) The derivative in the problem is given by
\[
\frac{\partial \log Z_{st}}{\partial \log \alpha}
= \frac{\alpha}{Z_{st}} \frac{\partial Z_{st}}{\partial \alpha}
= \frac{\alpha}{Z_{st}} \sum_r r \alpha^{r-1} [\mathbf{A}^r]_{st}
= \frac{1}{Z_{st}} \sum_r r [(\alpha\mathbf{A})^r]_{st}.
\]
Let $m$ be the number of paths from $s$ to $t$ of length $\ell_{st}$. Then the leading terms in $Z_{st}$ and the sum above are
\[
Z_{st} = \sum_r [(\alpha\mathbf{A})^r]_{st} = m\alpha^{\ell_{st}} + O(\alpha^{\ell_{st}+1}),
\]
\[
\sum_r r [(\alpha\mathbf{A})^r]_{st} = m \ell_{st} \alpha^{\ell_{st}} + O(\alpha^{\ell_{st}+1}).
\]
Substituting into the previous result we then get
\[
\frac{\partial \log Z_{st}}{\partial \log \alpha}
= \frac{m \ell_{st} \alpha^{\ell_{st}} + O(\alpha^{\ell_{st}+1})}{m \alpha^{\ell_{st}} + O(\alpha^{\ell_{st}+1})}
= \ell_{st} + O(\alpha).
\]
Taking the limit $\alpha \to 0$ then gives the required result.

Exercise 6.13:
a) There are $k_i^{\text{in}}$ incoming edges at node $i$ and the sum of the trophic levels at their other ends is $\sum_j A_{ij} x_j$. Thus the average trophic level is $(1/k_i^{\text{in}}) \sum_j A_{ij} x_j$ and the result follows.
b) For species with no prey, $k_i^{\text{in}} = 0$ and so $x_i$ is undetermined. We can fix this by artificially setting $k_i^{\text{in}} = 1$ for autotrophs (or indeed setting it to any nonzero value). Then the equation for $x_i$ can be rewritten in vector form as
\[
\mathbf{x} = \mathbf{D}^{-1}\mathbf{A}\mathbf{x} + \mathbf{1},
\]
where $\mathbf{D}$ is the matrix with the in-degrees down the diagonal or 1 for nodes with zero in-degree, and $\mathbf{1}$ is the vector $(1,1,1,\ldots)$. Rearranging this expression then gives the required result.

7 Measures and metrics

Exercise 7.1:
a) We note that $[\mathbf{A}\mathbf{1}]_i = \sum_j A_{ij} = k_i = k$ and hence $\mathbf{A}\mathbf{1} = k\mathbf{1}$.
b) The vector $\mathbf{x}$ of Katz centralities is given by
\[
\mathbf{x} = (\mathbf{I} - \alpha\mathbf{A})^{-1}\mathbf{1} = (\mathbf{I} + \alpha\mathbf{A} + \alpha^2\mathbf{A}^2 + \ldots)\mathbf{1}.
\]
Noting, as above, that $\mathbf{A}\mathbf{1} = k\mathbf{1}$ for the regular graph, this then becomes
\[
\mathbf{x} = (1 + \alpha k + \alpha^2 k^2 + \ldots)\mathbf{1} = \frac{\mathbf{1}}{1 - \alpha k},
\]
and hence $x_i = 1/(1 - \alpha k)$ for all $i$.
c) Betweenness centrality and closeness centrality are the obvious choices.

Exercise 7.2: Starting at any node, there is one node at distance 0, two at distance 1, two at distance 2, and so forth up to a maximum distance of $\tfrac12(n-1)$. So the mean distance is
\[
\frac{2}{n} \sum_{k=1}^{(n-1)/2} k = \frac{2}{n} \times \frac{1}{8}(n^2 - 1) = \frac{n^2 - 1}{4n}.
\]
And the closeness is the reciprocal of this, or $4n/(n^2 - 1)$.

Exercise 7.3:
a) The equivalence is most easily demonstrated in the reverse direction. We write the series as $\mathbf{x} = \sum_{k=0}^{\infty} (\alpha\mathbf{A})^k \mathbf{1}$, then
\[
\alpha\mathbf{A}\mathbf{x} + \mathbf{1}
= \alpha\mathbf{A} \sum_{k=0}^{\infty} (\alpha\mathbf{A})^k \mathbf{1} + \mathbf{1}
= \mathbf{1} + \sum_{k=1}^{\infty} (\alpha\mathbf{A})^k \mathbf{1}
= \sum_{k=0}^{\infty} (\alpha\mathbf{A})^k \mathbf{1} = \mathbf{x}.
\]
Alternatively, we can rearrange the definition of $\mathbf{x}$ according to Eq. (7.7) as $\mathbf{x} = (\mathbf{I} - \alpha\mathbf{A})^{-1}\mathbf{1}$ and then expand the matrix inverse as a geometric series.
b) In the limit of small $\alpha$ we can neglect terms in the series beyond the first two to get $\mathbf{x} \simeq \mathbf{1} + \alpha\mathbf{A}\mathbf{1}$, which implies that the centrality of node $i$ is $x_i = 1 + \alpha k_i$, which is linear in the degrees. Hence in this limit the Katz centrality is, apart from additive and multiplicative constants, the same as the degree centrality: higher degree implies higher Katz centrality.
c) Let us express $\mathbf{1}$ as a linear combination of the eigenvectors $\mathbf{v}_r$ of the adjacency matrix, $\mathbf{1} = \sum_r c_r \mathbf{v}_r$, for some choice of coefficients $c_r$. (For a directed network, we use the right eigenvectors.) Then
\[
(\alpha\mathbf{A})^k \mathbf{1} = (\alpha\mathbf{A})^k \sum_r c_r \mathbf{v}_r = \sum_r c_r (\alpha\kappa_r)^k \mathbf{v}_r,
\]
where $\kappa_r$ is the eigenvalue corresponding to eigenvector $\mathbf{v}_r$. Summing over $k$ we now have
\[
\mathbf{x} = \sum_{k=0}^{\infty} (\alpha\mathbf{A})^k \mathbf{1}
= \sum_r c_r \sum_{k=0}^{\infty} (\alpha\kappa_r)^k \mathbf{v}_r
= \sum_r \frac{c_r \mathbf{v}_r}{1 - \alpha\kappa_r},
\]
so long as $\alpha\kappa_r < 1$. Now as we take the limit $\alpha \to 1/\kappa_1$, all terms in the sum remain finite except the term in $r = 1$, which diverges. Hence this term dominates in the limit and $\mathbf{x}$ becomes proportional to the leading eigenvector $\mathbf{v}_1$.

Exercise 7.4: Every node in this network is symmetry-equivalent, so we only have to calculate the closeness of one of them. Starting at any node there is one node at distance 0, three at distance 1, and the remaining six are at distance 2. So the mean distance is $(0 + 3 + 12)/10 = \tfrac32$ and the closeness centrality is the reciprocal $\tfrac23$.

Exercise 7.5: The vector $\mathbf{x}$ of PageRank scores is given by Eq. (7.11) to be $\mathbf{x} = (\mathbf{I} - \alpha\mathbf{A}\mathbf{D}^{-1})^{-1}\mathbf{1}$. In this network, however, all out-degrees are 1 and hence $\mathbf{D} = \mathbf{D}^{-1} = \mathbf{I}$ and
\[
\mathbf{x} = (\mathbf{I} - \alpha\mathbf{A})^{-1}\mathbf{1} = (\mathbf{I} + \alpha\mathbf{A} + \alpha^2\mathbf{A}^2 + \ldots)\mathbf{1}.
\]
(The central node has out-degree zero but, as discussed in Section 7.4, we conventionally set the out-degree to one to avoid dividing by zero, and this change has no effect on the PageRank values.) But recall now that the matrix element $[\mathbf{A}^d]_{ij}$ counts the number of paths of length $d$ from $j$ to $i$ and hence $n_i^{(d)} = \sum_j [\mathbf{A}^d]_{ij} = [\mathbf{A}^d\mathbf{1}]_i$ is the number of paths of length $d$ from all nodes to node $i$. Since the network is a directed tree, however, there is at most one path between each pair of nodes, and hence $n_i^{(d)}$ is also the number of nodes that have distance exactly $d$ from $i$. Thus, if the central node is node 1, then
\[
n_1^{(d)} = \sum_i \delta_{d_i d},
\]
where $\delta_{mn}$ is the Kronecker delta. Now the PageRank of the central node is
\[
x_1 = \sum_{d=0}^{\infty} \alpha^d [\mathbf{A}^d\mathbf{1}]_1
= \sum_{d=0}^{\infty} \alpha^d n_1^{(d)}
= \sum_{d=0}^{\infty} \alpha^d \sum_i \delta_{d_i d}
= \sum_i \sum_{d=0}^{\infty} \alpha^d \delta_{d_i d}
= \sum_i \alpha^{d_i}.
\]

Exercise 7.6: Let $L_1$ and $R_1$ be the sums of the distances from node 1 to nodes in the left and right shaded regions and similarly for $L_2$ and $R_2$ with node 2. Then we note that the distance from node 2 to any node in the left shaded region is 1 greater than the corresponding distance from node 1 to the same node and hence $L_2 = L_1 + n_1$. Similarly $R_1 = R_2 + n_2$. Adding these two expressions gives
\[
L_1 + R_1 + n_1 = L_2 + R_2 + n_2.
\]
Noting that the closeness centralities are given by $C_1 = n/(L_1 + R_1)$ and $C_2 = n/(L_2 + R_2)$, we then recover the required result.

Exercise 7.7:
a) Because the network is a tree there is only a single shortest path between any pair of nodes. The particular node of interest lies on the shortest path between every pair of nodes except pairs where both members fall in the same disjoint region. There are $\sum_i n_i^2$ such pairs, and $n^2$ pairs in total, so $x = n^2 - \sum_i n_i^2$.
b) The removal of the $i$th node divides the line graph into two disjoint regions of sizes $n_1 = i - 1$ and $n_2 = n - i$. Applying the formula we then find that the betweenness of the $i$th node is $2(n - i + 1)i - 1$.

Exercise 7.8:
a) The four leftmost nodes form a 3-core.
b) There are eight edges, of which six are reciprocated, so $r = \tfrac34$.
c) The two nodes have two common neighbors and they have degrees 4 and 5 respectively, so their cosine similarity is
\[
\sigma = \frac{2}{\sqrt{4 \times 5}} = \frac{1}{\sqrt{5}}.
\]

Exercise 7.9:
a) Every node in the first network is connected to at least three of the others, so the entire network is a 3-core. In the second network, however, there is one node with only two neighbors. Removing this node and then iteratively removing all subsequent nodes with two or fewer neighbors, we end up removing the entire network. Hence there is no 3-core in the second network, despite its close similarity to the first.
b) If you consider single nodes to be strongly connected components then there are three such components in this network:

[Figure: the network with its three strongly connected components marked]

(Depending on who you ask, a single node may or may not be considered a strongly connected component.)
c) Top to bottom and left to right the local clustering coefficients of the nodes are 0, 1, $\tfrac13$, 1, and $\tfrac23$.
d) In terms of the quantities $e_r$ and $a_r$ defined on page 205 of the book, we have $e_1 = \tfrac{3}{10}$, $e_2 = \tfrac12$, $a_1 = \tfrac25$, and $a_2 = \tfrac35$. Then the modularity is
\[
Q = e_1 - a_1^2 + e_2 - a_2^2 = \tfrac{7}{25}.
\]
e) There are $n^2$ paths total and all of them start, end, or pass through the central node except for those that start and end at the same peripheral node, of which there are $n - 1$. Hence the betweenness of the central node is $n^2 - (n - 1)$.

Exercise 7.10: There are three independent paths between every pair of nodes in a 3-component. A node in a 3-core, on the other hand, need only have edges connecting it to three other members of the 3-core, which is a weaker condition. This network, for example, is a single 3-core, but has two 3-components:

[Figure: example network that is a single 3-core with two 3-components]

Exercise 7.11: One third of the edges are not reciprocated and two thirds are, so $r = \tfrac23$.

Exercise 7.12:
a) It is balanced, as one can show by exhaustively verifying that all loops contain an even number of minus signs.
b) All balanced graphs are clusterable and hence this one is too. Here are the clusters:

[Figure: the network divided into its clusters]

Exercise 7.13: The number of times the color changes as we go around a loop is equal to the number of minus signs. If this number is odd, then we change an odd number of times, meaning that we end up with the opposite color from the one we started with when we get back to the starting node. Thus the last edge around the loop will not be satisfied: either it is positive and joins unlike colors or it is negative and joins like ones. If all loops have an even number of minus signs, on the other hand, we never run into problems, and the entire network can be colored in this way. Then we simply divide the network into contiguous groups of like-colored nodes. By definition all edges within such a cluster are positive and all edges between different-colored clusters are negative. There are no edges between clusters of the same color, because if there were the clusters would be considered one large one, not two smaller ones. Hence the network is clusterable in the sense defined by Harary.

Exercise 7.14: Assume the network satisfies Davis's criterion of having no loops with exactly one negative edge. Performing the coloring as described and then adding back in the negative edges, we see that a negative edge will fall between two nodes of the same color if and only if those nodes are in the same component, meaning that they are connected by a path of positive edges. That path plus the newly added negative edge then form a loop with exactly one negative edge. But by hypothesis there are no such loops in the network and hence no negative edges can fall between nodes in the same component: they only fall between nodes in different components. Hence all edges between components are negative. Given that all edges within components are by definition positive (since this is how we constructed the components in the first place), the graph is therefore clusterable and the components of the positive-edge network are the clusters.

Exercise 7.15: This question is most simply answered in vector notation. Summing over $j$ is equivalent to multiplying by the uniform vector $\mathbf{1} = (1,1,1,\ldots)$, which gives:
\[
\boldsymbol{\sigma}\mathbf{1} = (\mathbf{D} - \alpha\mathbf{A})^{-1}\mathbf{1}
= [(\mathbf{I} - \alpha\mathbf{A}\mathbf{D}^{-1})\mathbf{D}]^{-1}\mathbf{1}
= \mathbf{D}^{-1}(\mathbf{I} - \alpha\mathbf{A}\mathbf{D}^{-1})^{-1}\mathbf{1}
= \mathbf{D}^{-1}\mathbf{x},
\]
where $\mathbf{x} = (\mathbf{I} - \alpha\mathbf{A}\mathbf{D}^{-1})^{-1}\mathbf{1}$ is the vector of PageRank scores (see Eq. (7.11)). Noting that $\mathbf{D}^{-1}$ is the diagonal matrix with the reciprocals of the degrees along its diagonal, this then completes the proof.

Exercise 7.16:
a) The numbers $e_r$ are simple: they are just the diagonal entries in the table. To get the $a_r$ we need to add the fraction of couples in which the woman is in group $r$ and the fraction in which the man is in group $r$, then divide by two (since $a_r$ is defined as the fraction of ends of edges in group $r$ and the edge corresponding to each couple has two ends). Thus we have $a_1 = (0.323 + 0.289)/2 = 0.306$, and similarly $a_2 = 0.226$, $a_3 = 0.400$, and $a_4 = 0.068$. Then the modularity is
\[
Q = 0.258 + 0.157 + 0.306 + 0.016 - 0.306^2 - 0.226^2 - 0.400^2 - 0.068^2 = 0.428.
\]
b) Applying the same approach to the second set of data gives $a_1 = 0.345$, $a_2 = 0.250$, and $a_3 = 0.395$, and the modularity with respect to political alignment is
\[
Q = 0.25 + 0.15 + 0.30 - 0.345^2 - 0.250^2 - 0.395^2 = 0.362.
\]
c) Both of these modularity values are quite high as such things go (values above $Q = 0.3$ are often considered significant), so there seems to be substantial homophily in these populations, meaning that couples tend to have similar ethnicity and similar political views significantly more often than one would expect by random chance.
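The arithmetic in Exercise 7.16 can be checked with a few lines of Python. This is only a sketch: the function name `modularity` and the variable names are illustrative choices, not part of the book, and the inputs are the $e_r$ and $a_r$ values quoted in parts (a) and (b) above rather than the original data table, which is not reproduced here.

```python
def modularity(e_diag, a):
    """Return Q = sum_r (e_r - a_r**2).

    e_diag: fraction of edges joining group r to group r (diagonal entries).
    a:      fraction of ends of edges attached to nodes in group r.
    Both names are illustrative, not taken from the book.
    """
    if len(e_diag) != len(a):
        raise ValueError("e_diag and a must have one entry per group")
    return sum(e_diag) - sum(x * x for x in a)


# Values quoted in part (a): grouping by ethnicity.
q_ethnicity = modularity([0.258, 0.157, 0.306, 0.016],
                         [0.306, 0.226, 0.400, 0.068])

# Values quoted in part (b): grouping by political alignment.
q_politics = modularity([0.25, 0.15, 0.30],
                        [0.345, 0.250, 0.395])

print(round(q_ethnicity, 3), round(q_politics, 3))  # 0.428 0.362
```

Both results agree with the values in the solution and lie above the $Q = 0.3$ threshold mentioned in part (c).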
8 The large-scale structure of networks

Exercise 8.1:
a) If you double the area of the carpet it will take about twice as long to vacuum, so the complexity is $O(n)$.
b) One finds words in a dictionary, roughly speaking, by binary search: open the book at a random point, decide whether the word you want is backward or forward from where you are, open the book at another random point in that direction, and repeat. Each time you do this you decrease the distance to the desired word by, on average, a factor of two. When the distance gets down to one word, you have found the word you want. The number $k$ of factors of two needed to do this is given by $2^k = n$, so $k = \log_2 n$ and the complexity is $O(\log n)$.

Exercise 8.2:
a) Perform a breadth-first search starting from the given node to find the distance to all other nodes, then average those distances and take the reciprocal to get the closeness centrality. The breadth-first search takes time $O(m+n)$ and the average takes time $O(n)$, so the overall running time is $O(m+n)$.
b) Use Dijkstra's algorithm. If implemented using a binary heap, the time complexity would be $O((m+n)\log n)$.
c) Use repeated breadth-first searches. Start at node 1 and perform a breadth-first search to find all the nodes in the component it belongs to. Then find the next node that is not in that component and start another breadth-first search from there to find all the nodes in its component. Repeat until there are no nodes left that are not in any of the previously discovered components. Each breadth-first search takes time $O(n_c + m_c)$, where $n_c$ and $m_c$ denote the numbers of nodes and edges in the component. Summing over all components, the total running time is $O(n+m)$.
d) You could use a truncated version of the augmenting path algorithm, in which you repeatedly find independent paths, but stop when you have found two: there is no need to keep going beyond this point if your aim is only to find two paths.

Exercise 8.3:
a) Multiplying an $n \times n$ matrix into an $n$-element vector involves $n^2$ multiplies and $n^2$ additions. So the running time for the complete calculation is $O(n^2)$.
b) Set up an $n$-element array to hold the result vector, which takes time $O(n)$, then add to it one term for each non-zero element in the adjacency matrix, of which there are $2m$. So the total running time is $O(m+n)$.
c) At first sight this calculation is going to take time $O(n^2)$ as in part (a) above. But we can do it faster by noting that the modularity matrix can be written in matrix notation as $\mathbf{B} = \mathbf{A} - \mathbf{k}\mathbf{k}^T/2m$, where $\mathbf{k}$ is the $n$-element vector with the degrees $k_i$ as its elements. When we multiply an arbitrary vector $\mathbf{v}$ by this matrix we get
\[
\mathbf{B}\mathbf{v} = \mathbf{A}\mathbf{v} - \frac{\mathbf{k}}{2m}\,\mathbf{k}^T\mathbf{v}.
\]
The first term can be evaluated in time $O(m+n)$ as in part (b). The inner product $\mathbf{k}^T\mathbf{v}$ in the second term takes time $O(n)$ to evaluate, then we simply multiply it by $\mathbf{k}/2m$, which takes a further time $O(n)$. Thus the total time for the computation is $O(m+n)$.

Exercise 8.4:
a) $O(n)$
b) On each round of the algorithm we take $O(n)$ time to find the highest-degree node, then time $O(m/n)$ to remove it (see Table 8.2 on page 228), for a total of $O(n + m/n)$ time per round. There are $n$ rounds, so total running time is $O(n^2 + m)$. (Normally $m$ is less than $n^2$, so to leading order it can be ignored and the running time is $O(n^2)$.)
c) If we use a heap we can find the highest-degree node in time $O(1)$ and remove it from the heap in time $O(\log n)$ and from the network in time $O(m/n)$, for a total time of $O(m/n + \log n)$ per round of the algorithm, to leading order. Over $n$ rounds the whole calculation then takes time $O(m + n\log n)$.
d) We place all the numbers in the heap, which takes $O(\log n)$ time per number or $O(n\log n)$ for all of them, then repeatedly find and remove the largest one. Finding the largest one takes time $O(1)$ and removing it takes time $O(\log n)$, and hence the total running time for sorting $n$ numbers is $O(n\log n)$ to leading order.
e) Make a histogram of the degrees as follows. First, create an array of $n$ integers, initially all equal to zero, which takes time $O(n)$. This array represents the bins in our histogram. Then go through the degrees one by one and for each degree $k$ increase the count in the $k$th bin by one. This also takes time $O(n)$, and at the end of the process the $k$th bin will contain the number of degrees equal to $k$. Now print out the contents of the histogram in order from largest degrees to smallest, going through each bin in turn and printing out separately each of the node degrees it contains. For instance if the $k = 5$ bin contains three nodes, print out a 5 three times. This too takes time $O(n)$. The end result will be a printed list of the degrees in decreasing order, which takes time $O(n)$ to generate. This algorithm is (a version of) radix sort.

Exercise 8.5:
a) For a network stored in adjacency list format, one normally has the degrees stored separately as well, in which case calculating their mean is simply a matter of summing them and then dividing by $n$, which takes $O(n)$ time.
b) The simplest way to calculate the median is to sort the list of degrees in either increasing or decreasing order and then find the one in the middle of the list. As discussed
in Exercise 8.4, this sorting takes either $O(n\log n)$ time or $O(n)$ time, depending on the algorithm used, and hence finding the median takes the same time.
c) One would use Dijkstra's algorithm, which has complexity $O((m+n)\log n)$.
d) This requires us to calculate the minimum node cut set, which we can do using the augmenting path algorithm. The running time is $O((m+n)m/n)$, as shown in Section 8.7.2.

Exercise 8.6:
a) If we know the true distance to every node at distance $d$ or less, then any node without an assigned distance must have distance $d+1$ or greater. If such a node is adjacent to one with distance $d$, however, then there is at least one path to it of length $d+1$, via its distance-$d$ neighbor. Hence its distance must be exactly $d+1$.
b) There must be at least one path of length $d+1$ to the node in question (let us call it node $v$). The distance along that path to the penultimate node $u$ (which is a neighbor of $v$) is then $d$, meaning that $u$ has distance no greater than $d$. But $u$ can also have distance no less than $d$, since if it did then there would be a path of length less than $d+1$ to $v$ and hence its distance would not be $d+1$. Thus $u$, which is a neighbor of $v$, must have distance $d$.

Exercise 8.7:
a) To find the diameter of a network we must perform a breadth-first search starting from each node, then take the largest distance found in any such search. Each breadth-first search takes time $O(m+n)$ and there are $n$ searches, so the total running time is $O(n(m+n))$.
b) Listing the neighbors of a node $i$ is simply a matter of running along the appropriate row of the adjacency list, which takes time of order the average length of the row, which is $\langle k \rangle$. To list the second neighbors, however, we need to list separately all the $k_j$ neighbors of each of the first neighbors $j$. But the number of ends of edges that attach to node $j$ is, by definition, $k_j$, and hence we are $k_j$ times more likely to be attached to node $j$ than to a node with degree 1. Taking the appropriate weighted average, the expected number of neighbors of a neighbor is
\[
\frac{\sum_j k_j \times k_j}{\sum_j k_j} = \frac{\langle k^2 \rangle}{\langle k \rangle},
\]
and the average total number of second neighbors is $\langle k \rangle$ times this, or $\langle k^2 \rangle$, which is also the amount of time it will take to list the second neighbors.

Exercise 8.8: To calculate the reciprocity you have to go through each edge in the network in turn, of which there are $m$, and ask whether any of the edges outgoing from the node at the tail end of the edge connect back to the node at the head. To check through all the outgoing edges takes a time $O(m/n)$ on average, so the whole calculation takes $O(m^2/n)$.

If the degrees were correlated you could have a problem, because more edges could end at nodes with high out-degree than you would expect on average. That would increase the amount of work you needed to do to check all the outgoing edges. Imagine, for instance, the extreme case of a star-like network in which half of all edges pointed inwards to a single hub node and the other half pointed outwards. Then for the $\tfrac12 m$ ingoing edges you would need to check the destinations of $\tfrac12 m$ outgoing edges for reciprocity, and the whole calculation would take time $O(m^2)$, which is much larger than $O(m^2/n)$.

Exercise 8.9:
a) $x_i = \sum_j \alpha^{d_{ij}}$.
b) Use breadth-first search to calculate and store all the distances, then run through all nodes performing the sum above.
c) The time complexity for each node is $O(m+n)$ and $O(n(m+n))$ for all nodes.

9 Network statistics and measurement error

Exercise 9.1:
a) Confusions arising because authors have the same name. Confusions arising because the same author may give their name differently on different papers. Missing papers.
b) Some pages are not reachable from the starting point. Dynamically generated pages might have to be excluded to avoid getting into infinite loops or trees.
c) Laboratory experimental error of many kinds. Missing pathways.
d) Subjectivity on the part of participants. Missing participants. Inaccurate specification of what a "friend" is.
e) Out-of-date or inaccurate maps.

Exercise 9.2:
a) The likelihood is
\[
L = \mu^n \prod_{i=1}^{n} e^{-\mu x_i}.
\]
b) The log-likelihood is $\log L = n\log\mu - \mu\sum_{i=1}^{n} x_i$, and differentiating with respect to $\mu$ and setting the result to zero we get
\[
\frac{1}{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i.
\]

Exercise 9.3: After the algorithm converges, the parameter values are $\alpha = 0.598$, $\beta = 0.202$, and $\rho = 0.415$, and the probabilities of edges existing between each pair of nodes are as follows:
[Figure: the network with an edge probability marked on each pair of nodes; the values legible in the text are 0.965, 0.442, 0.022, 0.119, 0.11(9), and 0.823, but their assignment to node pairs is not recoverable]

Exercise 9.4:
a) We are certain about every pair that is connected by an edge in any of these paths: they definitely have an edge. We are also certain about the non-existence of any edge that would shorten a path. For instance, if there were an edge between node 5 and node 7, then the shortest path to node 7 would go along that edge. Since it does not, we know that the edge does not exist. For this reason we can be sure there is no edge between the pairs (1,3), (1,4), (1,6), (1,7), (1,8), (2,7), and (5,7). The remaining pairs are all uncertain: they could have an edge or not.
b) (i) They are definitely connected if they have an edge in any of the paths. (ii) They are definitely not connected if adding an edge would create a shorter path to any node. (iii) All other edges are uncertain.

Exercise 9.5:
a) Dropping the parameters $\alpha$, $\beta$, and $\rho$ from our notation for the sake of brevity, we have
\[
\begin{aligned}
P(O_{ij} = 1) &= P(O_{ij} = 1, A_{ij} = 1) + P(O_{ij} = 1, A_{ij} = 0) \\
&= P(O_{ij} = 1 \mid A_{ij} = 1)\,P(A_{ij} = 1) + P(O_{ij} = 1 \mid A_{ij} = 0)\,P(A_{ij} = 0).
\end{aligned}
\]
b) We have:
\[
P(O_{ij} = 1 \mid A_{ij} = 1) = \alpha, \quad
P(O_{ij} = 1 \mid A_{ij} = 0) = \beta, \quad
P(A_{ij} = 1) = \rho, \quad
P(A_{ij} = 0) = 1 - \rho.
\]
Substituting all of these into the expressions above and in the question gives the required answer.
c) For the reality mining example we have $\alpha = 0.4242$, $\beta = 0.0043$, and $\rho = 0.0335$, which gives a false discovery rate of 0.226. In other words, more than one in five observed edges is actually wrong.
d) The false discovery rate is relatively large because the observations are unreliable: in the language of the reality mining study, many pairs of people who are observed in proximity do not actually have a connection in the network. Even though this is true, however, the false positive rate is still small because most people who do not have a connection are never observed in proximity. This is just because the graph is sparse: most pairs of people are never observed in proximity at all. To put that another way, in a sparse network there are very few edges, and even if all of them were false positives they would still only be a tiny fraction of all node pairs, so the value of $\beta$ would be small.

Exercise 9.6: If observations of edges are reliable, then there are no false positives: when we see an edge, it's really there. So $\beta = 0$ in this case. Substituting $\beta = 0$ into Eq. (9.29), we get
\[
Q_{ij} = \frac{\rho(1-\alpha)^N}{\rho(1-\alpha)^N + (1-\rho)(1-\beta)^N}
= \frac{\rho(1-\alpha)^N}{\rho(1-\alpha)^N + 1 - \rho}
\]
if $E_{ij} = 0$ and $Q_{ij} = 1$ otherwise. Substituting these values into Eq. (9.30) then gives
\[
\alpha = \frac{\sum_{i<j} E_{ij}}{N \sum_{i<j} Q_{ij}},
\]
with $\rho$ given by the same expression as previously and $\beta = 0$.

10 The structure of real-world networks

Exercise 10.1:
a) Every node is connected by an edge to every other, so the shortest path between any two distinct nodes has length 1, and hence the diameter is also 1.
b) The furthest points on the lattice are opposite corners. To reach one corner from the other you have to go $L$ steps along and $L$ steps down for a total of $2L$ steps, so the diameter is $2L$. The equivalent result for a $d$-dimensional hypercubic lattice is $dL$. The number of nodes on the hypercubic lattice is $n = (L+1)^d$, which implies that $L = n^{1/d} - 1$. Thus the diameter as a function of $n$ is $d(n^{1/d} - 1)$.
c) On the first step we reach $k$ nodes. On each of the subsequent $d-1$ steps the tree branches by a factor of $k-1$, so the number of nodes is multiplied by $k-1$. After $d$ steps, therefore, we reach $k(k-1)^{d-1}$ nodes. The total number $n$ of nodes reachable in $d$ steps or less is then given by the sum of this quantity over distances 1 to $d$, plus 1 for the central node:
\[
1 + \sum_{m=1}^{d} k(k-1)^{m-1} = 1 + k\sum_{m=0}^{d-1} (k-1)^m = 1 + \frac{k}{k-2}\bigl[(k-1)^d - 1\bigr].
\]
When this number is equal to $n$ we have reached the whole network, and the diameter is the side-to-side distance in the network, which is twice the corresponding value of $d$. Setting the above expression equal to $n$ and rearranging for $d$ we get
\[
\text{diameter} = 2\,\frac{\log[1 + (k-2)(n-1)/k]}{\log(k-1)}.
\]
d) The diameter of networks (a) and (c) grows logarithmically or slower with $n$ and hence these networks show the small-world effect. Network (b) does not, although one could argue that it does in the limit of large $d$.
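As a sanity check on Exercise 10.1(c), the following sketch counts the nodes of the tree shell by shell, compares the count with the closed-form sum, and then inverts it with the diameter formula. The function names are hypothetical, and the code assumes $k \ge 3$ so that $k - 2$ and $\log(k-1)$ are both nonzero.

```python
import math


def tree_size_by_shells(k, d):
    """Count nodes within d steps of the central node, one shell at a time."""
    return 1 + sum(k * (k - 1) ** (m - 1) for m in range(1, d + 1))


def tree_size_closed_form(k, d):
    """Closed form 1 + k[(k-1)^d - 1]/(k-2); the division is exact for integers."""
    return 1 + k * ((k - 1) ** d - 1) // (k - 2)


def tree_diameter(k, n):
    """Diameter formula from part (c): 2 log[1 + (k-2)(n-1)/k] / log(k-1)."""
    return 2 * math.log(1 + (k - 2) * (n - 1) / k) / math.log(k - 1)


# Example: degree k = 3 and depth d = 4 give 1 + 3 + 6 + 12 + 24 = 46 nodes,
# and the diameter formula should recover twice the depth.
n = tree_size_by_shells(3, 4)
print(n, tree_size_closed_form(3, 4), tree_diameter(3, n))
```

For fixed $k$ the diameter returned by `tree_diameter` grows only as $\log n$, which is the small-world behavior noted in part (d).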
Exercise 10.2:
a) The constant is fixed by the normalization condition $\sum_{k=0}^\infty p_k = 1$, which means that
$$1 = C \sum_{k=0}^\infty a^k = \frac{C}{1-a}.$$
Hence $C = 1-a$.
b)
$$P = \sum_{m=k}^\infty p_m = (1-a) \sum_{m=k}^\infty a^m = a^k.$$
c) There are $m$ ends of edges attached to each node of degree $m$ and there are $n p_m$ such nodes, so the total number of ends of edges attached to nodes of degree $m$ is $m n p_m$. The number of ends of edges attached to nodes of degree $k$ or greater is given by this quantity summed over $m$ thus:
$$\sum_{m=k}^\infty m n p_m = n(1-a) \sum_{m=k}^\infty m a^m = n a^k \, \frac{k - ka + a}{1-a}.$$
The total number of ends of edges is the same expression with $k = 0$, which is just $na/(1-a)$. Dividing one by the other, we then find that
$$W = a^k \bigl[ 1 - k(1 - a^{-1}) \bigr].$$
d) Eliminating $k$ between our expressions for $P$ and $W$ we then have the claimed result.
e) The unphysical values $W > 1$ all fall in the range $0 < k < 1$. However, $k$ is only allowed to take integer values, so $W$ is never greater than 1 in any real-world situation.

Exercise 10.3: $\alpha = 2.53 \pm 0.34$

Exercise 10.4: The numerator of (10.27) is
$$\sum_{ij} (A_{ij} - k_i k_j/2m) k_i k_j = \sum_{ij} A_{ij} k_i k_j - \frac{1}{2m} \sum_i k_i^2 \sum_j k_j^2 = S_e - \frac{S_2^2}{S_1},$$
where we have made use of $2m = \sum_i k_i = S_1$. Likewise, the denominator is
$$\sum_{ij} (k_i \delta_{ij} - k_i k_j/2m) k_i k_j = \sum_i k_i^3 - \frac{1}{2m} \sum_i k_i^2 \sum_j k_j^2 = S_3 - \frac{S_2^2}{S_1}.$$
Dividing numerator by denominator and multiplying top and bottom by $S_1$ then gives the required answer.

Exercise 10.5:
a) The one on the right is roughly scale-free, as we can tell because the cumulative distribution is approximately a straight line on the logarithmic scales used in the figure.
b) The slope of the line in the right-hand figure is approximately 1.1, so the exponent is $\alpha = 2.1$ (because the slope of the cumulative plot is one less than the exponent).
c) Rearranging Eq. (10.24) for $P$ gives $P = W^{(\alpha-1)/(\alpha-2)}$ and setting $W = \frac12$ and $\alpha = 2.1$ then gives $P = 4.9 \times 10^{-4}$, or about 0.05%. In other words, less than $\frac{1}{20}$ of a percent of the best connected nodes have a half of all the edge ends.

Exercise 10.6:
a) The average degree is $(n_m - 1) p_m = A (n_m - 1)^{-\beta+1}$.
b) The probability that two of your neighbors are connected within your group is just the probability that any two nodes are connected, which is $p_m = A (n_m - 1)^{-\beta}$.
c) Eliminating $n_m$ between the previous two expressions gives the required answer.
d) For the local clustering to fall off as $\langle k \rangle^{-3/4}$ we need $\beta/(1-\beta) = \frac34$ or $\beta = \frac37$.

11 Random graphs

Exercise 11.1:
a) The probability of any particular set of three nodes forming a triangle is $p^3$, and there are $\binom{n}{3}$ possible such sets. Hence the expected number of triangles in the network is
$$\binom{n}{3} p^3 = \tfrac16 n(n-1)(n-2) \frac{c^3}{(n-1)^3} \simeq \tfrac16 c^3,$$
where the approximate equality becomes exact in the limit of large $n$. Note that the appearance of triangles in different positions is not independent, since some triangles share edges, but this makes no difference to the result: the expected number of triangles is equal to the expected number in each position times the number of positions, regardless of whether the triangles are independent (because the average of a sum is equal to a sum of averages).
b) Similarly, the probability of a connected triple in any particular position is $p^2$ and the number of possible positions is the number of ways of choosing the central node of the triple then choosing two others, which is $n \times \binom{n-1}{2}$. Thus the expected number of connected triples is
$$\tfrac12 n(n-1)(n-2) \frac{c^2}{(n-1)^2} \simeq \tfrac12 n c^2.$$
c) Following Eq. (7.28), the clustering coefficient is
$$\frac{3 \times \frac16 c^3}{\frac12 n c^2} = \frac{c}{n}.$$
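As a quick numerical check of Exercise 11.1, the sketch below evaluates the exact expectations for a random graph with edge probability p = c/(n-1) and compares them with the large-n forms c^3/6, nc^2/2 and c/n. The function names are illustrative choices and do not come from the text.

```python
from math import comb


def expected_triangles(n: int, c: float) -> float:
    """Exact expected number of triangles: C(n,3) * p^3 with p = c/(n-1)."""
    p = c / (n - 1)
    return comb(n, 3) * p**3


def expected_triples(n: int, c: float) -> float:
    """Exact expected number of connected triples: n * C(n-1,2) * p^2."""
    p = c / (n - 1)
    return n * comb(n - 1, 2) * p**2


def clustering(n: int, c: float) -> float:
    """Clustering coefficient as 3 * (triangles) / (connected triples)."""
    return 3 * expected_triangles(n, c) / expected_triples(n, c)


if __name__ == "__main__":
    c = 3.0  # mean degree, chosen arbitrarily for illustration
    for n in (100, 10_000, 1_000_000):
        print(
            n,
            expected_triangles(n, c), c**3 / 6,      # exact vs. c^3/6
            expected_triples(n, c), n * c**2 / 2,    # exact vs. n c^2/2
            clustering(n, c), c / n,                 # exact vs. c/n
        )
```

With these definitions the ratio of the two exact expectations reduces to p = c/(n-1), which agrees with c/n in the limit of large n.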
Exercise 11.2:
a) $1 - S$ is the probability, averaged over all nodes, that a node does not belong to the giant component. For a node specifically of degree $k$ to not belong to the giant component all of its $k$ neighbors must not belong to the giant component, which happens with probability $(1-S)^k$.
b) By Bayes' rule, the probability $P(k|\overline{\mathrm{GC}})$ of a node having degree $k$ given that it is not in the giant component is related to the probability $P(\overline{\mathrm{GC}}|k)$ that it is not in the giant component given that it has degree $k$ thus:
$$P(k|\overline{\mathrm{GC}}) = P(\overline{\mathrm{GC}}|k) \frac{P(k)}{P(\overline{\mathrm{GC}})} = (1-S)^k \frac{e^{-c} c^k}{k!\,(1-S)} = \frac{e^{-c} c^k (1-S)^{k-1}}{k!}.$$

Exercise 11.3: Here is an example program to solve this problem, written in Python:

from math import log
from numpy import empty,zeros
from random import randrange

n = 1000000       # Number of nodes
c = 2*log(2)      # Mean degree
m = int(n*c/2)    # Number of edges

edge = empty(n,set)    # Adjacency list
for i in range(n):
    edge[i] = set()

# Place the edges, rejecting self-edges and multiedges
for k in range(m):
    i = randrange(n)
    j = randrange(n)
    while (i==j) or (i in edge[j]):
        i = randrange(n)
        j = randrange(n)
    edge[i].add(j)
    edge[j].add(i)

# Create queue and set up breadth-first search
q = empty(n,int)    # Queue array
d = zeros(n,int)    # Component labels
c = 0               # Number of components (the mean degree is no longer needed)
maxs = 0            # Largest component

for v in range(n):
    if d[v]==0:
        q[0] = v     # First node in queue
        pin = 1      # Write pointer
        pout = 0     # Read pointer
        c += 1
        d[v] = c     # Label node v

        # Main loop
        while pin>pout:

            # Pull a node off the queue
            i = q[pout]
            pout += 1

            # Check its neighbors
            for j in edge[i]:
                if d[j]==0:
                    d[j] = c
                    q[pin] = j
                    pin += 1

        # Check if this is the largest component
        if pin>maxs: maxs = pin

print("Largest component has size",maxs)

On a typical run the program prints

Largest component has size 500644

In other words it has found a value of
$$S = \frac{500644}{1000000} = 0.500644.$$
The true value is $S = \frac12$, so the absolute error is less than 0.001, or 0.1% of the network.

Exercise 11.4:
a) The average degree is given by
$$c = -\frac{\ln(1-S)}{S} = -\frac{\ln\frac12}{\frac12} = 2\ln 2 = 1.38\ldots$$
b)
$$p_5 = e^{-c} \frac{c^5}{5!} = 0.0107\ldots$$
or about 1%.
c) It is not a member of the giant component if and only if none of its five neighbors are, which happens with probability $\bigl(\frac12\bigr)^5 = \frac{1}{32}$. Thus it is a member of the giant component with probability $1 - \frac{1}{32} = \frac{31}{32} = 0.96875$, or about 97%.
d) We can use Bayes' rule thus:
$$P(k|\text{in g.c.}) = P(\text{in g.c.}|k) \frac{P(k)}{P(\text{in g.c.})} = \frac{31}{32} \times e^{-c} \frac{c^5}{5!} \times 2 = 0.0207\ldots$$
or about 2%, twice as high as the fraction in the network as a whole.

Exercise 11.5:
a) The probability of having no edges to the giant component is simply equal to the probability of not being in the giant component, which is $1 - S = e^{-cS}$ by Eq. (11.16). Alternatively, if we want a more elaborate proof, the probability of being connected to the giant component via a particular other node is the probability $p$ of having an