Table Of ContentYNIMG-08516;No.ofpages:12;4C:
NeuroImagexxx(2011)xxx–xxx
ContentslistsavailableatSciVerseScienceDirect
NeuroImage
journal homepage: www.elsevier.com/locate/ynimg
Technical Note
Comparing Dynamic Causal Models using AIC, BIC and Free Energy
⁎
W.D. Penny
WellcomeTrustCentreforNeuroimaging,UniversityCollege,LondonWC1N3BG,UK
a r t i c l e i n f o a b s t r a c t
Articlehistory: Inneuroimagingitisnowbecomingstandardpractisetofitmultiplemodelstodataandcomparethemusing
Received23June2011 a model selection criterion. This is especially prevalent in the analysis of brain connectivity. This paper
Accepted7July2011 describesasimulationstudywhichcomparestherelativemeritsofthreemodelselectioncriteria(i)Akaike's
Availableonlinexxxx
InformationCriterion(AIC),(ii)theBayesianInformationCriterion(BIC)and(iii)thevariationalFreeEnergy.
DifferencesinperformanceareexaminedinthecontextofGeneralLinearModels(GLMs)andDynamicCausal
Keywords:
Models(DCMs).WefindthattheFreeEnergyhasthebestmodelselectionabilityandrecommenditbeused
Bayesian
forcomparisonofDCMs.
Modelcomparison
Brainconnectivity ©2011ElsevierInc.Allrightsreserved.
DynamicCausalModelling
fMRI
Introduction classofmodelwedefinea‘full’anda‘nested’model,wherethenested
modelisaspecialcaseofthefullmodelwithasubsetofparameters
Mathematicalmodelsofscientificdatacanbeformallycompared settozero.Weexaminehowmodelcomparisonaccuracyvariesasa
using the Bayesian model evidence (Bernardo and Smith, 2000; function of number of data points and signal to noise ratio for the
Gelmanetal.,1995;Mackay,2003),anapproachthatisnowwidely separatecasesofdatabeinggeneratedbyfullornestedmodels.This
usedinstatistics(Hoetingetal.,1999),signalprocessing(Pennyand allowsustoassessthesensitivityandspecificityofthedifferentmodel
Roberts,2002),machinelearning(BealandGhahramani,2003),and comparison criteria. The paper uses simulated data generated from
neuroimaging(Fristonetal.,2008;Pennyetal.,2003;Trujillo-Barreto models with known parameters but these parameters are derived
et al., 2004). By comparing the evidence or ‘score’ of one model fromempiricalneuroimagingdata.Westartbybrieflyreviewingthe
relativetoanother,adecisioncanbemadeastowhichisthemore relevanttheoreticalbackgroundandthengoontopresentourresults.
veridical.Thisapproachhasnowbeenwidelyadoptedfortheanalysis
ofbrainconnectivitydata,especiallyinthecontextofDynamicCausal Methods
Modelling(DCM)(Fristonetal.,2003;Pennyetal.,2004).
Originally (Penny et al., 2004), it was proposed to score DCMs We consider Bayesian inference on data y using model m with
usingacombinationofAkaike'sInformationCriterion(AIC)andthe parameters θ. In the analysis of brain connectivity, the data would
Bayesian Information Criterion (BIC) criteria. Specifically, it was comprise,forexample,fMRItimeseriesfrommultiplebrainregions,
proposedthat(Pennyetal.,2004)ifbothAICandBICprovidedalog the model would make specific assumptions about connectivity
Bayesfactor(differenceinlogmodelevidences)ofgreaterthanthree structure, and the parameters would correspond to connections
in favour of model one versus two, one could safely conclude that strengths.Agenericapproachforstatisticalinferenceinthiscontextis
modelonewasthemoreveridical.Morerecentlyithasbeenproposed Bayesian estimation (Bishop, 2006; Gelman et al., 1995) which
(Stephanetal.,2010),ontheoreticalgrounds,toinsteadscoreDCMs provides estimates of two quantities. The first is the posterior
usingtheFreeEnergy(Fristonetal.,2007a).However,untilnowthere distribution over model parameters p(θ|m,y) which can be used to
hasbeennoempiricalcomparisonofthemodelcomparisonabilities make inferences about model parameters θ. The second is the
ofthedifferentapproaches. probability of the data given the model, otherwise known as the
Thismotivatestheworkinthispaperwhichdescribesasimulation modelevidence.Thiscanbeusedformodelcomparison,inthatratios
study comparing AIC, BIC and the Free Energy. Differences in of model evidences (Bayes factors) allow one to choose between
performance areexaminedinthecontextofGeneralLinearModels models(KassandRaftery,1995;Raftery,1995).Thispaperfocusses
(GLMs) and Dynamic Causal Models (DCMs). Specifically, for each onDynamicCausalModelsandonmodelinferenceusingAIC,BICor
FreeEnergyapproximationstothemodelevidence.Wefirstdescribe
DCM,showhowmodelparametersareestimated,describeBayesian
⁎ Fax:+442078131420. inference for General Linear Models and then go on to describe
E-mailaddress:w.penny@fil.ion.ucl.ac.uk. the different model selection criteria. In what follows N(x;m,S)
1053-8119/$–seefrontmatter©2011ElsevierInc.Allrightsreserved.
doi:10.1016/j.neuroimage.2011.07.039
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
2 W.D.Penny/NeuroImagexxx(2011)xxx–xxx
representsamultivariateGaussiandensityoverxwithmeanmand priorvarianceswhichdependonthenumberofregionsinthemodel
covarianceS,and|S|denotesthedeterminantofmatrixS. (Friston et al., 2003), and in this paper we model activity in three
regions.Fortheintrinsicself-connectionswehave
DCMforfMRI (cid:2) (cid:3)
pðA jmÞ=N A ;−1;σ2 ð3Þ
DynamicCausal Modelling is a framework for fittingdifferential ii ii self
equation models of neuronal activity to brain imaging data using
withσ =0.177.Thetimeconstantassociatedwithaself-connection
Bayesianinference.ThereisnowalibraryofDCMsandvariantsdiffer self
isτ=−1/A ,andthetimeatwhichactivitydecaystohalfitsinitial
according to their level of biological realism and the data features i ii
value(half-life)is(1/A )log0.5(Fristonetal.,2003).Thepriorover
whichtheyexplain.TheDCMapproachcanbeappliedtofunctional ii
self-connections corresponds to a prior over half-life's that can be
MagneticResonanceImaging(fMRI),Electroencephalographic(EEG),
determinedbysamplingfromp(A |m)andtransformingvariablesto
Magnetoencephalographic(MEG),andLocalFieldPotential(LFP)data ii
τ=−1/A .Thisproducesameanhalflifeofapproximately720ms
(Daunizeauetal.,2009).TheempiricalworkinthispaperusesDCM i ii
with90%ofthedistributionbetween500and1000ms.
forfMRI.
Forthoseintrinsiccrossconnectionswhicharenotfixedatzeroby
modelassumptionsmwehave
Neurodynamics
ThispaperusesDCMforfMRIwithbilinearneurodynamicsandan (cid:2) (cid:3)
extendedBalloonmodel(Friston,2002)forthehemodynamics.The pðAikjmÞ=N Aik;1=64;σc2ross ð4Þ
neurodynamicsaredescribedbythefollowingmultivariatedifferen-
tialequation whereσ =0.5.Elementsofthemodulatoryandinputconnectivity
cross
! matrices (which are not fixed at zero by model assumptions) have
z˙ = A+∑M uðjÞBj z +Cu ð1Þ shrinkagepriors
t t t t
j=1 (cid:2) (cid:3) (cid:2) (cid:3)
p Bjjm =N Bj;0;σ2 ð5Þ
ik ik s
wheretindexescontinuoustimeandthedotnotationdenotesatime
derivative.Theithentryinz correspondstoneuronalactivityinthe
t (cid:2) (cid:3) (cid:2) (cid:3)
ithbrainregion,andut(j)isthejthexperimentalinput. p C jm =N C ;0;σ2 ð6Þ
ADCMischaracterisedbyasetof‘intrinsicconnections’,A,that ij ij s
specifywhichregionsareconnectedandwhethertheseconnections
are unidirectional or bidirectional. We also define a set of input and σ =2. In the above, i and k index brain regions and j indexes
s
connections, C, that specify which inputs are connected to which experimentalinput.
regions,andasetofmodulatoryconnections,Bj,thatspecifywhich Theprior varianceparameters σ2 ,σ2 and σ2 alongwiththe
self cross s
intrinsicconnectionscanbechangedbywhichinputs.Usually,theB prior variances on hemodynamic parameters (see Appendix A)
parametersareofgreatestinterestasthesedescribehowconnections determinetheoverallpriorcovarianceonmodelparameters,Cθ(see
betweenbrainregionsaredependentonexperimentalmanipulations. next section). In the free energy model comparison criterion (see
The overall specification of input, intrinsic and modulatory below) these variances contribute to the penalty paid for each
connectivity comprise our assumptionsabout modelstructure. This parameter.
inturnrepresentsascientifichypothesisaboutthestructureofthe
large-scale neuronal network mediating the underlying cognitive Optimisation
function.Thesehypotheses,ormodelsareindexedbym.
Thesimulationsinthispaperuse‘DCM8’(availableinSPM8prior ThestandardalgorithmusedtooptimiseDCMsistheVariational
to revision 4010) with a deterministic, single-state, bilinear neuro- Laplace (VL) method described in (Friston et al., 2007a). The VL
dynamicalmodelasdescribedabove. algorithmcanbeusedforBayesianestimationofanynonlinearmodel
oftheform
Modelpredictions
InDCM,neuronalactivitygivesrisetofMRIsignalsviaanextended y=gðθÞ+e ð7Þ
Balloonmodel(Buxtonetal.,2004)andBOLDsignalmodel(Stephan
etal.,2007)foreachregion.Thisspecifieshowchangesinneuronal whereg(θ)issomenonlinearfunction,andeiszeromeanadditive
activitygiverisetochangesinbloodoxygenationthataremeasured Gaussian noise with covariance C . This covariance depends on
y
withfMRI.Theequationsforthesehemodynamicsareprovidedinthe hyperparameters λ as shown below. The likelihood of the data is
AppendixAanddependonasetofhemodynamicparametersh. therefore
Overall,theDCMparametersarecollectivelywrittenasthevector (cid:2) (cid:3)
θ={A,B,C,h}.Numericalintegrationoftheneurodynamic(Eq.1)and pðyjθ;λ;mÞ=N y;gðθ;mÞ;C ð8Þ
y
hemodynamic equations (Appendix A) leads to prediction of fMRI
activity in each brain region. These values are concatenated to
TheframeworkallowsforGaussianpriorsovermodelparameters
produceasinglemodelpredictionvectorg(θ).
pðθjmÞ=Nðθ;μθ;CθÞ ð9Þ
Priors
Thepriorsfactoriseoverparametertypes wherethepriormeanandcovarianceareassumedknown.Theerror
covariancesareassumedtodecomposeintotermsoftheform
pðθjmÞ=pðAjmÞpðBjmÞpðCjmÞpðhjmÞ ð2Þ
C−1= ∑expðλÞQ ð10Þ
y i i
i
andeachparameterpriorisGaussian.Thepriorsusedinthispaper
correspond to those used in ‘DCM8’. The priors over the intrinsic whereQ areknownprecisionbasisfunctions.Thehyperparameters
i
connectionsarechosentoencouragestabledynamics.Thisresultsin that govern these error precisions are collectively written as the
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
W.D.Penny/NeuroImagexxx(2011)xxx–xxx 3
vectorλ.Thesewillbeestimated.Additionally,thehyperparameters Comparisonofalargenumberofmodelsisbestimplementedusing
areconstrainedbytheprior thefullposteriordensity,p(m|y),asdescribedin(Pennyetal.,2010).
pðλjmÞ=Nðλ;μλ;CλÞ ð11Þ Freeenergy
Theabovedistributionsallowonetowritedownanexpressionfor Itispossibletoplacealowerboundonthelogmodelevidenceof
thejointloglikelihoodofthedata,parametersandhyperparameters thefollowingform(Beal,2003)
pðy;θ;λjmÞ=pðyjθ;λ;mÞpðθjmÞpðλjmÞ ð12Þ
logpðyjmÞ=FðmÞ+KL½qðθ;λjmÞjjpðθ;λjy;mÞ(cid:2) ð17Þ
TheVLalgorithmthenassumesanapproximateposteriordensity
ofthefollowingfactorisedform where F(m) is known as the negative variational free energy
(henceforth‘FreeEnergy’)andthelasttermistheKullback–Liebler
qðθ;λjy;mÞ=qðθjy;mÞqðλjy;mÞ distance between the true posterior density, p(θ,λ|y,m) and an
qðθjy;mÞ=Nðθ;mθ;SθÞ ð13Þ approximate posterior q(θ,λ|m). Because KL is always positive
qðλjy;mÞ=Nðλ;mλ;SλÞ (Mackay,2003),F(m)providesalowerboundonthemodelevidence.
TheFreeEnergyisdefinedas
Theparametersoftheseapproximateposteriorsaretheniteratively (cid:4) (cid:5)
updated so as to minimise the Kullback–Liebler (KL)-divergence FðmÞ=∫∫qðθ;λjy;mÞlog pðy;θ;λjmÞ dθdλ ð18Þ
between the true and approximate posteriors. This algorithm is qðθ;λjy;mÞ
describedfullyin(Fristonetal.,2007a).
We emphasise here that the Variational Laplace framework and canbeestimatedusing aLaplaceapproximation (Fristonetal.,
assumes that the prior means and covariances (μθ,Cθ,μλ,Cλ) are 2007a), FL(m), as derived in Appendix B. As noted in (Wipf and
known.Theyarenotestimatedfromdata,asisthecaseforEmpirical Nagarajan,2009),becausetheLaplaceapproximationisnotexactly
Bayesmethods(CarlinandLouis,2000).Wewillreturntothisissuein equaltotheFreeEnergy,theabovelowerboundpropertynolonger
thediscussion. holds.Thatis,theLaplaceapproximationdoesnotlowerboundthe
logmodelevidence.Asweshallsee,however,itneverthelessprovides
HyperparametersinDCMforfMRI averyusefulmodelcomparisoncriterion.TheLaplaceapproximation
InDCMforfMRItheprecisionbasisfunctionsQ,definedinEq.(10), totheFreeEnergyisgiveninEq.(57)andcanbeexpressedasasumof
i
aresettoQ=Iforeachbrainregion.Thequantityγ=exp(λ)therefore accuracyandcomplexityterms(Beal,2003)
i i i
correspondstothenoiseprecisioninregioni.
F ðmÞ=AccuracyðmÞ−ComplexityðmÞ ð19Þ
The overall error covariance matrix C has a block structure L
y
correspondingtotheassumptionthatobservationnoiseisindepen-
dentandidenticallydistributedineachregion.Thisisvalidastime AccuracyðmÞ=−1eTC−1e −1logjC j−Nlog2π ð20Þ
2 y y y 2 y 2
series data are usually pre-whitened before entering into a DCM
analysis(Fristonetal.,2003).Thepriormeanandcovarianceofthe
associatedlatentvariablesaresetto ComplexityðmÞ= 12eTθCθ−1eθ+ 12logjCθj−12logjSθj ð21Þ
Cμλλ==01 ð14Þ + 12eTλCλ−1eλ+ 12logjCλj−12logjSλj
whereNisthenumberofdatapointsandthe‘errorterms’are
This corresponds to the assumption that the mean prior noise
precision, γi=1:7. These values, along with the priors on the ey=y−gðmθÞ
neurodynamicparameters,havebeensetsoastoproducedatasets eθ=mθ−μθ ð22Þ
withtypicalsignaltonoiseratiosencounteredinfMRI. eλ=mλ−μλ
Modelevidence ThefirstrowofEq.(21)isthecomplexitytermfortheparameters
andthesecondrowthecomplexitytermforthehyperparameters.If
Themodelevidence,alsoknownasthemarginallikelihood,isnot the hyperparameters are known then the last row of Eq. (21)
straightforwardtocompute,sinceitscomputationinvolvesintegrat- disappears.Inthiscasewecanwritethecomplexityas
ingoutthedependenceonmodelparameters
pðyjmÞ=∫∫pðyjθ;λ;mÞpðθjmÞpðλjmÞdθdλ: ð15Þ ComplexityðmÞ= 12eTθCθ−1eθ+ 12logjjCSθθjj ð23Þ
The following sections describe Free Energy, AIC and BIC Inthelimitthattheposteriorequalstheprior(eθ=0,Cθ=Sθ),the
approximationstothe(log)modelevidence.Oncetheevidencehas complexitytermequalszero.ThelastterminEq.(23),12logjjCSθθjj,isalso
beencomputedmodelsm andm canbecomparedusingtheBayes referred to as an Occam factor (see page 349 in (Mackay, 2003)).
1 2
factor Because the determinant of a matrix corresponds to the volume
spanned by its eigenvectors, this Occam factor gets larger and the
B = pðyjm1Þ ð16Þ model evidence smaller as the posterior volume, |Sθ|, reduces in
12 pðyjm2Þ proportion to the prior volume, |Cθ|. Models for which parameters
have to be specified precisely (small posterior volume) are brittle.
withavalueof20correspondingtoaposteriorprobabilityofgreater Theyarenotgoodmodels(complexityishigh).
than0.95infavourofmodelm .ThecorrespondinglogBayesfactor The above considerations also apply to cases where hyperpara-
1
is3.TheuseofBayesfactorsformodelcomparisonisdescribedmore metersareestimated.Thereisanadditionalcomplexityterm(lastline
fully elsewhere (Kass and Raftery, 1995; Penny et al., 2004). ofEq.21)andthereforeanadditionalOccamfactor.
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
4 W.D.Penny/NeuroImagexxx(2011)xxx–xxx
Correlatedparameters TheseparametervaluescanthenbepluggedintoEqs.(19)to(22)
Otherfactorsbeingequal,modelswithstrongcorrelationinthe to compute the Free Energy. If the hyperparameters are assumed
posteriorarenotgoodmodels.Forexample,givenamodelwithjust knownthentheFreeEnergyexpressioninEq.(19)isexactlyequalto
twoparametersthedeterminantoftheposteriorcovarianceisgiven thelogmodelevidence.Thatis,F(m)=logp(y|m).Wewillrevisitthis
L
by caseintheResultssection.Ifthehyperparametersareestimatedthen
(cid:2) (cid:3) theFreeEnergyprovidesaverycloseapproximation,asconfirmedby
jSθj = 1−r2 σθ2σθ2 ð24Þ samplingmethods(Fristonetal.,2007a).
1 2
where r is the posterior correlation, σθ and σθ are the posterior AICandBIC
1 2
standard deviations of the two parameters. For the case of two
parametershavingasimilareffectonmodelpredictionstheposterior Asimpleapproximationtothelogmodelevidenceisgivenbythe
correlationwillbehigh,thereforeimplyingalargecomplexitypenalty. BayesianInformationCriterion(Schwarz,1978)
However, there is another factor at play. This is that neither
parameterwillbeestimatedaccurately(theposteriorvarianceswill
p
behigh).Thissecondfactorcanoffsetthehighercomplexitydueto BIC=AccuracyðmÞ− logN ð28Þ
2
correlationandcanleadtoasituationinwhichadditionalextraneous
parameters will not lead to a significant drop in free energy. One
would then appeal to a further Occam's Razor principle (Mackay, wherepisthenumberofparameters,andN isthenumberofdata
2003), namely, that in the absence of significant free energy points.TheBICisaspecialcaseoftheFreeEnergyapproximationthat
differencesoneshouldpreferthesimplermodel(seeDiscussion). dropsalltermsthatdonotscalewiththenumberofdatapoints(see
WhenfittingDCMstofMRIdataitislikelythatsomeparameters e.g. Appendix A2 in (Penny et al., 2004) for a derivation). This is
willbecorrelatedwitheachother.Thiscorrelationcanbeexamined equivalenttothestatementthatBICisequaltotheFreeEnergyunder
bylookingattheposteriorcovariancematrixSθ.Agoodexampleof theinfinitedatalimit,andwhenthepriorsoverparametersareflat,
this is provided in Fig. 6 of Stephan et al. (2007) who describe andthevariationalposteriorisexact(seesection2.3in(Attias,1999)
posteriorcorrelationsamonghemodynamicandconnectivityparam- and page 217in (Bishop,2006)).In practise,as we shall see, these
eters. Importantly, these correlations are accomodated in the Free threerequirementsarealmostnevermetandBICwillproducemodel
Energy model comparison criterion (see Eq. 23 and above). This is comparisons that are often very different to those from the Free
possiblebecauseVariationalLaplacedoesnotassumethatparameters Energy.
areaposterioriindependentamongthemselves,ratheritisassumed An alternative model selection criterion is Akaike's Information
thattheparametersare aposterioriindependentofthehyperpara- Criterion(or‘AnInformationCriterion’)(Akaike,1973)
meters(seeEq.13).
Decompositions AIC=AccuracyðmÞ−p ð29Þ
It is instructive to decompose approximations to the model
evidence into contributions from specific sets of parameters or
predictions.InthecontextofDCM,onecandecomposetheaccuracy AICisnotaformalapproximationtothemodelevidencebutderives
terms into contributions from different brain regions, as described from information theoretic considerations. Specifically, AIC model
previously(Pennyetal.,2004).Thisenablesinsighttobegainedinto selectionwillchoosethatmodelinthecomparisonsetwithminimal
whyonemodelisbetterthananother.Itmaybe,forexample,thatone expected KL divergence to the true model (Akaike, 1973; Burnham
modelpredictsactivitymoreaccuratelyinaparticularbrainregion. andAnderson,2002).Thereareprecedentsintheliterature,however,
Similarly, it is possible to decompose the complexity term into forusingitasasurrogateforthemodelevidence,inordertoderivea
contributions from different sets of parameters. If we ignore posteriordensityovermodels(BurnhamandAnderson,2004)(Penny
correlation among different parameter sets then the complexity is etal.,2004).
approximately The AIC criterion has been reportedto performpoorly for small
numbers of data points (Brockwell and Davis, 2009; Burnham and
!
ComplexityðmÞ≈12∑j eTθjCθ−j1eθj+logjjCSθθjjj ð25Þ Anderson,2004).Thishasmotivatedtheinclusionofacorrectionterm
j
pðp+1Þ
AICc=AIC− ð30Þ
wherejindexesthejthparameterset.InthecontextofDCMthese N−p−1
could index input connections (j=1), intrinsic connections (j=2),
modulatoryconnections(j=3)etc.Wewillseeanexampleofthisin
theResultssection. knownasthe‘corrected’AIC(AICc)(HurvichandTsai,1989).TheAICc
criterion thus penalises parameters more than does AIC. The two
GeneralLinearModels criteria become approximately equal for N>p2 and identical in the
ForGeneralLinearModels(GLMs)modelpredictionsaregivenby limitofverylargesamplesizes.Wenote,however,thatforN<p+1
thedenominatorinthecorrectiontermbecomesnegativeandAICc
gðθÞ=Xθ ð26Þ penalisesparameterslessthandoesAIC.Intheempiricalworkinthis
paperwethereforeavoidthis(highlyunlikely)regime.
whereXisadesignmatrixandθarenowregressioncoefficients.The InapplicationsofAIC and BICtoDCMs(Pennyetal., 2004),the
posteriordistributionisanalyticandgivenby(Bishop,2006) estimatedparametersaretakentobeequaltotheposteriormeansmθ
andmλ.AICandBICareusefulapproximationsbecauseoneonlyneeds
to quantify the fit of the model to provide an estimate of the log-
Sm−θθ1==SXθ(cid:2)TXCy−TC1y−X1+y+Cθ−C1θ−1μθ(cid:3) ð27Þ eappvaiprdaremonxecitmee.raAtiinIoCnthaneindmBtohIdCaetla.rtehequsaalmitaetivfiexlyeddipffeenreanlttytoisthpeaifdreeforeneeargchy
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
W.D.Penny/NeuroImagexxx(2011)xxx–xxx 5
Results therefore different, with the full model design matrix having 12
regressorsandthenestedmodelhaving9regressors.
Linearmodels Estimatedregressioncoefficients,θˆ,andnoisevarianceestimates,
σ̂ =0.73 were extracted for a voxel showing a significant overall
e
We first compare the different approximations to the model response to faces (i.e. over all conditions). The corresponding fMRI
evidenceusingBayesianGLMs.Wedefinetheseusingthefollowing time series comprised N=351 values. We then created simulated
priorandlikelihood databasedonthisobservedfMRIdataasfollows.
First, we estimated the deviation of the fitted regression co-
pðypðjθθÞÞ==NNð(cid:2)θy;;μXθθ;;CCθÞ(cid:3) ð31Þ eesffiticmieanttiosnabwouastzbearsoedanodnspetartahmepetreiorrfiStDsftrootmhisdavtaaluaet,σaps=ing6l.e05v.oTxheils.
y Theuseofacommonσ valueforallregressioncoefficientsimplies
p
thattheeffectsareofsimilarmagnitudeforallfourconditionsandall
whereθisthe[p×1]vectorofregressioncoefficients,yisthe[N×1]
threetemporalbasisfunctions,andisareasonableassumption.We
vectorofdatapoints,Xisthe[N×p]designmatrix,andfortheprior then computed <σ >, the average signal standard deviation when
meanwehaveμθ=0.Fortheworkinthispaperweassumeisotropic drawingparametersythepriorp(θ).
covariancematrices
WethenproducedsimulateddatasetswheretheSignaltoNoise
ratio
Cθ=σp2Ip ð32Þ <σ >
C =σ2I SNR= y ð33Þ
y e N σ
e
where σp and σe are the standard deviations of the prior and was set to a range of values by choosing an appropriate σe. SNR
observationerror.Weassumethattheseparametersareknown. definedinthismannercanberelatedtotheproportionofvariance
We compare Bayes factors based on AIC, BIC and FL for nested explainedbythemodel,asshowninAppendixC.TheobservedfMRI
GLMsderivedfromanfMRIstudy.ThefMRIdatasetwascollectedto datahaveavalueofSNR=1.3.
studyneuronalresponsestoimagesoffacesandisavailablefromthe Eachsimulateddatasetwasthengeneratedbydrawingregression
SPM web site (http://www.fil.ion.ucl.ac.uk/spm/data/face_rep/ coefficients from their prior densities, producing model predictions
face_rep_SPM5.html.). Each face was presented twice, and faces g=Xθ (for both full and nested models) and adding zero mean
eitherbelongedtofamiliarorunfamiliarpeople.Thisgaverisetofour Gaussiannoisewithvarianceσ2.
e
conditions,eachofwhichwasmodelledwith3hemodynamicbasis Wethenfittedbothfullandnestedmodelstoeachsimulateddata
functions(Fristonetal.,2007b).Forafulldescriptionofthisdataset setandestimatedBayesfactorsusingAIC,BICandF.Thesecriteria
L
andWsiemfiirlastrdaenfianlyeseas‘nseesete(dH’emnsoodneleitnawl.,h2ic0h0o2n).ly3oftheseconditions wseecrteioncoinmtpouEtqe.d(2b7y)fsourbcsotimtuptuintgingX,thCey,pCoθs,tearniodrμmθeaasndaenfidnceodvairniatnhcies
aarsecomnotadienlilnedg,arneseuxlttriang3rineg9rersesgorressfsroorms.Wtheeathddenitidoenfianlceoand‘fiutlilo’nm(ofidrestl fwoerrleintehaernmcoomdeplus.teTdhefropmredEiqcsti.o(n22e)rraonrds,(e2y6,)a.nWdepcaorualmdettheernecroromrps,uetθe,
responsetounfamiliarfaces).Fig.1showsthedesignmatrixforthe the accuracy and complexity terms using Eqs. (20) and (21) (the
full model. The design matrices for the full and nested models are complexitytermsforλwereignoredastheobservationnoisevariance
wasknownforthesesimulations).
Fig.2showsresultsfordatadrawnfromthefullmodel.Thefigure
plots the log Bayes factors (differences in log model evidence) at
variousvaluesofSNR,whereeachpointineachcurvewasaveraged
over 1000 simulated data sets. At low SNRs, experimental effects
shouldbeimpossibletodetect.ThisisreflectedintheFreeEnergylog
Bayesfactorwhichcorrectlyasymptotestoavalueofzero,indicating
neithermodelispreferred.Inthisregime,however,BICandtoalesser
extentAICbothincorrectlyfavourthenestedmodel.Theerrorbarson
the plots (not shown) are extremely tight in this regime, being ±
0.0001, ±0.09 and ±0.35 for SNRs of 0.0025, 0.029 and 0.055
respectively(averagedoverthethreecriteria).Thismeanswecanbe
highlyconfidentthatF isunbiasedbutthatAICandBICarebiassed
L
towardsthenestedmodel.
Theaboveprocedurewasthenrepeatedbutthistimegenerating
datafromthenestedmodel.TheresultsareshowninFig.3(notethe
broader range of SNRs plotted). In the low SNR regime, model
comparisonshouldagainbeimpossible.Thisiscorrectlyreflectedin
theF criterionwithalogBayesfactorapproachingzero,butnotsoin
L
theAICorBICcriteria.
Finally,weexaminedthedependenceofmodelcomparisononthe
numberofdatapoints,N.WevariedNover20valuesbetween32and
512 with 1000 replications at each value, using SNR=0.5 (results
werequalitativelysimilarforotherSNRs).Theresultsareshownin
Fig. 4 for data generated from the full model. As expected, Bayes
factorsincreasewiththenumberofdatapoints.Thefreeenergy,AIC
Fig.1.DesignmatrixforthefullGLM.ThenestedGLMusesanidenticaldesignmatrix
andAICcshowverysimilarperformancewithF beingslightlybetter
butwiththefirstthreecolumnsremoved.ThefulldesignmatrixcomprisesN=351 L
atlowNandAIC/AICcathighN.TheBICcriterionisbiassedtowards
rows,oneforeachfMRIscan,andtwelvecolumns,oneforeachputativeexperimental
effect. thenestedmodel.
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
6 W.D.Penny/NeuroImagexxx(2011)xxx–xxx
Fig.2.LogBayesfactoroffullversusnestedmodel,LogBf,n,versusthesignaltonoiseratio,SNR,whenthetruemodelisthefullGLMforFL(black),AIC(blue)andBIC(red).(For
interpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
Fig.5showstheresultsfordatageneratedfromthenestedmodel. wehavelinearmodels,thelastrequirementismetbutthepriorover
The Bayes factors from the free energy and BIC increase with the parameters is Gaussian, rather than flat. A data set comprising 512
numberofdatapoints,whereasthisisnotthecaseforAICandAICc. points is about the maximum one could hope to get from a single
WeseethatAICandAICcareequivalentforlargesamplesizes.For sessionoffMRIscanning(approximately17minwithaTRof2s).We
small sample sizes AICc pays a larger parameter penalty. This is thereforeconclude that for neuroimaging applicationsBIC and Free
beneficialwhenthenestedmodelistrue(Fig.5)butnotwhenthefull Energyarelikelytogivedifferentresults.
model is true (Fig. 4). Overall, we do not see a good reason for
favouring AICc over AIC and so exclude it from subsequent model
comparisons. DCMforfMRI
Theory(Attias,1999)tellsusthatBICshouldconvergetotheFree
Energyforlargesamplesizes.However,thisisonlythecaseforflat We now compare the model comparison criteria using DCM for
priorsoverparametersandifthevariationalposterioriscorrect.As fMRI.WegeneratedatausingsyntheticDCMswithknownparameter
Fig.3.LogBayesfactorofnestedversusfullmodel,LogBn,f,versusthesignaltonoiseratio,SNR,whenthetruemodelisthenestedGLMforFL(black),AIC(blue)andBIC(red).
(Forinterpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
W.D.Penny/NeuroImagexxx(2011)xxx–xxx 7
Fig.4.LogBayesfactoroffullversusnestedmodel,LogBf,n,versusthenumberofdatapoints,N,whenthetruemodelisthefullGLMforFL(black),AIC(blue),BIC(red)andAICc
(green).(Forinterpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
values. However, to ensure the data are realistic we use parameter (Leffetal.,2008).Thetimeserieswhichweremodelledinthisstudy
valuesthatwereestimatedfromneuroimagingdata. compriseN=488datapointsineachofthreebrainregions.
Thisdataderivefromapreviouslypublishedstudyonthecortical WefocusonjusttwoofthemodelsconsideredbyLeffetal.(Leff
dynamicsofintelligiblespeech(Leffetal.,2008).Weuseddatafroma et al., 2008). These are a ‘nested’ model, which has full intrinsic
single representative subject. This study applied DCM for fMRI to connectivity with auditory input, u , entering region P, and a
aud
investigate activity among three key multimodal brain regions: the modulatoryconnectionfromregionPtoF(thisallowsregionFtobe
left posterior and anterior superior temporal sulci (subsequently differentiallyresponsivetointelligibleversustime-reversedspeech).
referredtoasregionsPandArespectively)andparsorbitalisofthe Wealsodefinea‘full’modelwhichisidenticalbuthasanadditional
inferiorfrontalgyrus(regionF).Theaimofthestudywastoseehow modulatoryconnectionfromregionPtoA(b —seebelow).Thetwo
AP
connectionsamongregionsdependedonwhethertheauditoryinput networksareshowninFig.6.Thetwomodelsdifferinonlyasingle
was intelligible speech or time-reversed speech. Full details of the connectionandwechosetheseverysimilarmodelstomakemodel
experimental paradigm and imaging parameters are available in comparisonaschallengingaspossible.
Fig.5.LogBayesfactorofnestedversusfullmodel,LogBn,f,versusthenumberofdatapoints,N,whenthetruemodelisthenestedGLMforFL(black),AIC(blue),BIC(red)andAICc
(green).(Forinterpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
8 W.D.Penny/NeuroImagexxx(2011)xxx–xxx
fitted both full and nested models to each simulated data set and
estimatedBayesfactorsusingAIC,BICandF.
L
Fig.7showsresultsfordatadrawnfromthefullmodel.Thefigure
plots the log Bayes factors (differences in log model evidence) at
variousvaluesofSNR,whereeachpointineachcurvewasaveraged
over50simulateddatasets.FortheseDCMsimulations,theaveraging
wasimplementedusingthemedianoperator(ratherthanthemean)
Fig.6.Anested(left)andfull(right)DCM.ThefullDCMisidenticaltothenestedDCM
astheresultsweremorevariablethanfortheGLMcase.Thecurvesin
exceptforhavinganadditionalmodulatoryforwardconnectionfromregionPtoregion
A.Intrinsicconnectionsareindicatedbydottedarrows,modulatoryconnectionsby Fig. 7 show that only the Free Energy criterion is able to correctly
overlaidsolidarrowsandinputsbysolidsquareswithanarrow. identifythefullmodel.
Theaboveprocedurewasthenrepeatedbutthistimegenerating
Mathematically,neurodynamicsevolveaccordingto data from the nestedmodel. Again, each point in each curve is the
medianvalueover50simulateddatasets.Theresultsareshownin
2 3 02 3 2 312 3
z˙ a a a 0 0 0 z Fig.8(notethebroaderrangeofSNRsplotted).
4z˙P5=@4aPP aPF aPA5+u 4b 0 05A4zP5 Theresultsondatafromthenestedmodelareverysimilartothose
F FP FF FA int FP F
z˙ a a a b 0 0 z fortheGLMcase(compareFigs.3and8).Theresultsfordatafromthe
A AP AF AA AP A
2 3 ð34Þ fullmodel,however,arenot(compareFigs.2and7),asAICandBIC
4cP5 are unableto correctly identifythe fullmodel evenat high SNR.In
+uaud 0 ordertofindoutwhythisisthecaseweexaminedDCMsfittedtodata
0
at SNR=2, and examined the relative contributions to the model
evidence,asdescribedinDecompositionssection.
whereu isatrainofauditoryinputspikes,u indicateswhether ForthishighSNRscenariowefound,slightlytooursurprise,that
aud int
theinputisintelligible(Leffetal.,2008),a denotesthevalueofthe thefullDCMswereonlyslightlymoreaccuratethanthenestedDCMs.
AF
intrinsicconnectionfromregionFtoA,b andb arethestrengths Unsurprisingly, this increase in accuracy was realised in region A,
FP AP
ofthetwomodulatoryconnections,andc isthestrengthoftheinput which receives modulatory input in the full but not in the nested
P
connection.ForthenestedDCMwehaveb =0. model(seeFig.6).However,themainquantitydrivingthedifference
AP
Wefirstgenerateddatasetsfromthefullmodeloverarangeof inFreeEnergybetweenfullandnestedDCMswasnottheaccuracy
SNRsasfollows.TobestreflecttheempiricalfMRIdataallparameters butratherthecomplexity.
otherthanthemodulatoryparameterswereheldconstant.Foreach ItturnsoutthatthenestedDCMsareabletoproduceareasonable
simulateddatasetthemodulatoryparameterswerefirstdrawnfrom data fit by using a very large value for the intrinsic connection, a
AF
their prior densities (see Eq. 5). Additionally, the modulatory (fromregionFtoA).Thisconnectionvalue(typically1.5)wasabout5
parameters were then constrained to be positive (by taking the timesbiggerthanthevalueforafullDCM(typically0.3).Thismakes
absolutevalue)sothatmodulatoryeffectswouldbefacilitating. sense because, in the nested model, the connection from P to F is
Synthetic fMRI data was then generated by integrating the modulatedbyintelligibility,andbyfacilitatingtheintrinsicconnec-
neurodynamicandhemodynamicequationsandaddingobservation tionfromFtoAthis‘modulatorysignal’ispassedontoregionA.Since
noisetoobtainthetargetSNR.TheSNRwasdefinedinthesameway thismodulationisofanadditivenature,thisthereforecrudelymimics
asforthelinearmodels,butwiththesignalstandarddeviation,<σ >, adirectmodulationofthePtoAconnection.However,suchastrong
y
averaged over the three predicted time series (one for each brain intrinsicconnectionfromFtoAisa-prioriunlikely(thepriorisazero-
region).TheobservedfMRIdatahaveavalueofSNR=0.2.Wethen mean Gaussian, with standard deviation σ =0.5). The nested
cross
Fig.7.LogBayesfactoroffullversusnestedmodel,LogBf,n,versusthesignaltonoiseratio,SNR,whenthetruemodelisthefullDCMforFL(black),AIC(blue)andBIC(red).
(Forinterpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
W.D.Penny/NeuroImagexxx(2011)xxx–xxx 9
Fig.8.LogBayesfactorofnestedversusfullmodel,LogBn,f,versusthesignaltonoiseratio,SNR,whenthetruemodelisthenestedDCMforFL(black),AIC(blue)andBIC(red).(For
interpretationofthereferencestocolourinthisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)
models are therefore heavily penalised for having such unlikely lackthesensitivitythatisrequired,inthiscase,toinferthatdatawas
parametervalues(beingthreestandarddeviationsawayfromtheir drawnfromthefullmodel.
prior means). Only the Free Energy criterion is sensitive to such Weemphasisethatthiswillnotalwaysbethecase,andAIC/BIC
subtleties because AIC and BIC pay the same penalty for each can in general be sensitive to ‘full model’ effects in DCMs. This is
parameter,regardlessofmagnitude. demonstrated,forexample,inourpreviouswork(Pennyetal.,2004).
Asmentionedabove,theempiricalSNRforthisdataisSNR=0.2 However, if prior information about parameter values is available
whichisverylow.FittingthefullandnestedDCMstothisdatayielded then it should be used, and can be used to good effect in the Free
aFreeEnergydifferenceofonly0.11(infavourofthefullDCM).This Energycriterion.
differenceisnegligible,andpointstothedifficultyofmodelinference ItmayalsobearguedthatintheapplicationinthispaperAICand
for very similar models and at low SNR (as exemplified by Figs. 7 BICareimplicitlyusingpriorinformationinthattheaccuracytermis
and8).Inthisregimeitmaybeabetterideatomakeinferencesover computedatthemaximumposteriorvalue.Beingaposteriorestimate
families of models (Penny et al., 2010) and to look for consistent thisisnaturallyconstrainedbytheprior.Toavoidthisonewouldhave
differencesoveragroupofsubjects(Stephanetal.,2009). to implement a separate Maximum Likelihood optimisation. Given
thisfact,itthereforeseemsconsistenttoalsousepriorinformation
whenapproximatingtheevidence.
Discussion AccordingtoconventionsinBayesianstatistics(KassandRaftery,
1995), and as stated above, models can be considered clearly
We have described a simulation study which compared the distinguishable once the log Bayes factor exceeds three. The
relativemeritsofAIC,BICand FreeEnergymodelselection criteria. simulation results for both GLMs and DCMs show smaller Bayes
Differences in performance were examined in the context of GLMs factors when the true model is nested rather than full. This is
and DCMs and we found that the Free Energy has the best model particularlypronouncedforthe(challenginglysimilar)DCMsexam-
selection ability and recommended it be used for comparison of inedinthispaperforwhichtheFreeEnergyonlyachievesaLogBayes
DCMs. Similar conclusions have been reached in earlier work FactorofthreeatanSNRof10.Insuchacase,modellersandimaging
comparing Free Energy with BIC in the context of non-Gaussian neuroscientistsshouldappealtoasecondOccamprinciple(Mackay,
autoregressive modelling (Roberts and Penny, 2002) and Hidden 2003),notthenumericaloneembeddedintheequationfortheFree
MarkovModelling(ValenteandWellekens,2004). Energy,butaconceptualonethatwhentwomodelscannotbeclearly
TheGLMsimulationresultsshowedthat,atlowSNR,AICandBIC distinguishedoneshouldpreferthesimplerone.
incorrectlyselectednestedmodelswhendataweregeneratedbyfull In previous work (Penny et al., 2004) we have advocated the
models. At higher SNR, however, this bias disappeared and AIC/BIC combineduseofAICandBICcriteriaforthecomparisonofDCMs.This
showed increased sensitivity. We also investigated a corrected AIC wasmotivatedbyaconcernabouthowFreeEnergymodelinference
criterionbutthisshowednobenefitoverthestandardAICmeasure. dependsonthechosenvaluesofthepriormeansandvariances(see
TheDCMsimulationresultsshowedthatonlytheFreeEnergywas earlier section on priors). Specifically, the values σ , σ and σ
self cross s
able to correctly detect that data had been generated from the full implicitly set the penalty paid for intrinsic, modulatory and input
model.BydecomposingtheFreeEnergydifferenceintocontributions parameters(asgovernedbyEq.(21)viatheoverallpriorcovariance
fromdifferentregionsandparameters,wefoundthatthisabilitywas matrixCθ.).
mainlyduetopenalisingthenestedmodelforhavingaverylarge,and This therefore motivates the future application of an Empirical
a-priori unlikely, intrinsic connection from brain region F to A. Bayes(CarlinandLouis,2000)approachwhichwouldestimatethese
Because AIC and BIC use the same complexity penalty for every variance parameters from data. This would effectively perform a
parameter, and one that is not matched to prior expectations, they search in the continuous space of prior variances instead of the
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
10 W.D.Penny/NeuroImagexxx(2011)xxx–xxx
discretespace(e.g.,nestedversusfull)examinedinthispaper.Such TheonlyfreeparameteroftheBOLDsignalmodelis(cid:2),theratioof
anapproachcanbeimplementedwithinthenewframeworkofpost- intra-toextra-vascularsignal.Togethertheaboveequationsdescribe
hocmodelselection(FristonandPenny,2011). a nonlinear hemodynamic process and BOLD signal model that
convertneuronalactivityintheithregion,z,tothefMRIsignal,y.
i i
Acknowledgments
A.2.Priors
ThisworkwassupportedbytheWellcomeTrust.WethankGareth
Barnes, Guillaume Flandin, Karl Friston, Vladimir Litvak, Klaas Theunknownparametersare{κ,τ,(cid:2)}.Thesearerepresentedas
i i
Stephan, Maria Rosa and Ged Ridgway for useful feedback on this (cid:2) (cid:3)
work. κi=0:64e(cid:2)xp(cid:3)θκi
AppendixA.Hemodynamics τi=2exp θτi ð38Þ
(cid:2)=expðθ(cid:2)Þ
InDCM,neuronalactivitygivesrisetofMRIactivitybyadynamic
andwehaveGaussianpriors
processdescribedbyanextendedBalloonmodel(Buxtonetal.,2004)
(cid:2) (cid:3) (cid:2) (cid:3)
and BOLD signal model (Stephan et al., 2007) for each region. This
specifies how changes in neuronal activity give rise to changes in p(cid:2)θκi(cid:3)= =N(cid:2)θκi;0;0:135(cid:3)
bloodoxygenationthataremeasuredwithfMRI. p θτ = =N θτ;0;0:135 ð39Þ
i i
The hemodynamic model involves a set of hemodynamic state pð(cid:2)Þ= =Nð(cid:2);0;0:135Þ
variables,stateequationsandhemodynamicparameters.Fortheith
region,neuronalactivityz(i)causesanincreaseinvasodilatorysignal where h={θκ,θτ,(cid:2)} are the hemodynamic parameters to be
s that is subject to autoregulatory feedback. Inflow f responds in estimated. i i
i i
proportiontothissignalwithconcomitantchangesinbloodvolumev
i
anddeoxyhemoglobincontentqi. AppendixB.Laplaceapproximation
s˙ =zðiÞ−κs−γðf−1Þ In what follows we have simplified notation by dropping the
i i i i
˙ dependence on model m. The negative variational free energy
f =s
i i (henceforth‘FreeEnergy’)isdefinedas
τiv˙i=fi−v1i=α ð35Þ (cid:4) (cid:5)
τiq˙i=fiEðfρi;ρÞ−vi1=αqvi F=∫∫qðθjyÞqðλjyÞlog qðpθðjyyÞ;qθð;λλÞjyÞ dθdλ ð40Þ
i
where
Outflowisrelatedtovolumef =v1/αthroughGrubb'sexponent
out
α(Fristonetal.,2003).Theoxygenextractionisafunctionofflow pðy;θ;λÞ=pðyjθ;λÞpðθÞpðλÞ ð41Þ
Eðf;ρÞ=1−ð1−ρÞ1=f ð36Þ Wecanrewritethisas
F=I+HðθÞ+HðλÞ ð42Þ
where ρ is resting oxygen extraction fraction. The free parameters
ofthemodelaretherateofsignaldecayineachregion,κi,andthe where
transit time in each region, τ. The other parameters are fixed to
i
γ=α=ρ=0.32. I=∫∫qðθjyÞqðλjyÞUðθ;λÞdθdλ ð43Þ
A.1.BOLDsignalmodel
andH(x)isthe(differential)entropyofxand
The Blood Oxygenation Level Dependent (BOLD) signal is then Uðθ;λÞ=logpðy;θ;λÞ ð44Þ
takentobeastaticnonlinearfunctionofvolumeanddeoxyhemoglo-
bin that comprises a volume-weighted sum of extra- and intra- ForaGaussiandensityp(x)=N(x;m,S)theentropyis
vascularsignals.ThisisbasedonasimplifiedapproachfromStephan
etal.(Stephanetal.,2007)(Eq.12)thatimprovesupontheearlier HðxÞ= 1ðklog2πe+logjSjÞ ð45Þ
model(Fristonetal.,2003) 2
(cid:4) (cid:6) (cid:7) (cid:5)
wherek=dim(S).Hence
q
y =V k ð1−qÞ+k 1− i +k ð1−vÞ
i 0 1 i 2 v 3 i
i 1
k1=4:3θ0ρTE ð37Þ F=I+2ðplog2πe+logjSθjÞ ð46Þ
kk2==1(cid:2)r−0ρ(cid:2)TE + 12ðhlog2πe+logjSλjÞ
3
where p is the number of parameters and h is the number of
whereV0isrestingbloodvolumefraction,θ0isthefrequencyoffsetat hyperparameters.TheVariationalLaplaceapproximationtotheFree
the outer surface of the magnetised vessel for fully deoxygenated Energyisthengivenby
bloodat1.5T,TEistheechotimeandr istheslopeoftherelation
0
b(Settewpeheannethteal.i,n2tr0a0v7a)s.cIunlatrhirseplaaxpaetriowneruatseetahnedstoaxnydgaerndpsaatruarmateitoenr FL=IL+12ðplog2πe+logjSθjÞ ð47Þ
valuesV =4,r =25,θ =40.3andforourfMRIimagingsequencewe 1
haveTE=0 0.04.0 0 +2ðhlog2πe+logjSλjÞ
Pleasecitethisarticleas:Penny,W.D.,ComparingDynamicCausalModelsusingAIC,BICandFreeEnergy,NeuroImage(2011),doi:10.1016/
j.neuroimage.2011.07.039
Description:describes a simulation study which compares the relative merits of three model
selection criteria (i) Akaike's. Information Criterion (AIC), (ii) the Bayesian