Syst. Biol. 63(3):368–382, 2014
© The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
For Permissions, please email: [email protected]
DOI: 10.1093/sysbio/syt108
Advance Access publication January 3, 2014
Another Look at the Root of the Angiosperms Reveals a Familiar Tale
BRYAN T. DREW1,2,∗, BRAD R. RUHFEL1,2,3, STEPHEN A. SMITH4, MICHAEL J. MOORE5, BARBARA G. BRIGGS6, MATTHEW A. GITZENDANNER1, PAMELA S. SOLTIS2, AND DOUGLAS E. SOLTIS1,2
1Department of Biology, University of Florida, Gainesville, FL 32611, USA; 2Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; 3Department of Biological Sciences, Eastern Kentucky University, Richmond, KY 40475, USA; 4Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48105, USA; 5Department of Biology, Oberlin College, Oberlin, OH 44074, USA; and 6National Herbarium of New South Wales, Botanic Gardens Trust, Sydney, NSW 2000, Australia
∗Correspondence to be sent to: Department of Biology, University of Florida, Gainesville, FL 32611-8525, USA; E-mail: [email protected]
Received 27 May 2013; reviews returned 19 July 2013; accepted 16 December 2013
Associate Editor: Vincent Savolainen
Abstract. Since the advent of molecular phylogenetics more than 25 years ago, a major goal of plant systematists has been to discern the root of the angiosperms. Although most studies indicate that Amborella trichopoda is sister to all remaining extant flowering plants, support for this position has varied with respect to both the sequence data sets and analyses employed. Recently, Goremykin et al. (2013) questioned the “Amborella-sister hypothesis” using a “noise-reduction” approach and reported a topology with Amborella + Nymphaeales (water lilies) sister to all remaining angiosperms. Through a series of analyses of both plastid genomes and mitochondrial genes, we continue to find mostly strong support for the Amborella-sister hypothesis and offer a rebuttal of Goremykin et al. (2013). The major tenet of Goremykin et al. is that the Amborella-sister position is determined by noisy data, that is, characters with high rates of change and lacking true phylogenetic signal. To investigate the signal in these noisy data further, we analyzed the discarded characters from their noise-reduced alignments. We recovered a tree identical to that of the currently accepted angiosperm framework, including the position of Amborella as sister to all other angiosperms, as well as all other major clades. Thus, the signal in the “noisy” data is consistent with that of our complete data sets, arguing against the use of their noise-reduction approach. We also determined that one of the alignments presented by Goremykin et al. yields results at odds with their central claim: their data set actually supports Amborella as sister to all other angiosperms, as do larger plastid data sets we present here that possess more complete taxon sampling both within the monocots and for angiosperms in general. Previous unpartitioned, multilocus analyses of mitochondrial DNA (mtDNA) data have provided the strongest support for Amborella + Nymphaeales as sister to other angiosperms. However, our analysis of third codon positions from mtDNA sequence data also supports the Amborella-sister hypothesis. Finally, we challenge the conclusion of Goremykin et al. that the first flowering plants were aquatic and herbaceous, reasserting that even if Amborella + water lilies, or water lilies alone, are sister to the rest of the angiosperms, the earliest angiosperms were not necessarily aquatic and/or herbaceous. [Angiosperms; Amborella; Nymphaeales; plastid genome; water lilies.]
Researchers have long sought to discern the root of flowering plants and, with it, the patterns of change in floral and vegetative characters. Prior to molecular systematics, it was generally accepted that taxa in the traditionally recognized subclass Magnoliidae (Cronquist 1981, 1988; Thorne 1992; Takhtajan 1997) represented ancestral (i.e., ‘primitive’) flowering plants. However, with the advent of molecular systematics over 25 years ago, definitively answering the question of which extant angiosperm lineage(s) are sister to the remainder finally became tractable. For example, molecular analyses have shown that the Magnoliidae of Cronquist and Takhtajan are polyphyletic. Moreover, molecular evidence, as well as DNA + morphology (Doyle and Endress 2000), strongly supports a placement of Amborellaceae, Nymphaeales, and Austrobaileyales as early-diverging groups of angiosperms, although the relative branching order of Amborellaceae and Nymphaeales (water lilies) has proven contentious. Most large-scale molecular phylogenetic studies (e.g., Parkinson et al. 1999; Qiu et al. 1999; Soltis et al. 1999, 2000; Zanis et al. 2002; Borsch et al. 2003; Hilu et al. 2003; Stefanović et al. 2004; Leebens-Mack et al. 2005; Qiu et al. 2005; Jansen et al. 2007; Moore et al. 2007; Moore et al. 2010; Graham and Iles 2009; Lee et al. 2011; Moore et al. 2011; Soltis et al. 2011; Zhang et al. 2012) from the past 15 years have found that Amborella alone is sister to all other angiosperms (the “Amborella-sister hypothesis”). However, internal support for this placement varies among studies, and some analyses support Amborella + Nymphaeales as sister to all other extant angiosperms (e.g., Barkman et al. 2000; Soltis et al. 2000; Goremykin et al. 2009b; Finet et al. 2010; Qiu et al. 2010; Wodniok et al. 2011; Goremykin et al. 2013).

In the early to mid-1990s, molecular phylogenetic studies using plastid DNA sequences (primarily the slowly evolving rbcL gene) were ambiguous with regard to the angiosperm root (Chase et al. 1993; Qiu et al. 1993). However, as additional plastid gene regions were added, plastid analyses generally supported Amborella alone as sister to all other extant angiosperms, followed by Nymphaeales and then Austrobaileyales (Borsch et al. 2003; Hilu et al. 2003; Soltis et al. 2011). These relationships have received strong support in analyses employing complete plastid genomes when sampling of angiosperms was adequate (Leebens-Mack et al. 2005; Cai et al. 2006; Jansen et al. 2007; Moore et al. 2007, 2010).

In contrast to the widespread use of mitochondrial DNA (mtDNA) in animals, the use of mtDNA in plant phylogenetics has been stymied by various issues such as complex genome structure (Palmer and Herbon 1988;
Andre et al. 1992), low nucleotide sequence variability (Palmer and Herbon 1988; Palmer 1992), and horizontal gene transfer (Bergthorsson et al. 2003; Brown 2003; Bergthorsson et al. 2004; Richardson and Palmer 2007; Keeling and Palmer 2008; Goremykin et al. 2009a; Renner and Bellot 2012; Xi et al. 2013). Despite these hindrances, mtDNA sequence data have been used in several deeper-scale phylogenetic studies, mostly in combination with plastid DNA and nuclear ribosomal DNA (nrDNA; Qiu et al. 1999; Barkman et al. 2000; Zanis et al. 2002; Qiu et al. 2005; Soltis et al. 2011), but also alone (Qiu et al. 2010). When mtDNA data have been used in concert with sequences from other genomic compartments, the combined data have usually supported the Amborella-sister hypothesis (e.g., Qiu et al. 1999; Zanis et al. 2002; Qiu et al. 2005; Soltis et al. 2011). In contrast, the recent study of Qiu et al. (2010), which included sequence data from 4 mtDNA genes for 380 seed plants, estimated (using total evidence without gene or codon partitioning) that Amborella + Nymphaeales form a clade that is sister to all other angiosperms.

Nuclear DNA sequences have not been widely used in large-scale phylogenetic studies of angiosperms to date, and until recently have largely been confined to the 18S and 26S nrDNA regions (Soltis et al. 1997, 1999; Qiu et al. 1999; Barkman et al. 2000; Soltis et al. 2000; Zanis et al. 2002; Qiu et al. 2005; Soltis et al. 2011) and phytochrome genes (Mathews and Donoghue 1999). Nearly all of these analyses found Amborella or Amborella + Nymphaeales as the sister to all other angiosperms. Recently, however, several large-scale phylogenetic studies have used low copy nuclear genes (Finet et al. 2010; Lee et al. 2011; Morton 2011; Wodniok et al. 2011; Zhang et al. 2012), with some supporting the Amborella-sister hypothesis (Lee et al. 2011, ML BS > 65%; Zhang et al. 2012, ML BS = 83%, posterior probability [PP] = 0.99) and others recovering Amborella + Nymphaeales as sister to all other extant angiosperms, although with BS support less than 50% in one case (Finet et al. 2010; ML BS = 49%) and with only 7 angiosperms sampled in the other (Wodniok et al. 2011).

Although most research has found either Amborella alone or Amborella + Nymphaeales as sister to all other angiosperms, a few DNA studies have suggested alternative rootings such as a topology that moderately supported Ceratophyllum as sister to all other angiosperms (e.g., Chase et al. 1993 [rbcL]; Savolainen et al. 2000 [rbcL + atpB]; Morton 2011 [xdh]), but these appear to be the result of spurious rooting based on rbcL (e.g., Chase et al. 1993), or low taxon density (Morton 2011). Additionally, some researchers have proposed that monocots may be sister to all other angiosperms (e.g., Burger 1981; Goremykin et al. 2003). Studies that have combined data from different organellar genomes and that have employed broad taxonomic sampling (e.g., Qiu et al. 1999; Zanis et al. 2002; Qiu et al. 2005; Soltis et al. 2011) have generally concluded that Amborella alone is sister to all other extant angiosperms, with the notable exception of Barkman et al. (2000), who employed the noise-reducing program RASA (relative apparent synapomorphy analysis; Lyons-Weiler et al. 1996) and found support for Amborella + Nymphaeales as sister to other angiosperms. However, the results of Barkman et al. (2000) varied depending on the method of analysis (e.g., parsimony vs. likelihood).

As a consensus emerged regarding the position of Amborella as sister to other extant angiosperms, researchers (e.g., Doyle and Endress 2000; Feild et al. 2000) sought to identify morphological features to corroborate the finding. A suite of morphological characters, including floral merosity, floral organization, phyllotaxy, perianth differentiation, stamen and carpel morphology, nodal anatomy, presence of vessel elements, and embryo sac formation, has been analyzed in light of the molecular angiosperm phylogeny, but no feature or combination of features unambiguously discriminates between Amborella alone vs. Amborella + Nymphaeales as sister to all other angiosperms (Bailey and Swamy 1948; Carlquist 1987; Doyle and Endress 2000; Feild et al. 2000; Herendeen and Miller 2000; Carlquist and Schneider 2001, 2002; Soltis et al. 2005; Doyle 2008; reviewed in Endress and Doyle 2009; Doyle 2012). In fact, many of these features consistently support Amborellaceae, Nymphaeales, and Austrobaileyales as sisters to all other angiosperms, but without providing clear insights into the branching order. Although some researchers favor a woody, rather than herbaceous, origin for angiosperms because all extant gymnosperms (the sister group of angiosperms) and nearly all early lineages of angiosperms are woody, the ancestral habit of the angiosperms remains equivocal in rigorous character-state analyses (e.g., Soltis et al. 2005, 2008a; Doyle 2012), regardless of whether Amborella alone or Amborella + Nymphaeales is placed as sister to all other living flowering plants. As a result, two hypotheses are currently advanced for the ancestral angiosperms based on these reconstructions: early angiosperms may have been either understory shrubs (“dark and disturbed”; Feild et al. 2004; Coiffard et al. 2007; Soltis et al. 2008a) or aquatic (“wet and wild”; Coiffard et al. 2007; Soltis et al. 2008a).

Angiosperms contain both ancient clades and recent radiations, and there is heterogeneity of evolutionary rates both among genes and among lineages (Bousquet et al. 1992; Qiu et al. 2000; Soltis et al. 2002). Some sites, when viewed across all angiosperms, are highly variable and others are nearly constant (Chase and Albert 1998; Olmstead et al. 1998; Soltis and Soltis 1998). Various methods have been developed to try to accommodate this heterogeneity and these highly variable sites (e.g., Barkman et al. 2000; Rokas et al. 2003; Delsuc et al. 2005; Parks et al. 2012; Rajan 2013). Recently, the effects of iteratively removing “saturated” nucleotide positions (“noise reduction”) in plastid DNA were investigated by Goremykin et al. (2009b, 2013). They used a novel method (described in detail in Goremykin et al. [2010, 2013]) to sort their alignment according to variability and then excluded the least conserved (most variable) characters before analyses. While insufficient taxon sampling has been shown to adversely affect results (e.g., Hillis 1996;
Zwickl and Hillis 2002; Soltis and Soltis 2004; Soltis et al. 2004; Stefanović et al. 2004; Heath et al. 2008), and increased sampling has been advocated as the solution, there is no consensus on how to handle variable sites. Although the effects of iteratively removing variable plastid DNA sites have been explored (e.g., Delsuc et al. 2005; Philippe et al. 2005; Regier and Zwick 2011; Rajan 2013), no general agreement has been reached as to methodology, or perhaps more importantly, how much and which data to remove. Furthermore, the impact on phylogenetic inference from sparse taxon sampling combined with iterative reduction of the most variable sites is even less clear.

Goremykin et al. (2013) reported that removing the most variable sites (2000 out of 40,553) in a data set of 31 taxa yields Amborella + Nymphaeales as sister to all other angiosperms (PP = 1.00). Moreover, they also claim that phylogenetic relationships throughout the tree are affected by the number of highly variable sites they remove (Goremykin et al. 2009b, 2013). That is, as characters from the “noisy” end of the alignment are iteratively removed, even otherwise well-supported monocot and eudicot relationships collapse. Their approach conflates character variability with loss of phylogenetic signal. We argue instead that the relationship among character variability, homoplasy and its distribution, and taxon sampling determines whether or not highly variable characters carry phylogenetic signal. Thus, the utility of characters for phylogenetic analysis cannot be determined a priori on the basis of character variability. It is especially noteworthy that although the explicit goal of Goremykin et al. (2009b, 2013) was to “reduce noise” in the data, they used nonvascular plants (Goremykin et al. 2009b) and Gnetales (Goremykin et al. 2013), a gymnosperm clade characterized by very long branches, as outgroups. The inclusion of these groups seemingly adds little, if anything, toward resolving the angiosperm root, but increases the potential for variation across the data set, possibly resulting in “noisy” characters across the matrix.

Determining the root of extant angiosperms is important not only because it will drive how we think about angiosperm evolution as a whole, but also because it will orient specific character-state reconstructions and thus permit inferences of ancestral states. In this article, we analyze a suite of data sets, including plastid DNA, mtDNA, and nuclear gene regions, to address the root of flowering plants. We re-examine some previously published studies (Qiu et al. 2010; Soltis et al. 2011; Goremykin et al. 2013) and also present several new data sets, including the largest complete plastid genome data set yet assembled for angiosperms. We then use these data sets to address the following questions: (i) Which angiosperm lineage is sister to all others? (ii) Do the noise-reduced data alignments presented by Goremykin et al. (2013) convincingly show that Nymphaeales, either alone or with Amborella, are sister to the remaining angiosperms? (iii) What are the implications of these phylogenetic analyses for a terrestrial versus aquatic origin of angiosperms?

MATERIALS AND METHODS

Genomic Sequencing, Taxon Sampling, and Sequence Alignment

The goals of this study were to ascertain the most likely root of flowering plants using plastid and mitochondrial DNA sequence data, and also to assess how taxon and character sampling affect this inference. We therefore designed a series of analyses and taxon sampling schemes to examine the effects that different codon partition analyses, taxon sampling, outgroup selection, and gene sampling would have on the inference of the angiosperm root. We also included, as did Goremykin et al. (2013), a recently recognized member of Nymphaeales, Trithuria (Hydatellaceae), in most of our analyses. Trithuria, formerly placed in Centrolepidaceae, one of 16 families of the monocot clade Poales, was shown to fall instead in Nymphaeales (Saarela et al. 2007). With the exception of Goremykin et al. (2013), Trithuria has not previously been included in genomic-scale plastid DNA phylogenetic studies. We obtained fresh plant material of Trithuria filamentosa (B.G. Briggs 9859; GenBank# KF696682) for whole plastome sequencing. Purified plastid DNA was isolated using sucrose gradient ultracentrifugation and amplified via rolling circle amplification (RCA) following the protocols in Moore et al. (2006, 2007). The RCA product was sequenced at the Interdisciplinary Center for Biotechnology Research (University of Florida) using the Genome Sequencer 20 System (GS 20; 454 Life Sciences, Branford, CT, USA), as outlined in Moore et al. (2006, 2007). Gaps between the contigs derived from 454 sequence assembly were bridged by designing custom primers near the ends of the GS 20 contigs for PCR and conventional capillary-based sequencing. The completed plastid genome was annotated using DOGMA (Wyman et al. 2004) and subsequently aligned to the other sequences using MAFFT v.6.859 (Katoh et al. 2002). The entire alignment was inspected in Mesquite (Maddison and Maddison 2011) to ensure that all nucleotide positions were in reading frame.

In total, coding plastome sequence data from 235 seed plants with complete or nearly complete plastid genome sequences were downloaded from GenBank. Only one exemplar per genus was downloaded; when multiple accessions from a genus were available, we chose the taxon with the most complete sequence data. Our sampling included virtually all angiosperm orders sensu APG III (2009) and up to 19 gymnosperm genera as outgroups. To investigate the effect of long branches and accompanying alignment uncertainty typically associated with Gnetales, these taxa were not included in some of our analyses. The first plastid DNA data set consisted of sequences for 235 taxa that were downloaded from GenBank. This alignment of
235 accessions (216 angiosperms and 19 gymnosperm outgroups) contained 78 plastid DNA genes and 58,218 characters. A second alignment of 236 taxa includes our previously unreported Trithuria filamentosa plastid genome with the above (59,944 aligned characters). Four additional plastid DNA alignments are as above except that we: (i) removed Gnetales accessions (78 genes, 233 taxa, 58,935 characters), (ii) removed Gnetales accessions and excluded rps16 and all ndh genes, which have been lost in many gymnosperms (Braukmann et al. 2009) (66 genes, 233 taxa, 48,222 characters), (iii) removed the gymnosperms that are missing ndh and rps16 genes (78 genes, 222 taxa, 58,860 characters), and (iv) excluded all characters except the ndh genes (11 genes, 177 taxa, 10,479 characters). For the latter alignment, we deleted all taxa that had more than 10% missing data, and the latter two alignments included only the five gymnosperms from our data set that have not lost ndh genes (Cycas taitungensis, Ginkgo biloba, Cephalotaxus wilsoniana, Taiwania cryptomerioides, and Cryptomeria japonica). Because ndh genes were not included in the analyses of Goremykin et al. (2013), we considered it important to investigate what effect, if any, their inclusion or exclusion had on the topology.

We attempted to run the Goremykin et al. (2010, 2013) NoiseReductor scripts several times on our full data set, as well as a reduced data set of 25 taxa and 56,838 characters. However, the Goremykin et al. (2010, 2013) scripts were never able to run to completion, requiring more than 16 GB of RAM and long run times, despite some attempts to modify the scripts to improve performance. In the end, we were unable to run Goremykin’s (2010, 2013) Perl scripts to completion on either the full or reduced data set, and we therefore substituted a new Perl script (Miao et al., unpublished data) that sorts characters according to the sum of pairwise distances (least variable sites at the beginning of the alignment, most variable sites at the end; see Supplementary Files S15 and S17 at http://datadryad.org, doi:10.5061/dryad.68n85). This method is similar to that used by Goremykin et al. (2010, 2013) to obtain their sorted alignments. We ran the Miao et al. script to sort two of our alignments: (i) the 236-taxon, 78-gene alignment and (ii) the 222-taxon, 78-gene alignment.

We also analyzed three plastid DNA alignments from Goremykin et al. (2013). First, we created an alignment using the most variable 2000 characters from the “observed variability” (OV) sorted alignment (S3 from Goremykin et al. [2013]; in Dryad). This corresponded to the final 2000 characters from their 40,553-character sorted alignment. Next, we used their 31,674-character alignment that maintained the reading frame (S4 from Goremykin et al. [2013]; from “A MUSCLE alignment of translated nucleotide sequences from 56 individual Fasta files”; in Dryad). This in-frame alignment was produced using a different approach than their “noise-reduction” method. Although no trees were shown from this alignment in their paper, Goremykin et al. (2013) stated: “similar analytical results were obtained for both alignments.” Third, we reanalyzed their 25,246-character alignment composed of first and second codon positions from the in-frame alignment (S2 from Goremykin et al. [2013]; in Dryad).

Furthermore, we reanalyzed the published mtDNA alignment of Qiu et al. (2010). Because the alignment used in their analysis was not in frame, we modified their published alignment so the data could also be partitioned by codon position. These alignment adjustments consisted of simple manual modifications such as moving characters at the beginning/end of indels and ensuring all character blocks and gaps occurred in multiples of three to maintain the reading frame. Our final mtDNA alignment had 356 taxa and 7752 characters. Each gene (atp1, matR, nad5, rps3) was also analyzed individually to see whether the genes exhibited different phylogenetic signal. We also analyzed an alignment that excluded Gnetales to investigate whether the long branches and alignment uncertainty associated with Gnetales would affect the root of the angiosperm phylogeny. This Gnetales-absent alignment had 354 taxa and 7701 characters.

Finally, we assessed per-site variation between alternative topologies by examining data from the 17-gene (nrDNA, plastid DNA, mtDNA) alignment from Soltis et al. (2011), both with mtDNA gene regions present and after their removal. Goremykin et al. (2013) questioned the results reported by Soltis et al. (2011), stating they “were unable to confirm this [Amborella sister to remaining angiosperms] finding,” instead inferring “a phylogenetic tree wherein a clade comprising Amborella, Trithuria and Nymphaeales received 94% non-parametric bootstrap support.”

Phylogenetic Analyses

Maximum likelihood (ML) analyses were conducted using a parallel version (RAxML-PTHREADS-SSE3) of RAxML v.7.3.0 (Stamatakis et al. 2005, 2008) as implemented on the high-performance computing cluster at the University of Florida. To assess the effects that different phylogenetic methods might have on our results, ML and Bayesian analyses were used to analyze the concatenated DNA alignment from Goremykin et al. (2013) that maintains the reading frame (Supplementary Data S4 from Goremykin et al. 2013). Bayesian analysis was performed in MrBayes v.3.1.2 (Huelsenbeck and Ronquist 2001) as implemented on the Cyberinfrastructure for Phylogenetic Research (CIPRES) cluster (http://www.phylo.org/; last accessed January 12, 2014). All analyses were run for 5 million generations with the covarion-like model option turned on and GTR + Γ as the model of evolution. The first 25% of trees were discarded as burnin, and gaps were treated as missing data in all the analyses. Convergence and mixing of two independent runs were assessed using Tracer 1.5 (Suchard et al. 2001; Rambaut and Drummond 2009). After removing burnin, effective sample size (ESS) values of the combined sump files were ∼2000, and
visual observation of the associated trace files indicated that the separate runs had converged. To assess whether our runs achieved convergence, we checked that the standard deviation of split frequencies fell below 0.01.

Alignments were analyzed using several partitioning schemes. For the six plastid DNA alignments assembled for this study, as well as the mtDNA alignment from Qiu et al. (2010), the data were partitioned and analyzed by codon position (first and second positions only [third positions excluded], third positions only [first and second positions excluded], all three codon positions included), gene, and codon position (different substitution rates were allowed for first, second, and third positions) + gene. For the reanalysis of the Goremykin et al. (2013) reduced-variability, in-frame data set, we analyzed the alignment by codon position (first and second positions only, third positions only, all three positions combined), but were unable to partition the alignment by gene because the boundaries of the gene regions are unknown. For the alignment featuring the 2000 most variable characters as specified by the noise reduction program of Goremykin et al. (2013; supplementary alignment S3), we conducted a single ML analysis that included all characters. No modifications were performed on the Goremykin et al. (2013) alignments downloaded from Dryad.

To assess the number of deleted variable characters necessary to achieve an Amborella + Nymphaeales topology, we iteratively removed the most variable characters from our two sorted alignments in 0.5% increments. This amounted to a removal of 295 characters per iteration in our 236-taxon alignment and 294 characters per iteration in the 222-taxon alignment. After removing the variable characters, we analyzed the reduced alignments in RAxML as described above. Furthermore, to investigate how our sorting script compared with the sorting scheme of Goremykin et al. (2013), we used our sorting script on the unsorted 40,553-character alignment from Goremykin et al. (2013) and followed their general approach of iteratively removing the most variable 250 characters. We then analyzed the reduced alignments in RAxML and compared the results from our sorting script to the results they obtained using their noise-reduction protocol. Files containing all of our original alignments and corresponding character partitions used for this study have been deposited at datadryad.org.

Another way to examine where the differences lie between the competing hypotheses of Amborella sister to Nymphaeales versus Amborella sister to all other angiosperms is to examine the differences in site likelihoods between trees constrained to support the different hypotheses (Evans et al. 2010; Smith et al. 2011). This analysis allows for the examination of patterns in the support between the two trees. With this comparison, the expectation is that the sum of the differences will equal the difference in total ln likelihood. To conduct these analyses, we constructed two constraint trees based on the alignment of Soltis et al. (2011): one with Amborella sister to the rest of the angiosperms and one with Amborella sister to Nymphaeales. We ran at least 10 ML analyses using these two constraint trees with RAxML v.7.2.8 using the GTR + Γ model (used for Figure 6c in Goremykin et al. 2013) of molecular evolution. For each constraint, we analyzed the data set with and without mtDNA data. Using the ML trees, we calculated the site likelihoods and the differences between the trees with Amborella sister to all other angiosperms and Amborella sister to Nymphaeales.

RESULTS

Most of the alignments used in this study contain ca. 200–350 taxa, and thus the resultant trees are difficult to view on a single page. Because the purpose of this study is to evaluate only a small portion of the larger trees (at the root of the angiosperms), the pertinent results from this study have been summarized in Table 1, and the relevant sections of the trees are summarized in Figures 1–4. The only results presented in this article not shown in Table 1 are from the 2000-character plastid DNA alignment from Goremykin et al. (2013), which was not analyzed in partitions. The individual trees are available as Supplementary Data at datadryad.org.

Plastid DNA Analyses

For all original plastid DNA alignments and partitions analyzed here, Amborella is sister to a clade of all other angiosperms in the ML trees (Fig. 1, Table 1). The unpartitioned analyses using all three codon positions and only third codon positions without partitioning by gene all found 100% bootstrap support for Amborella sister to all other angiosperms (i.e., all angiosperms except Amborella formed a clade with 100% BS), as did all the analyses of these positions partitioned by gene. Five of our six analyses that included just first and second codon positions supported Amborella as the sole sister to all other extant angiosperms (bootstrap values for all other angiosperms ranged from 66% to 100%). The 66-gene, 233-taxon alignment that excluded the ndh and rps16 genes was the lone outlier among our plastid DNA data partitions in that the bootstrap analysis of first and second codon positions weakly supported (BS = 57%) a clade of Amborella + Nymphaeales as sister to the remaining angiosperms. As previously mentioned, however, the single best ML tree shows Amborella sister to all other angiosperms (Fig. 1d). For all data alignments other than first and second codon positions, Amborella alone was sister to all remaining angiosperms, which formed a clade with 100% bootstrap support.

Our analysis of the in-frame plastid DNA alignment from Goremykin et al. (2013; S4 in Dryad) produced similar results to those noted above for the analyses of our own plastid DNA data sets. The ML trees and the majority-rule Bayesian consensus trees that we obtained from their data each showed Amborella alone as sister to all other angiosperms in all partitioned analyses
TABLE 1. Summary of Amborella placement relative to Nymphaeales and the rest of angiosperms

Alignment | Three codon positions | First and second codon positions | Third codon positions | Partitioned by gene | Partitioned by gene and codon
cpDNA: 235 taxa (no Trithuria), 78 genes, 58,218 chars | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 84% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
cpDNA: 236 taxa (w/ Trithuria), 78 genes, 58,950 chars | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 66% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
cpDNA: 233 taxa, 78 genes, 58,935 chars | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 76% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
cpDNA: 233 taxa, 66 genes, 48,222 chars | Ambo. sister; 100% BS for rest of angio | Ambo. + Nymphaeales 57% BS;** | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
cpDNA: 222 taxa, 78 genes, 58,860 chars | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
cpDNA: ndh genes, 177 taxa, 10,479 chars | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 76% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio | Ambo. sister; 100% BS for rest of angio
Goremykin et al. (2013) align. S4: ML; 31,674 chars | Ambo. sister; 97% BS for rest of angio | Ambo. sister; 53% BS for rest of angio;** | Ambo. sister; 98% BS for rest of angio | NA | NA
Goremykin et al. (2013) align. S4: BI; 31,674 chars | Ambo. sister; 1.00 PP for rest of angio | Ambo. sister; 0.92 PP for rest of angio | Ambo. sister; 1.00 PP for rest of angio | NA | NA
mtDNA: 356 taxa, 4 genes, 7752 chars | Ambo. + Nymphaeales 71% BS | Ambo. + Nymphaeales 89% BS | Ambo. sister; 66% BS for other angio | Ambo. + Nymphaeales 78% BS | Ambo. + Nymphaeales 75% BS
mtDNA: no Gnetales, 4 genes, 7752 chars | Ambo. + Nymphaeales 68% BS | Ambo. + Nymphaeales 79% BS | Ambo. sister; 78% BS for other angio | Ambo. + Nymphaeales 61% BS | Ambo. + Nymphaeales 72% BS

** Indicates Amborella is sister to angiosperms in best ML tree.
y
s
b
io
/a
(Figs. 2 and 3). Other than the discrepancy regarding the placement of Amborella, the trees we obtained were nearly identical in topology to the noise-reduced tree shown by Goremykin et al. (2013). The results from our analyses of the Goremykin et al. (2013) alignment varied from those obtained using our new plastid DNA data set and alignments in two major ways: (i) Support was lower in our analyses for the clade of angiosperms sister to Amborella, but still very high; ML BS = 97%, 98% and PP = 1.00 in analyses including all three codon positions and third codon positions, respectively, and (ii) Amborella was sister to all other angiosperms in all of our bootstrap analyses as well (albeit with only 53% bootstrap support for the data set of first and second codon positions). The analysis of the 2000 most variable plastid DNA characters identified by the Goremykin et al. (2013) noise-reduction protocol yielded a tree that was congruent with all other plastid DNA trees in our study (Fig. 4). Amborella was recovered as sister to all other angiosperms in the ML tree, with the clade of remaining angiosperms having 75% bootstrap support. The reanalysis of the 25,246-character alignment of only first and second codon positions yielded results similar to those found in Goremykin et al. (2013): the ML tree showed Nymphaeales as sister to all other angiosperms, with Amborella and Illicium as successive sisters to all remaining angiosperms. This relationship had less than 50% bootstrap support, however.

mtDNA Analyses

Our analyses of the mtDNA alignment of Qiu et al. (2010) generally yielded results similar to their published total-evidence study. A weakly to moderately supported clade (71%–89% bootstrap values) consisting of Amborella + Nymphaeales was sister to a clade containing all remaining angiosperms (Table 1). However, Qiu et al. (2010) did not examine data partitions. Although two partitioning schemes also supported Amborella + Nymphaeales (first and second codon positions, all three positions), the analysis that included only third codon positions found Amborella alone to be sister to all other angiosperms, which formed a weakly supported (BS = 66%) clade. Analyses of single mtDNA gene alignments yielded trees that were generally poorly resolved, especially within angiosperms. In the atp1 analysis, the ML tree found Nymphaeales (BS = 76%) as sister to all angiosperms, but branches were very short throughout the tree, and this relationship did not receive BS > 50%; Amborella formed a clade with Austrobaileyales (BS = 70%). The matR analysis found Amborella + Nymphaeales (BS = 87%) as
FIGURE 1. Best ML trees showing relationships of basal angiosperms. All alignments involve first and second codon positions only. a) 78-gene, 235-taxon plastid DNA alignment not including Trithuria. b) 78-gene, 236-taxon alignment including Trithuria. c) 78-gene, 233-taxon alignment that excludes Gnetales. d) 66-gene, 233-taxon alignment that excludes ndh and rps16 genes. e) 78-gene, 222-taxon alignment that excludes gymnosperms that have lost ndh genes. f) 11-gene, 177-taxon alignment including only ndh genes. ML bootstrap values indicated near nodes (complete trees available at datadryad.org).
sister to the remaining angiosperms (BS = 100%). The nad5 tree recovered a clade consisting of Amborella + Nymphaeales (BS = 83%) that was in turn sister to all other angiosperms (BS = 80%). The rps3 analysis found Amborella as sister to the remaining angiosperms, which received BS support of 64%.

Removing Gnetales from the mtDNA analyses reduced support for Amborella + Nymphaeales with all
FIGURE 2. Phylograms resulting from ML analyses of Goremykin et al. (2013; S4 in Dryad) in-frame alignment. a) first and second codon positions. b) third codon positions. c) all three codon positions.
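The codon-position data sets compared in these panels can be derived mechanically from an in-frame alignment. The following is a minimal sketch, assuming the alignment starts at codon position 1 and has a length that is a multiple of three; the function names and the toy sequences are hypothetical and are not taken from the study's own scripts.

```python
def split_codon_positions(aligned_seq: str) -> tuple[str, str]:
    """Split one in-frame aligned sequence into (1st+2nd positions, 3rd positions)."""
    # Indices 0 and 1 of each triplet are codon positions 1 and 2; index 2 is position 3.
    first_second = "".join(ch for i, ch in enumerate(aligned_seq) if i % 3 != 2)
    third = aligned_seq[2::3]
    return first_second, third


def partition_alignment(alignment: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Return two alignments: first+second codon positions, and third codon positions."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (length,) = lengths
    if length % 3 != 0:
        raise ValueError("an in-frame alignment must have a length divisible by 3")
    pos12, pos3 = {}, {}
    for name, seq in alignment.items():
        pos12[name], pos3[name] = split_codon_positions(seq)
    return pos12, pos3


# Hypothetical toy alignment (not real sequence data).
toy = {"Amborella": "ATGGCA---TTT", "Nuphar": "ATGGCGAAATTC"}
pos12, pos3 = partition_alignment(toy)
```

Gap characters are carried through unchanged, so the two derived alignments remain column-aligned across taxa.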
FIGURE 3. Phylograms resulting from Bayesian analyses of Goremykin et al. (2013; S4 in Dryad) in-frame alignment. a) first and second codon positions. b) third codon positions. c) all three codon positions.
FIGURE 4. Phylogram showing ML analysis of deleted characters from noise-reduced alignment in Goremykin et al. (2013) (final 2000 characters from alignment S3 in Dryad).
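The general idea of ranking alignment columns by variability and setting aside the most variable fraction can be sketched as follows. This is an illustration only: it uses per-column Shannon entropy as a stand-in variability measure, which is an assumption and is not the sorting criterion of Goremykin et al. (2010, 2013) or of the unpublished script used in this study. All function names are hypothetical.

```python
import math
from collections import Counter

GAP_CHARS = set("-?N")  # assumed symbols for gaps and missing data


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the non-gap states in one alignment column."""
    states = [c for c in column.upper() if c not in GAP_CHARS]
    if not states:
        return 0.0
    total = len(states)
    return -sum((n / total) * math.log2(n / total) for n in Counter(states).values())


def sort_sites_by_variability(sequences: list[str]) -> list[int]:
    """Return column indices ordered from least to most variable (stable for ties)."""
    n_sites = len(sequences[0])
    if any(len(s) != n_sites for s in sequences):
        raise ValueError("all sequences must have the same aligned length")
    columns = ["".join(s[i] for s in sequences) for i in range(n_sites)]
    return sorted(range(n_sites), key=lambda i: column_entropy(columns[i]))


def drop_most_variable(sequences: list[str], fraction: float) -> tuple[list[int], list[int]]:
    """Split column indices into (kept, removed), removing the most variable fraction."""
    order = sort_sites_by_variability(sequences)
    n_remove = round(fraction * len(order))
    cut = len(order) - n_remove
    return sorted(order[:cut]), sorted(order[cut:])


# Hypothetical toy alignment: columns 0 and 1 are invariant, column 3 is the most variable.
toy = ["AAAA", "AAAC", "AACG", "AACT"]
kept, removed = drop_most_variable(toy, 0.25)  # kept == [0, 1, 2], removed == [3]
```

As a consistency check on the reported counts, `round(0.22 * 58944)` gives 12968 and `round(0.35 * 58860)` gives 20601, matching the character totals given in the text for the 236-taxon and 222-taxon alignments.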
codon positions included (BS = 68% vs. BS = 71% with Gnetales), first and second codon positions (BS = 79% vs. BS = 89%), gene-partitioned (BS = 61% vs. BS = 78%), and gene- and codon-partitioned (BS = 72% vs. BS = 75%) alignments and increased support for the topology of Amborella sister to all other angiosperms (BS = 78% vs. BS = 66%) in the analysis of the third codon position alignment (Table 1).

Analyses of Variability-Sorted Alignments

Using our variability-sorted, 236-taxon, 58,944-character alignment, it was necessary to remove 22% of the most variable positions (12,968 characters) to recover a topology showing Amborella + Nymphaeales as sister to the remaining angiosperms (S16 in Dryad). Using our sorted 222-taxon, 58,860-character alignment, 35% of the most variable positions (20,601 characters) needed to be deleted to achieve an Amborella + Nymphaeales clade (S18 in Dryad). In analyses of both data sets, monophyly of recognized clades (e.g., monocots, eudicots) broke down after removal of about 10% (∼5900) of the most variable characters, and at the point at which an Amborella + Nymphaeales clade was recovered, virtually no structure remained in the backbone of either tree. After applying our variability sorting script to the 40,553-character alignment from Goremykin et al. (2013), we needed to remove 1750 of the most variable characters to find Amborella + Nymphaeales as sister to the remaining angiosperms, similar to the findings of Goremykin et al. (2013), who needed to remove 1250 sites to find this topology. This result suggests that the sorting script used here (Miao et al., unpublished data) behaves similarly to the sorting script used by Goremykin et al. (2010, 2013).

Analysis of Per-Site Likelihoods

The per-site likelihood results for analyses including mitochondrial genes found that the topology with Amborella sister to all other angiosperms is 19.4 log likelihood units better than a topology with Amborella sister to Nymphaeales (Supplementary Fig. S1). The results for analyses without mtDNA sequence data show that Amborella as sister to the rest of the angiosperms is favored by 8.4 log likelihood units (Supplementary