Phylotranscriptomics Reveals Discordance in the Phylogeny of
Hawaiian Drosophila and Scaptomyza (Diptera: Drosophilidae)
Samuel H. Church*,1,2 and Cassandra G. Extavour1,3,4
1 Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
2 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
3 Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
4 Howard Hughes Medical Institute, Chevy Chase, MD, USA
*Corresponding author: E-mail: [email protected].
Associate editor: Michael Rosenberg
Abstract
Island radiations present natural laboratories for studying the evolutionary process. The Hawaiian Drosophilidae are one such radiation, with nearly 600 described species and substantial morphological and ecological diversification. These species are largely divided into a few major clades, but the relationships between clades remain uncertain. Here, we present new assembled transcriptomes from 12 species across these clades, and use these transcriptomes to resolve the base of the evolutionary radiation. We recover a new hypothesis for the relationship between clades, and demonstrate its support over previously published hypotheses. We then use the evolutionary radiation to explore dynamics of concordance in phylogenetic support, by analyzing the gene and site concordance factors for every possible topological combination of major groups. We show that high bootstrap values mask low evolutionary concordance, and we demonstrate that the most likely topology is distinct from the topology with the highest support across gene trees and from the topology with highest support across sites. We then combine all previously published genetic data for the group to estimate a time-calibrated tree for over 300 species of drosophilids. Finally, we digitize dozens of published Hawaiian Drosophilidae descriptions, and use this to pinpoint probable evolutionary shifts in reproductive ecology as well as body, wing, and egg size. We show that by examining the entire landscape of tree and trait space, we can gain a more complete understanding of how evolutionary dynamics play out across an island radiation.
Keywords: Drosophila, phylogenetics, discordance, Hawaii, Diptera.
© The Author(s) 2022. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Mol. Biol. Evol. 39(3):msac012 doi:10.1093/molbev/msac012 Advance Access publication January 20, 2022

Introduction
In the era of genome-scale data, we have an opportunity to unpack the biological meaning of phylogenetic support. In phylogenetic analyses that seek to discover the relationships between organisms, support is often defined as the proportion of information that favors a particular branch in an evolutionary tree (Simon 2020). Methods have been developed that emphasize extracting the tree with the greatest amount of support from out of an otherwise rugged landscape of treespace (Swofford et al. 1996; Hedtke et al. 2006). However, a growing number of studies have emphasized the biological relevance of that landscape to our understanding of the evolutionary process (Kellogg et al. 1996; Baum 2007; Smith et al. 2015). For example, many new studies have contributed evidence that, even with trees with high measures of conventional support, we can expect large amounts of discordance among sites and genes, especially when examining speciation events with short internodes or with a likelihood of introgression (Weisrock et al. 2012; Pease et al. 2016). Here, we use the island radiation of Hawaiian drosophilid flies to study the landscape of treespace, and show that the relationships between the major groups of these flies are best understood by using methods that embrace evolutionary discordance.
The Hawaiian Drosophila have a long history as a model clade for the implementation of phylogenetic methods (O’Grady and DeSalle 2018). More than 20 years ago, Baker and DeSalle (1997) used the Hawaiian radiation of Drosophila to perform one of the first analyses to demonstrate incongruence between an overall species tree and underlying gene trees. Their study focused on the resolution between major clades of Hawaiian Drosophila and built on the landmark work done by Carson in the 1970s inferring the phylogeny of a subgroup of Hawaiian Drosophila, the picture-wing flies, based on the banding pattern of polytene chromosomes (Carson and Kaneshiro 1976), among other early phylogenetic studies (Beverley and Wilson 1985; Thomas and Hunt 1991). During the past 20 years, the relationships between major groups have been revisited several times (Kambysellis et al. 1995; Bonacum 2001). Most recently, O’Grady et al. (2011) used mitochondrial genes and expanded taxon sampling, and Magnacca and Price (2015) used an expanded nuclear gene set. The study presented here builds on this foundational
work, presenting the first phylogenetic analysis of genome-scale data for the group.
The Hawaiian Drosophilidae consist of 566 described species (O’Grady et al. 2010; Magnacca and Price 2012), with hundreds more estimated to be awaiting description (O’Grady et al. 2010). These species have been divided into the following major clades (O’Grady et al. 2010): 1) the picture-wing, nudidrosophila, ateledrosophila (PNA) clade, which has served as a model clade for the study of sexual selection (Kaneshiro and Boake 1987) and speciation (Kang et al. 2016); 2) the antopocerus, modified-tarsus, ciliated-tarsus (AMC) clade, first proposed by Heed (1968) and O’Grady et al. (2010) and confirmed by subsequent phylogenetic studies (O’Grady et al. 2011; Lapoint et al. 2014); 3) the modified-mouthparts (MM) clade; and 4) the haleakalae clade, an enigmatic group in need of further study (Hardy et al. 2001). Several other smaller clades have been suggested as falling outside of these major groups, including the rustica group of three species (O’Grady et al. 2001), and the monotypic lineages of D. primaeva and D. adventitia. The position of D. primaeva has been somewhat uncertain, but several studies have suggested it is the sister taxon to picture-wing flies (Bonacum 2001), including the work on polytene chromosomes by Carson and Stalker (1969). The species D. adventitia was originally suggested to be part of the MM clade (Hardy 1965), but recent studies placed it as the sister taxon to D. primaeva (Bonacum 2001) or possibly other major clades. Additionally, the Hawaiian Drosophila are the sister clade of the genus Scaptomyza, which is nested within the broader paraphyletic genus Drosophila and is hypothesized to have colonized the island independently (Throckmorton 1966; Lapoint et al. 2013), possibly more than once (Katoh et al. 2017). Throughout this manuscript, we use Hawaiian Drosophila to refer to non-Scaptomyza Hawaiian species, and Hawaiian Drosophilidae to refer to the clade of Hawaiian Drosophila+Scaptomyza.
Many phylogenetic studies have confirmed the monophyly of each of these clades and provided resolution for internal relationships (PNA, Bonacum et al. 2005; Magnacca and Price 2015; AMC, Lapoint et al. 2011, 2014; haleakalae, O’Grady and Zilversmit 2004; and Scaptomyza, Lapoint et al. 2013; Katoh et al. 2017). Previous phylogenetic studies, however, have not resulted in a consensus relationship between the major clades within Hawaiian Drosophila (supplementary fig. S1, Supplementary Material online) (Magnacca and Price 2015). Magnacca and Price (2015) showed that different phylogenetic methods of analysis (e.g., using software based on Bayesian statistics rather than maximum likelihood for inference) produced highly incongruent topologies (fig. 1). In that study, the most likely topology had D. primaeva as the sister taxon to all other Hawaiian Drosophila, and included a clade uniting MM+AMC+haleakalae, with the haleakalae clade showing greater affinity to AMC species relative to MM species (fig. 1B). This topology was consistent with the tree suggested by O’Grady et al. (2011) analyzing mitochondrial data and using maximum likelihood and Bayesian analyses. However, the analyses of Magnacca and Price (2015) using the Bayesian software package BEAST showed an alternative relationship, with haleakalae flies as the sister clade to all other Hawaiian Drosophila, a clade uniting MM+PNA+D. primaeva, and closer affinity between D. primaeva and PNA species than between D. primaeva and MM species (fig. 1C). This latter arrangement is largely consistent with relationships proposed by Throckmorton (1966) and reiterated in several subsequent studies (supplementary fig. S1, Supplementary Material online) (Kambysellis et al. 1995; Baker and DeSalle 1997; Bonacum 2001).
Resolving these relationships is critical for our understanding of the morphological and ecological evolution of these flies (Kambysellis et al. 1995; Bonacum 2001; O’Grady et al. 2011). Hawaiian Drosophila demonstrate a large diversity in body size (Stevenson et al. 1995), wing size (Edwards et al. 2007), and egg size (Montague et al. 1981); in the number and position of structural features such as wing spots (Edwards et al. 2007); in the number of egg-producing units in the ovary (ovarioles) (Kambysellis and Heed 1971; Sarikaya et al. 2019); and in the type of substrate used for oviposition and larval feeding (Kambysellis et al. 1995; Magnacca et al. 2008). Some clades demonstrate unique suites of morphological and behavioral traits, whose evolutionary history is unclear because of uncertainties in the phylogeny. For example, the haleakalae flies exclusively use fungal oviposition substrates and are considered to have less complex mating behaviors than other, more well-studied groups (e.g., picture-wing flies) (Hardy et al. 2001). It is unclear whether this suite of traits represents a secondary transition relative to the ancestral state, because it is not known whether haleakalae flies are the sister clade to all other Hawaiian Drosophila or nested within the radiation. Resolution in the relationships at the base of this lineage will be key in identifying which branches experienced substantial trait diversification, and especially in identifying whether any of these traits demonstrate predictable patterns of coevolution.
Here, we present the first phylogenomic relationships between the major groups of Hawaiian Drosophilidae. We combine 12 new transcriptomes sequenced in this study with recently published genomes for two Hawaiian Drosophila species (Kim et al. 2021), four non-Hawaiian Scaptomyza (Kim et al. 2021), and six outgroup species (Larkin et al. 2021). By increasing the number of genes used to infer relationships, we begin to unpack the evolutionary history in the short internodes at the base of the Hawaiian Drosophila radiation. Following up on the critical study by Baker and DeSalle (1997) 25 years ago, we explore the landscape of treespace and the discordance between species and gene trees using our phylotranscriptomic data set. We then use the results of our analysis as initial constraints on subsequent phylogenetic analyses using a data set of 316 species and 44 genes, compiled using all previous phylogenetic studies of Hawaiian Drosophilidae. Finally, we estimate the age of the radiation, and use this time-calibrated tree to identify branches where shifts in trait evolution likely occurred. Our findings suggest a relationship between major clades that is distinct from both previously hypothesized topologies, and that is well supported by both maximum likelihood and Bayesian analyses.
FIG. 1. Phylotranscriptomic analysis indicates relationships between major clades distinct from those previously hypothesized. Photos show six of the 12 species with de novo transcriptomes presented in this study, listing their parent clade and the Hawaiian island on which they are found. (A) Results novel to this study, showing best supported tree across maximum likelihood and Bayesian analyses. Node labels indicate ultrafast bootstrap values, gene tree concordance factors (gCF), and site concordance factors (sCF), see concordance factor analysis below. Drosophila adventitia was not present in phylotranscriptomic analyses; see figure 3 for information on its placement. (B, C) Previously hypothesized relationships between the picture wing-nudidrosophila-ateledrosophila (PNA), modified-mouthparts (MM), antopocerus-modified tarsus-ciliated tarsus (AMC), haleakalae, and Scaptomyza clades, as well as two monotypic clades, D. primaeva and D. adventitia. Topology B was recovered in O’Grady et al. (2011) and Magnacca and Price (2015). Topology C was recovered using the Bayesian software BEAST in Magnacca and Price (2015), showing incongruent relationships between clades at the base of the radiation of Hawaiian Drosophila.
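The gene concordance factor (gCF) that appears in these node labels can be illustrated with a toy calculation. The sketch below is only an illustration in Python, not the IQtree implementation: it assumes every gene tree has already been reduced to the set of clades it contains, it uses clade names as stand-ins for single tips, and the rule for deciding whether a gene tree is informative for a branch (at least two sampled taxa inside the clade and one outside) is a simplifying assumption. All names and the four toy gene trees are hypothetical.

```python
import math


def gene_concordance_factor(clade, gene_trees):
    """Percentage of informative gene trees that contain `clade`.

    `gene_trees` is a list of (sampled_taxa, clades) pairs, where `clades`
    is the set of taxon groupings (frozensets) found in that gene tree.
    """
    informative = 0
    concordant = 0
    for sampled_taxa, clades in gene_trees:
        inside = clade & sampled_taxa
        outside = sampled_taxa - clade
        # Assumed rule: a gene tree can only speak to this branch if it
        # samples at least two taxa inside the clade and one outside it.
        if len(inside) < 2 or not outside:
            continue
        informative += 1
        if inside in clades:
            concordant += 1
    if informative == 0:
        return math.nan
    return 100.0 * concordant / informative


ALL_TAXA = frozenset({"PNA", "primaeva", "MM", "AMC", "haleakalae", "outgroup"})
BRANCH = frozenset({"MM", "AMC"})

# Four hypothetical gene trees, each reduced to the clades it contains.
TOY_GENE_TREES = [
    (ALL_TAXA, {frozenset({"MM", "AMC"}), frozenset({"MM", "AMC", "haleakalae"})}),
    (ALL_TAXA, {frozenset({"MM", "haleakalae"}), frozenset({"MM", "AMC", "haleakalae"})}),
    # AMC was not sampled for this gene, so it is uninformative for MM+AMC.
    (ALL_TAXA - {"AMC"}, {frozenset({"MM", "haleakalae"})}),
    (ALL_TAXA, {frozenset({"MM", "AMC"}), frozenset({"PNA", "primaeva"})}),
]

TOY_GCF = gene_concordance_factor(BRANCH, TOY_GENE_TREES)
print(f"gCF for MM+AMC in the toy data: {TOY_GCF:.1f}")
```

In this toy set, two of the three informative gene trees contain MM+AMC, so the value is about 66.7. The point of the sketch is that a branch can be present in the best tree while only a fraction of the individual gene trees agree with it.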
We show that examining a comprehensive landscape of tree and trait space can allow for a more complete understanding of evolutionary dynamics in this remarkable island radiation.

Results
Phylotranscriptomics Suggest a New Phylogeny of Hawaiian Drosophilidae
Using a phylotranscriptomic approach, we recovered a new topology between the major clades of Hawaiian Drosophilidae, distinct from those previously hypothesized (fig. 1 and supplementary fig. S1, Supplementary Material online). This topology was the most likely tree estimated using IQtree (Minh, Schmidt, et al. 2020) and RAxML (Stamatakis 2014), as well as the consensus tree with highest posterior probability estimated using PhyloBayes (Lartillot et al. 2013) (fig. 1A and supplementary figs. S2 and S3, Supplementary Material online). Bootstrap support for all branches was 100 and posterior probability was 1, with the exception of the branch subtending the clade uniting MM+AMC (IQtree ultrafast bootstrap: 66, RAxML bootstrap: 57, PhyloBayes posterior probability: 0.52). We also estimated the phylogeny using a multispecies coalescent model with ASTRAL (Zhang et al. 2018), and recovered the same topology with the exception of the placement of D. primaeva (as the sister taxon to PNA, supplementary fig. S4, Supplementary Material online). Each of these analyses
was performed on a supermatrix of 10,949 putatively orthologous genes, aligned and assembled using the agalma pipeline (Dunn et al. 2013) with no filtering based on occupancy (actual gene occupancy was 41.7%). To test the sensitivity of our results to missing data, we repeated the IQtree analysis on a data set reduced using an occupancy threshold that ensures representation of 80% of taxa at each gene (1,926 genes), and recovered the same topology as with the full set of genes (supplementary fig. S5, Supplementary Material online).
The most likely tree indicates that the PNA clade, including picture-wing species, is the sister clade to all other Hawaiian Drosophila. Drosophila primaeva is found to be the sister taxon to a clade containing non-PNA Hawaiian Drosophila, though this clade received lower support when using the data set reduced by occupancy (supplementary fig. S5, Supplementary Material online, ultrafast bootstrap of 85). A second monotypic lineage, D. adventitia, was not sampled for phylotranscriptomic analyses, but using specific gene markers, we recover this as the sister taxon to a clade including MM+AMC+haleakalae (see section on expanded phylogenetic analysis below). This latter clade was recovered in previous phylogenetic analyses (O’Grady et al. 2011; Magnacca and Price 2015). In contrast to those studies, which suggested a monophyletic clade of AMC+haleakalae, we do not recover sufficient support for any particular arrangement of MM, AMC, and haleakalae (ultrafast bootstrap from both the full and reduced occupancy matrix is <95).
We tested the most likely tree emerging from our analysis (fig. 1A) against two previously suggested alternative hypotheses (fig. 1B and C) using the Swofford–Olsen–Waddell–Hillis (SOWH) test (Swofford et al. 1996), a parametric bootstrap approach for comparing phylogenetic hypotheses. In both cases, the difference in likelihood between the most likely tree and these alternatives was larger than we would expect by chance (P value for both <0.01, with a sample size of 100). Between figure 1A and B the difference in log-likelihood was 1,774.1, and between figure 1A and C was 6,132.1, whereas the null distribution according to the SOWH test had no differences greater than 15 for either comparison. Taken together, our results suggest a new phylogeny for Hawaiian Drosophilidae relationships wherein MM, AMC, and haleakalae represent a monophyletic group, and the PNA clade, rather than either the haleakalae clade or D. primaeva, is the sister clade to all others (fig. 1A).

Identifying Hotspots of Gene and Site Concordance in Treespace
We analyzed the strength of phylogenetic concordance in our phylotranscriptomic data set by estimating the gene and site concordance factors for each branch in our tree. Gene concordance factors (gCF) are calculated as the proportion of informative gene trees that contain a given branch between taxa, and can range from 0 to 100 (Baum 2007; Minh, Hahn, et al. 2020). Site concordance factors (sCF) are calculated as the average proportion of informative sites that support a given branch between taxa. Because one site can only support one of three arrangements for a quartet of taxa, sCF typically ranges from ~33.3 to 100, with 33.3 representing our null expectation based on chance (Minh, Hahn, et al. 2020). We found that for many branches in our tree both gCF and sCF are high, indicating these relationships are supported by a majority of genes and sites in our data set. For example, the branch uniting Hawaiian Drosophila has a gCF of 91.2, and sCF of 72.1 (fig. 1A). However, for the branches subtending most relationships between the major clades of Hawaiian Drosophila, gCF and sCF are low. For example, the branch uniting D. primaeva+MM+AMC+haleakalae to the exclusion of PNA has a bootstrap value of 100, but a gCF of 19.3 and sCF of 32.2.
We also tested the extent to which potential error in multiple sequence alignment affected concordance values by filtering out poorly aligned sequence fragments, and repeating the tree inference and concordance analyses. After filtering poorly aligned sequences, we recovered the same topology as figure 1A, with the exception of the arrangement of MM, AMC, and haleakalae, here showing MM+haleakalae as monophyletic (IQtree ultrafast bootstrap: 97, supplementary fig. S6, Supplementary Material online). Concordance factors between analyses on filtered and nonfiltered data were nearly equivalent (e.g., the branch separating PNA from the other Hawaiian Drosophila received a gCF of 19.0 and sCF of 31.6 after filtering, and a gCF of 19.3 and sCF of 32.2 using all data). Furthermore, using a series of stringency thresholds to filter the data, we observed no pattern of increasing or decreasing concordance factors across branches (supplementary fig. S7, Supplementary Material online). These results suggest that values of discordance in this phylogeny are not artificially inflated due to technical errors from the alignment step.
We interpret the measures of discordance as reflecting real variation in the phylogenetic signal of different genes and sites, which is not unexpected for a radiation such as this with short internodes subtending major clades (Minh, Hahn, et al. 2020). Furthermore, the presence of discordance does not mean that there is little that can be said about the relationships between these groups. In contrast, by unpacking this discordance, we can begin to qualitatively describe the amount and distribution of phylogenetic signal for multiple alternative, plausible bipartitions.
To this end, we first visualized hotspots of concordance across treespace (fig. 2). We created all 105 topological combinations of the possible arrangements between major clades, and then re-estimated gCF and sCF for each. Visualizing the mean values for gCF and sCF plotted in treespace shows that the most likely tree, as estimated with IQtree, is not the tree with the highest mean gCF and sCF, but it is near a hotspot of alternative arrangements for which both of these values are high (fig. 2, treespace, most likely tree indicated by dark red outline). In contrast to the most likely topology, the trees with the top three mean gCF values and two of the three trees with the top mean sCF values unite D. primaeva+PNA to the exclusion of other Hawaiian Drosophila. Variation between these top trees largely depends on the placement of haleakalae relative to other clades (fig. 2, top gCF and sCF trees).
Calculating the mean gCF and sCF across branches may not always provide an informative metric, given that some topologies may contain one highly supported branch and
FIG. 2. The landscape of treespace shows hotspots of concordance among genes and sites. The landscape of treespace for all possible topological combinations of the five clades of Hawaiian Drosophila studied here: PNA, D. primaeva, haleakalae, MM, and AMC. Individual points represent different arrangements of the five clades, labeled randomly with two-letter IDs from aa through ea. The distance between points indicates tree similarity (calculated from Robinson–Foulds distances). The size of points represents the mean gene concordance factor (gCF) across relevant branches, and the color represents the mean site concordance factor (sCF; purple, low; yellow, high). The point outlined in red (tree dm) indicates the best topology found with IQtree, RAxML, and PhyloBayes, which is distinct from the top trees according to mean gCF (av, ah, and ap) or mean sCF (av, az, and cf). Concordance measurements for all topologies are available, see Materials and Methods and Data Availability.
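The size of this treespace and the distances between its points can be reproduced with a short, self-contained sketch. The Python below is an illustration under stated assumptions rather than the code used for the figure: it enumerates rooted binary topologies by attaching one clade at a time to every edge, assigns two-letter IDs in enumeration order (the IDs in the figure were assigned randomly, so these will not match), and measures tree similarity with a rooted form of the Robinson–Foulds distance, counting clades found in exactly one of two trees. The function names are hypothetical.

```python
from itertools import product
from string import ascii_lowercase

CLADES = ["PNA", "D. primaeva", "haleakalae", "MM", "AMC"]


def add_leaf(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to one edge of `tree`."""
    # Attach on the edge directly above this node (the root edge at top level).
    yield (tree, leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in add_leaf(left, leaf):
            yield (new_left, right)
        for new_right in add_leaf(right, leaf):
            yield (left, new_right)


def all_rooted_topologies(leaves):
    """Enumerate all rooted binary topologies for the given leaf labels."""
    trees = [leaves[0]]
    for leaf in leaves[1:]:
        trees = [new for tree in trees for new in add_leaf(tree, leaf)]
    return trees


def clade_set(tree):
    """Return the leaf sets of all internal nodes (including the root)."""
    found = set()

    def leaves_below(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = leaves_below(node[0]) | leaves_below(node[1])
        found.add(leaves)
        return leaves

    leaves_below(tree)
    return frozenset(found)


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clade_set(tree_a) ^ clade_set(tree_b))


TOPOLOGIES = all_rooted_topologies(CLADES)
# Two-letter IDs in enumeration order: aa, ab, ..., az, ba, ..., ea.
TREE_IDS = ["".join(pair) for pair in product(ascii_lowercase, repeat=2)][: len(TOPOLOGIES)]

print(len(TOPOLOGIES), TREE_IDS[0], TREE_IDS[-1])
```

Five clades give 1 x 3 x 5 x 7 = 105 rooted topologies, which is why the two-letter labels run from aa to ea. A matrix of pairwise `rf_distance` values over `TOPOLOGIES` is the kind of input a two-dimensional embedding of treespace would start from; the embedding step itself is not sketched here.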
others with very low support. Therefore, we also analyzed concordance for all the unique bipartitions across the set of possible topologies (supplementary figs. S8 and S9, Supplementary Material online; see supplementary concordance factor analysis, Supplementary Material online). We found that for gCF, there is clear signal supporting bipartitions that unite D. primaeva+PNA, as well as those that unite MM+AMC+haleakalae (supplementary fig. S8, Supplementary Material online). We found that for sCF, concordance values across bipartitions are more variable, but those that unite PNA+haleakalae show less support than we might expect by chance, whereas those that unite D. primaeva+PNA and AMC+MM show more support (supplementary fig. S9, Supplementary Material online). In addition, between gCF and sCF, we found conflicting signals for bipartitions that define one clade as sister to the rest of Hawaiian Drosophila, with gCF indicating support for PNA (consistent with the most likely topology), and sCF indicating support for haleakalae.
To investigate the source of discordance across genes, we performed a likelihood mapping analysis that assesses the phylogenetic information in each gene (Strimmer and Von Haeseler 1997; Minh, Schmidt, et al. 2020). This analysis calculates the likelihood support for the three possible arrangements of each quartet of taxa in an alignment, and then counts the number of informative quartets that strongly support one arrangement over the other two (Strimmer and Von Haeseler 1997). Here, we performed likelihood mapping on each of the 10,949 genes in our full data set, resulting in 2,182 quartets relevant to the position of the PNA group and 3,075 quartets relevant to the position of the haleakalae group. Our results showed that, in both cases, the vast majority of quartets are uninformative, with no strong likelihood support for any one arrangement (supplementary fig. S10, Supplementary Material online). Although support was for the most part evenly divided among possible arrangements, we observed more quartets that unite haleakalae+MM, to the exclusion of AMC, PNA, and D. primaeva, as well as more quartets that unite D. primaeva+PNA to the exclusion of other Hawaiian Drosophilidae, than other arrangements (supplementary fig. S10A and C, Supplementary Material online). We also tested whether genes that support one topology over another at these nodes were enriched for either long or short genes, or fast or slow-evolving genes. Our results showed no correlation
between gene tree topology and gene length or evolutionary conservation (supplementary fig. S11, Supplementary Material online), with the exception of the placement of D. primaeva as the sister to all other Hawaiian Drosophila, which was supported by somewhat slower-evolving genes than alternative arrangements (supplementary fig. S11D, Supplementary Material online).
In summary, across all analyses, we found consistent evidence for a bipartition that separates PNA and a clade that includes MM and AMC. Although the placement of D. primaeva with MM+AMC+haleakalae received strong support in our maximum likelihood and Bayesian analyses, we observed substantial discordance in this arrangement, and detected signal suggesting a significant amount of shared history between D. primaeva and PNA. Similarly, although the clade uniting MM+AMC+haleakalae received strong bootstrap support, we observed substantial discordance in the placement of haleakalae, and suggest that further resolution in its placement will be possible with additional taxon sampling in that clade.

Calibrating an Expanded Phylogeny to Time
Building on the phylotranscriptomic analyses above, we collected all publicly available genomic and transcriptomic data for species from Hawaiian Drosophila and Scaptomyza. These data were accessioned in nine analyses published since 1997, most of which focused on resolving the phylogenetic relationships within a major clade (Baker and DeSalle 1997; O’Grady and Zilversmit 2004; Bonacum et al. 2005; Lapoint et al. 2011, 2013, 2014; Magnacca and Price 2015; Katoh et al. 2017). The data set we compiled contained 44 genes (six mitochondrial and 38 nuclear) from 316 species (including 271 described and 45 undescribed putative species), with an overall occupancy of 17.3% (supplementary fig. S12, Supplementary Material online). We used this data set to infer the phylogeny with IQtree, constraining the relationships between major clades to conform to the topology shown in figure 1A.
The resulting topology is to our knowledge the most species-rich phylogenetic tree of the Hawaiian Drosophilidae to date (supplementary fig. S13, Supplementary Material online). Several support values are low (ultrafast bootstrap <95), especially for nodes near the base of the radiation. However, this is not unexpected, given that this phylogeny is estimated primarily from the same data previously analyzed, which recovered alternative relationships at those nodes. Of note are the low support values for the relationships within the MM and haleakalae clades (supplementary fig. S13, Supplementary Material online, polytomies), emphasizing the need for further study in these groups.
We used this expanded genetic data set and topology to estimate the age of the Hawaiian Drosophilidae by calibrating this tree to time using the software package BEAST (Bouckaert et al. 2014). Consistent with recent publications (Obbard et al. 2012; Magnacca and Price 2015; Katoh et al. 2017), our results indicate that the split between Hawaiian Drosophila and Scaptomyza occurred between 20 and 25 million years ago (fig. 3, median root age 22.8 mya). The results shown here were calibrated using updated estimates for the ages at which Hawaiian islands became habitable, based on models of island emergence, growth, and decline via erosion and subsidence (Lim and Marshall 2017) (table 1). Similar results were obtained using the calibration scheme from Russo et al. (2013) that includes a single fossil calibration point for the clade, based on the taxon S. dominicana recovered from Dominican amber (median root age 22.9 mya). However, with both schemes, uncertainty around the root age remains substantial (95% highest posterior density confidence interval 17.4–29 mya), and small changes in the calibration times used can lead to substantial differences in this estimate. When calibrating the tree using the same island age estimates as in Magnacca and Price (2015), which are marginally younger (table 1), we estimated the age of Hawaiian Drosophilidae to be ~15 My old (median root age 15.5 mya). Furthermore, we note that calibrating using primarily vicariance-based estimates of time is considered to be imprecise and should be avoided (Kodandaramaiah 2011). Taken together, we consider this estimate of the age of Hawaiian Drosophila, as well as those previously published, to be tentative, and suggest that further data (e.g., new fossil evidence) will be necessary to determine the age of diversification relative to island emergence with greater certainty.
According to this estimate, we find that the division between major Hawaiian Drosophila clades occurred around 10 mya (fig. 3), prior to the estimated time when the Hawaiian island Kaua’i became habitable (between 6.3 and 6.0 mya; Lim and Marshall 2017). Our results show that the diversification of lineages within MM also occurred around that time, whereas the lineages within the AMC, haleakalae, and grimshawi groups (PNA) all arose within the last 5 million years, around the time Oahu became habitable. We note that the MM groups suffer from lower representation across genes used to calibrate the tree to time (supplementary fig. S12, Supplementary Material online), and suggest that more data may help shed light on differences in the age of this clade relative to others.

Ancestral State Reconstruction of Oviposition and Larval Feeding Ecology
With this time-calibrated tree for 316 species, we have an opportunity to investigate the evolutionary dynamics of trait diversification. By modeling the evolution of the diverse suite of ecological and morphological features across the phylogeny, we can identify which lineages have experienced major shifts in trait evolution. Predicting the number and phylogenetic position of these shifts will in turn be critical for informing future studies on development, life-history, and evolution of these flies. In the following analyses of trait evolution, we used the maximum clade credibility tree from the constrained BEAST analysis described above. Using this tree allows us to maximize the number of taxa for which genetic data are available, painting the most complete picture of ecological and morphological evolution in these flies up to this date. However, due to the fraction of genetic data missing across taxa, it also includes nodes with low bootstrap support (supplementary fig. S13, Supplementary Material online,
FIG. 3. Time-calibrating the phylogeny of 316 Drosophilidae species. This phylogeny was inferred using IQtree to analyze all publicly available genetic data for Hawaiian Drosophila and Scaptomyza. It was then calibrated to time using the software BEAST, with four calibration points at nodes that show a biogeographic progression rule (Magnacca and Price 2015). Similar results were obtained using a calibration scheme (Russo et al. 2013) that takes into account a single fossil taxon for the group (table 1). The 95% highest posterior density intervals for each node are shown as gray bars, indicating the credible interval for the age of that group. The age at which four Hawaiian islands are estimated to have become habitable is shown in green. Colored labels indicate the clade to which taxa belong, and colors correspond to figure 1; taxa without a colored label are species with genetic data that are as of yet undescribed. See supplementary figure S13, Supplementary Material online, for bootstrap support. Calibration using only island biogeography is known to be imprecise (Kodandaramaiah 2011); therefore, the divergence times shown here are considered tentative.
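For readers unfamiliar with the gray bars, a 95% highest posterior density (HPD) interval is the narrowest interval that contains 95% of the posterior samples for a node age. The following Python sketch shows one common way to compute it from a list of sampled ages; it is an illustration only, not the routine used by BEAST, and the function name and toy samples are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples the interval must cover; the small tolerance guards
    # against floating-point error in mass * n.
    window = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of that size over the sorted samples, keep the narrowest.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Toy, right-skewed "node ages": nine tightly grouped values and one outlier.
TOY_AGES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(hpd_interval(TOY_AGES, mass=0.9))
```

Unlike an interval cut symmetrically from both tails, the HPD interval shifts toward the densest part of a skewed posterior, which is why it is the usual summary for node ages.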
MBE
ChurchandExtavour . doi:10.1093/molbev/msac012
Table 1. Calibration Points for Dating with BEAST.
Group                            Species Included                                    Magnacca and   Magnacca and Price    Russo et al.
                                                                                     Price (2015)   (2015), Updated Ages  (2013)
Scaptomyza genus                 All Scaptomyza species                              -              -                     25.7 (3.0)
planitibia group                 D. anomalipes, D. quasianomialipes, D. oahuensis,   3 (0.5)        4.135 (0.5)           -
                                 D. obscuripes, D. hemipeza, D. melanocephala,
                                 D. planitibia, D. heteroneura, D. silvestris,
                                 D. neoperkinsi, D. neopicta, D. ingens, D. differens,
                                 D. substenoptera, D. cyrtoloma, D. hanaulae,
                                 D. nigribasis
lanaiensis subgroup              D. lanaiensis, D. hexachaetae, D. digressa, D. moli 3 (0.5)        4.135 (0.5)           -
picticornis subgroup             D. picticornis, D. setosifrons, D. pilipa           -              -                     4.7 (0.1)
planitibia subgroup              D. differens, D. hemipeza, D. planitibia,           -              -                     2.8 (0.1)
                                 D. silvestris, D. heteroneura
cyrtoloma subgroup               D. neoperkinsi, D. obscuripes, D. melanocephala,    -              -                     2.8 (0.1)
                                 D. cyrtoloma, D. ingens, D. hanaulae, D. oahuensis
sobrina+orthofascia+ciliaticrus  D. sobrina, D. orthofascia, D. ciliaticrus          1.7 (0.3)      2.55 (0.3)            -
silvestris+heteroneura           D. silvestris, D. heteroneura                       0.5 (0.2)      1.2 (0.2)             0.6 (0.04)
NOTE: Values are mean (standard deviation) age in million years for normally distributed time priors. A dash indicates that the scheme has no calibration for that group.
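To make the normally distributed time priors in table 1 more concrete, the following sketch converts a mean and standard deviation into the central 95% interval of the prior. The helper name `prior_interval` is hypothetical, and the calculation describes only the priors, not the posterior node ages estimated by BEAST.

```python
from statistics import NormalDist


def prior_interval(mean, sd, mass=0.95):
    """Central interval of a normal time prior, in the same units as `mean`."""
    tail = (1.0 - mass) / 2.0
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Values copied from table 1 (Magnacca and Price 2015, updated ages).
updated_priors = {
    "planitibia group": (4.135, 0.5),
    "lanaiensis subgroup": (4.135, 0.5),
    "sobrina+orthofascia+ciliaticrus": (2.55, 0.3),
    "silvestris+heteroneura": (1.2, 0.2),
}

for group, (mean, sd) in updated_priors.items():
    lo, hi = prior_interval(mean, sd)
    print(f"{group}: {lo:.2f} to {hi:.2f} million years")
```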
polytomies). Therefore, for internal lineages for which evolutionary relationships remain uncertain, the positions of these evolutionary shifts are subject to change as more genetic data become available and further phylogenetic resolution is achieved.
The Hawaiian Drosophilidae use a wide variety of plant, animal, and fungal species for egg laying and larval feeding (fig. 4) (Heed 1968; Montgomery 1975; Magnacca et al. 2008). The majority of species breed in rotting substrates, with variation in the part of the plant or fungus in question, including rotting bark, leaves, flowers, and fruit. A few species breed on live tissue, and one notable Scaptomyza subgenus, Titanochaeta, has been reared exclusively from spider egg masses (Knab 1914). Magnacca et al. (2008) reviewed host plant and substrate records and found that, whereas many species can be considered specialists to species or substrate, host shifting was common and many species occasionally use nonpreferred substrates. The type of oviposition substrate has been suggested as a driver for diversification of the reproductive traits ovariole number and egg size (Kambysellis and Heed 1971; Kambysellis et al. 1995; Sarikaya et al. 2019). However, the previous reconstruction of oviposition substrate by Kambysellis et al. (1995) was performed with a phylogeny that included only three non-PNA species, and was therefore unable to resolve the ancestral oviposition substrate for Hawaiian Drosophila or to identify when evolutionary shifts in substrate outside of PNA were likely to have occurred.
We combined the phylogenetic results presented here with the data summarized in Magnacca et al. (2008), to reconstruct the ancestral oviposition substrate for the Hawaiian Drosophilidae (fig. 5A and supplementary fig. S14, Supplementary Material online). Using stochastic character mapping (Huelsenbeck et al. 2003), we recover the most probable ancestral oviposition substrate for the Hawaiian Drosophila as bark breeding (defined as including rearing records from bark, stems, branches, roots, and fern rachises, see supplementary table S5, Supplementary Material online). We recover a transition from bark to leaf breeding at the base of the AMC clade that has generally persisted throughout the diversification of that group. As previously reported (Magnacca et al. 2008), we find several groups that demonstrate no reported variation in substrate type (e.g., fungus breeding haleakalae, fig. 5B).
Across 1,000 stochastic character maps, we recovered an average of 44 transitions in oviposition substrate over the evolutionary history of Hawaiian Drosophilidae. The majority of these changes occurred along branches leading to extant tips, with few transitions at internal nodes (on the summary tree, 8 out of 36 total changes). On average, 70% of transitions were between using a single substrate type as a primary host ("specialist" species) and using multiple types ("generalist" species, defined as using any two substrates that each comprise >1/4 of all rearing records, or with no substrate that comprises >2/3 of rearing records; Magnacca et al. 2008). Other transitions were primarily between using rotting bark, leaves, or sap. Pinpointing branches of likely transitions shows that some groups have experienced many more transitions than others, especially MM and non-flower/non-spider egg breeding Scaptomyza. Most generalist species fall in one of these two clades, which also include specialist bark and leaf breeders, among other substrates.
Evolution of Wing, Thorax, and Body Length
Alongside ecological diversification, the Hawaiian Drosophilidae show substantial diversity in adult body size. We used the time-calibrated phylogeny to model the number and timing of major changes in the evolutionary dynamics of size across the phylogeny. First, we digitized 795 records from 26 publications (Grimshaw and Speiser 1901; Grimshaw 1902; Frederick 1914; Bryan 1934, 1938; Wirth 1952; Hackman 1959; Hardy 1965, 1966, 1969, 1977; Hardy and Kaneshiro 1968, 1969, 1971, 1972, 1975, 1979; Kambysellis and Heed 1971; Hardy et al. 2001; O'Grady et al. 2001, 2003; Starmer et al. 2003; Magnacca and O'Grady 2008, 2009; Craddock et al. 2016; Sarikaya et al. 2019), including descriptions of body,
FIG. 4. Ancestral state reconstruction of oviposition substrate indicates dozens of evolutionary transitions. (A) We used stochastic character mapping to reconstruct the ancestral substrate used for oviposition and larval feeding, and identified dozens of likely transitions in substrate (gray circles). Branch color indicates the ancestral substrate type with highest probability, and tip box indicates extant oviposition substrate. (B) Oviposition substrate category was defined based on rearing records, using the data summarized in Magnacca et al. (2008). Generalist species are defined as those with any two substrates that each comprise >1/4 of rearing records, or any species with no substrate that comprises >2/3 of rearing records.
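The generalist rule stated in this caption can be written as a short classification function. The sketch below is one possible reading of that rule, assuming rearing records are available as counts per substrate; the function name `classify_substrate` and the example counts are hypothetical and are not drawn from the data set of Magnacca et al. (2008).

```python
from fractions import Fraction


def classify_substrate(record_counts):
    """Return 'generalist' or the name of the single primary substrate.

    `record_counts` maps substrate name to number of rearing records.
    """
    total = sum(record_counts.values())
    if total == 0:
        raise ValueError("no rearing records")
    # Exact fractions avoid round-off at the 1/4 and 2/3 thresholds.
    shares = {s: Fraction(c, total) for s, c in record_counts.items()}
    over_quarter = [s for s, share in shares.items() if share > Fraction(1, 4)]
    top_substrate = max(shares, key=shares.get)
    # Generalist: two substrates each above 1/4, or no substrate above 2/3.
    if len(over_quarter) >= 2 or shares[top_substrate] <= Fraction(2, 3):
        return "generalist"
    return top_substrate


# Hypothetical rearing records for three species.
print(classify_substrate({"bark": 9, "leaf": 1}))
print(classify_substrate({"bark": 7, "leaf": 3}))
print(classify_substrate({"bark": 6, "leaf": 2, "fruit": 2}))
```

Under this reading, a species with 70% bark and 30% leaf records counts as a generalist because both substrates exceed one quarter of records, even though bark alone exceeds two thirds.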
wing, and thorax length across 552 species. Then we mapped these traits onto our phylogenetic results, and used the R package bayou (Uyeda and Harmon 2014) to identify branches that represent probable shifts in trait diversification. This package uses Ornstein–Uhlenbeck (OU) models to describe shifts in evolutionary regimes, defined as lineages that share an OU optimum trait value.
In the case of wing length, we find evidence for several highly supported regime shifts in the evolutionary history of Hawaiian Drosophilidae (fig. 5). Some of these are independent shifts on branches subtending groups with larger wings than their nearby relatives, including flies in the antopocerus group (AMC) and in the Engiscaptomyza+Grimshawomyia subgenera (Scaptomyza). Others are independent shifts on
FIG. 5. Multiple shifts in evolutionary regimes help explain the diversity of wing length. (A) Using the R package bayou (Uyeda and Harmon 2014), we modeled the evolution of wing length (mm) on the phylogeny and detected several probable shifts in evolutionary regimes (gray circles, larger indicates greater posterior probability that a shift occurred on that branch). Locations of probable shifts include at the base of the MM+AMC+haleakalae clade, subtending the antopocerus group (AMC), and subtending the nudidrosophila (PNA), among others. (B) The distribution of wing lengths across the phylogeny of Hawaiian Drosophila and Scaptomyza.
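To give an intuition for what a shift in OU regime means, the sketch below simulates a trait along a single lineage whose optimum changes partway through. It is a forward simulation only, under assumed parameter values, and does not reproduce the Bayesian inference of shift locations performed by bayou; the function name `simulate_ou` and all numbers (optima of 2 and 4 mm, `alpha`, `sigma`) are hypothetical.

```python
import math
import random


def simulate_ou(x0, regimes, alpha, sigma, dt=0.01, seed=0):
    """Simulate a trait along one lineage under an OU process.

    `regimes` is a list of (duration, optimum) pairs; each change of optimum
    plays the role of a regime shift. `alpha` (> 0) is the strength of the
    pull toward the optimum and `sigma` is the noise intensity.
    """
    rng = random.Random(seed)
    decay = math.exp(-alpha * dt)
    # Standard deviation of the noise accumulated over one step of length dt.
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    x = x0
    path = [x]
    for duration, optimum in regimes:
        for _ in range(round(duration / dt)):
            # Exact OU update: decay toward the optimum, then add noise.
            x = optimum + (x - optimum) * decay + step_sd * rng.gauss(0.0, 1.0)
            path.append(x)
    return path


# Hypothetical lineage: optimum of 2 mm for 5 time units, then 4 mm.
path = simulate_ou(2.0, [(5.0, 2.0), (5.0, 4.0)], alpha=2.0, sigma=0.2)
print(round(path[500], 2), round(path[-1], 2))
```

After the shift, the simulated trait is pulled toward the new optimum at a rate set by `alpha`, so descendants of the shifted branch tend to share trait values near that optimum.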
branches subtending lineages with smaller wings than nearby relatives such as the nanella+ischnotrix (MM) and the nudidrosophila subgroups (PNA). This suggests that the evolutionary history of Hawaiian Drosophila has included multiple convergent transitions to both larger and smaller wings. In the case of nudidrosophila (PNA), we note that the topology recovered in the summary tree dividing this subgroup into two lineages has very low bootstrap support (supplementary fig. S13, Supplementary Material online, polytomies), and we suggest that the two shifts to smaller wings recovered within PNA may represent a single shift if this group is indeed monophyletic.
We found similar results when considering thorax length (supplementary fig. S15, Supplementary Material online) and body length (supplementary fig. S16, Supplementary Material online). In the case of the former, we find shifts at the base of antopocerus, and nudidrosophila, consistent with the shifts recovered for wing size. In the case of body size, the most probable shifts are located at the base of the Hawaiian Drosophila and the Engiscaptomyza+Grimshawomyia subgenera. However, for body length, no regime shifts received substantially more support than others, despite running bayou for an extra million generations and achieving a final effective size for log-likelihood of 401.9.
Evidence for Convergent Evolution of Ovariole Number and Egg Size
We also performed these analyses on reproductive traits, including egg size, egg shape (aspect ratio, calculated as egg length/width), and the number of egg-producing compartments in the ovary (ovarioles). These traits have been the subject of several life-history studies regarding the