Table Of ContentEnvironmentalMicrobiology(2007) doi:10.1111/j.1462-2920.2006.01203.x
Proteorhodopsin photosystem gene clusters exhibit
co-evolutionary trends and shared ancestry among
diverse marine microbial phyla
JayMcCarrenandEdwardF.DeLong* Differentvarietiesofthesephotoactiveintegralmembrane
DepartmentofCivilandEnvironmentalEngineeringand proteins are capable of performing two known functions;
DivisionofBiologicalEngineering,Massachusetts ion-translocating rhodopsins can act as light-driven ion
InstituteofTechnology,Cambridge,MA02139,USA. pumps, and sensory rhodopsins, in the course of light-
activatedconformationalchanges,interactwithtransduc-
erstodirectphototaxisandmotility(Spudichetal.,2000).
Summary
Rhodopsins can be broadly categorized as either type I
Sincetherecentdiscoveryofretinylideneproteinsin rhodopsins,whichareallofmicrobialoriginandfoundin
marine bacteria (proteorhodopsins), the estimated all three domains of life, or type II rhodopsins, which are
abundance and diversity of this gene family has all animal photoreceptors (Spudich etal., 2000). Type I
expandedrapidly.Toexploreproteorhodopsinphoto- rhodopsinsincludeseveralclassesoriginallyidentifiedin
system evolutionary and distributional trends, we extremely halophilic archaea: the two ion-pumping
identifiedandcompared16differentproteorhodopsin- rhodopsins, halorhodopsin (HR) and bacteriorhodopsin
containing genome fragments recovered from natu- (BR),whichpumpchlorideionsandprotons,respectively,
rally occurring bacterioplankton populations. In and two sensory rhodopsins (SRI and SRII). Proteor-
addition to finding several deep-branching proteor- hodopsins (PR) are another diverse clade of type I
hodopsinsequences,proteorhodopsinswerefoundin rhodopsins first identified in an uncultivated gammapro-
noveltaxonomiccontexts,includingabetaproteobac- teobacterium (Beja etal., 2000a), but now known to be
teriumandaplanctomycete.Approximatelyone-third prevalentinotherbacteria(delaTorreetal.,2003;Sabehi
oftheproteorhodopsin-containinggenomefragments etal.,2005).RecentadditionstothelargetypeIrhodop-
analysed, as well as a number of recently reported sinfamilyincludeamonophyleticcladeoffungalrhodop-
marinebacterialwholegenomesequences,contained sins (Ruiz-Gonzalez and Marin, 2004), another well-
alinkedsetofgenesrequiredforbiosynthesisofthe supported clade of sequences of unknown function from
rhodopsinchromophore,retinal.Phylogeneticanaly- diverse microbes, and several sequences representing
sesoftheretinalbiosyntheticgenessuggestedtheir independent branches in the type I rhodopsin phyloge-
co-evolution and probable coordinated lateral gene netictrees(Fig.1).
transferintodisparatelineages,includingEuryarcha- Originally, microbial rhodopsins were typically associ-
eota,Planctomycetales,andthreedifferentproteobac- ated with Archaea, until a PR was identified in an uncul-
terial lineages. The lateral transfer and retention of tured gammaproteobacterium of the SAR86 clade (Beja
genes required to assemble a functional proteor- etal.,2000a).Genomefragmentsfromothermembersof
hodopsin photosystem appears to be a coordinated the SAR86 group from diverse geographic locales, and
and relatively frequent evolutionary event. Strong other taxa including Alphaproteobacteria (de la Torre
selectionpressureapparentlyactstopreservethese etal.,2003),werealsoshowntoencodethephotoprotein
light-dependent photosystems in diverse marine (Sabehietal.,2003).Now,PR-likesequencesnumberin
microbiallineages. thehundreds(Venteretal.,2004),andPRsareestimated
to be present in 13% of all marine bacteria in the photic
zone (Sabehi etal., 2005). Surprisingly, recent metage-
Introduction nomic analyses revealed the presence of marine
bacterial-like PRs within the genomes of marine plank-
Retinylideneproteins(commonlycalledrhodopsins)com-
tonicEuryarchaea(Frigaardetal.,2006).Conversely,the
prise a large and functionally diverse family of proteins.
complete genome sequence of the extremely halophilic
bacteriumSalinibacterrubercontainedseveralarchaeal-
Received 21 September, 2006; accepted 2 November, 2006. *For
likerhodopsins(HR,SRIandSRII),similartothosefrom
correspondence. E-mail [email protected]; Tel. (+1) 6172530252;
Fax(+1)6172532679. itsarchaealneighboursthatdwellinthesamehypersaline
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd
2 J.McCarrenandE.F.DeLong
Type I Rhodopsins
Bacteroides
Gloeobacter
Pyrocystis Tenaci-
Salinibacter Psychro- baculum
flexus Polaribacter
rhSoednosposriyn s SalinibacteSralinibacter EBO_41B09_PR2 CellulopEhBa0Hg_OAT3aN7T953m2mHe4Cd11A221m7eRd8mAe1d9HRAF81E71RBH098O_Em0BT1A_0e6mC0dEH3A21B101DA53mO00r8e88_1dA5105Ar9b190 Marine
Haloarcula 100/100 medAE15BrM90Eb_D73459C0D607 rhodopsins
HOT4E07
MED49C08
HF10_05C07
Halobacterium 100/100 HF1H30F_1801_H490E708
HOT2C01 HF10_12C08
Photobacterium
100/92 Vibrio
100/99 100/73 72/61 EBO_41B09_PR1
Halobacterium RED22E04
MED13K09
medA17r811
100/97 medA15r8b8
RED17H08
Haloarcula 93/--
HF10_25F10*
100/100 HF10_19P19*
EB0_55B11
100/84 EB0_39F01
Foupnsignasl NeLuerpotsopsoprhHaaaeloribaacterium 100/100 HF10_3D09HF70_39H11HF70P_1eE9lBBa18g20ib_a6c9tHGe0rF71M0E_DM415E8DBG04022H1AE1T1BC 0C_232M50ED7D0823F1M0 ED46A06
Archaea
Exiguo-
Haloarcula bacterium
Halorhodopsins Salinibacter HaloarculaHaloarcula baHcatleor-ium
0.05 changes Bacteriorhodopsins Anabaena
Fig.1. Neighbour-joiningtree(ofinferredaminoacidsequences)containingrepresentativesfromthemajorgroupsoftypeIrhodopsins.New
sequencesreportedinthisstudyareindicatedinred[theasterisk(*)indicatessequencesreportedbyA.MartinezandE.F.DeLong,in
preparation]andsequencesfromrecentlyreportedmarinebacterioplanktongenomesequencing(http://www.moore.org/microgenome/)are
indicatedingreen.Bootstrapsupportvaluesfrom1000pseudoreplicatesgivenformajornodes(neighbour-joining/maximumparsimony;--,
unresolvednode).
habitat (Mongodin etal., 2005). The distribution of these sequencing of large insert bacterial fosmids that contain
ubiquitous photoproteins has been elucidated through proteorhodopsinrevealedthepresenceofretinalbiosyn-
analysis of both cultivated isolates (Kaneko etal., 2001; thesis genes alongside the PR gene in several clones
Sineshchekov etal., 2002; Galagan etal., 2003; Jung (Sabehietal.,2005).Retinalisanapocarotenoidformed
etal., 2003; Nakamura etal., 2003; Okamoto and Hast- by the oxidative cleavage of b-carotene (Giuliano etal.,
ings, 2003; Giovannoni etal., 2005a) (http://www.moore. 2003). With the exception of most oxyphotoautotrophic
org/microgenome/) and through culture-independent bacteriaandplants,b-carotenesynthesisfromisoprenoid
approaches(Bejaetal.,2000a;2001;Sabehietal.,2003; precursors is accomplished with the four crt genes crtE,
2004;2005;delaTorreetal.,2003;Frigaardetal.,2006). crtI,crtBandcrtY(Armstrong,1997).Oxidativecleavage
Thesereportsprovideexamplesofthebroaddistribution of b-carotene yields the rhodopsin chromophore retinal
of different type I rhodopsins found among all three (Giuliano etal., 2003).The gene products ofblh and brp
domainsoflife. have been shown to have this activity in Halobacterium
Two components are required for a functional, photo- salinarum (Peck etal. 2001), while expression of a
activerhodopsin:theopsinapoproteinitselfanditschro- homologous gene from PR-containing clones had the
mophore retinal. Perhaps not surprisingly then, recent same activity in Escherichia coli (Sabehi etal., 2005).
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
Proteorhodopsinphotosystemgeneclusters 3
Table1. Summaryofproteorhodopsin-containingenvironmentalclonesanalysedinthisstudy.
Clonea Putativeorigin Evidenceb Retinalbiosynthesis
EB0_35D03 Proteobacteriumc CBHs(53.8%)
EB0_39F01 Alphaproteobacterium CBHs(88.6%) crtEIBY,blh
EB0_39H12 Proteobacteriumc CBHs(80.7%)
EB0_41B09 Betaproteobacterium Phylogenyofthreeribosomalproteins crtEIBY,blh
CBHs(60.%)
EB0_49D07 Proteobacteriumc CBHs(60.9%)
EB0_50A10 Gammaproteobacterium CBHs(88.1%)
HF10_05C07 Proteobacterium CBHs(73.5%)
HF10_12C08 Alphaproteobacterium Phylogenyoffourribosomalproteins
CBHs(91.9%)
HF10_19P19d Proteobacteriume CBHs(64.7%) crtEIBY,blh,idi
HF10_25F10d Proteobacteriume CBHs(92.7%) crtEIBY,blh,idi
HF10_29C11f Euryarchaea RNApolymeraseA″phylogeny crtI,crtBY,blh
CBHs(38.5%)
HF10_45G01 Proteobacteriume CBHs(47.4%)
HF10_49E08 Planctomycete XylAphylogeny crtEIBY,blh
CBHs(65.8%)
EB80_02D08 SAR-86 16SrRNA
EB80_69G07 Alphaproteobacterium CBHs(89.7%)
HF130_81H07 Gammaproteobacterium CBHs(50.0%)
a. Characterspreceding‘_’indicatelocation(HF,Hawaiianfosmid;EB,MontereyBayBAC)anddepth(0m,10m,80mand130m)ofsample
collection.
b. PercentageofORFsmatchinglistedoriginbasedonCBHsoverthresholde-valueofe-10.WhenCBHwasanuncultivatedenvironmentalclone,
thenextbesthitwithanidentifiedtaxonomicaffiliationwasused.
c. CBHsspreadamongproteobacterialtaxaprecludingassignmenttoaspecificclass.
d. SequencesreportedbyA.MartinezandE.F.DeLong(inpreparation)
e. CBHsmatchuncultivatedPR-containingproteobacterialclones.
f. ProteorhodopsinwasnotlinkedtoretinalbiosynthesisgenesinEuryarchaea.
CBH,closestBLASTPhomologue;ORFs,openreadingframes.
Theseretinalbiosyntheticgeneshavebeenfoundimme- Results
diatelydownstreamofPRinanumberoflargeinsertBAC
Proteorhodopsinsequencediversity
and fosmid clones. In a few instances idi genes that
catalyse isomerization of the isoprenoid precursors ThirteenlargeinsertDNAcloneswereidentifiedbycolony
(Kaneda etal., 2001; Kuzuyama and Seto, 2003) and hybridizationofmacroarraysusingamplifiedPRgenesas
enhance the yield of b-carotene (Kajiwara etal., 1997; probes, and subsequently sequenced. Additionally, two
Sedkova etal., 2005) have also been found adjacent to fullysequencedfosmidclonesthatcontainproteorhodop-
thesegeneclusters. sin photosystem genes were identified by expression
Through the analysis of large fragments of chromo- analysis(A.MartinezandE.F.DeLong,unpublished)and
somal DNAthe first PR sequence was discovered (Beja are also included in these analyses. All PR-containing
etal., 2000a) and continued prospecting of large insert clonesoriginatedfromeitherMontereyBay(EBprefix)or
environmental DNA libraries has extended the known North Pacific Subtropical Gyre (HF prefix) euphotic zone
distribution of PR-containing organisms (Sabehi etal., waters, between 0 and 130m depth (Table1). Phyloge-
2005; Frigaard etal., 2006). The recent full genome netic analysis of the PR sequences indicated that all fell
sequencing of a large number of marine bacterial iso- intocladesofpreviouslyreportedpolymerasechainreac-
lates has also added to the variety of microorganisms tion(PCR)-amplifiedorshotgunclonePRs(Venteretal.,
knowntopossessPRsequences(http://www.moore.org/ 2004; Sabehi etal., 2005) with one notable exception
microgenome/).Inlightoftheapparentlycommonlateral (Fig.1). A few of the large genome fragments reported
transfer of PR genes, determining the origin of a prote- here were nearly identical in gene content, order and
orhodopsin based exclusively on its sequence without amino acid sequence to previously reported
additionalgeneticcontextualdataisproblematic.Weare PR-containing clones. Many of the BAC or fosmid-
then left with sequencing whole genomes, or large chro- encodedPRs,however,representeddivergentsubclades
mosomal fragments, to ascertain their origins. We withinthemarinemicrobialPRtree.
present here an extended survey and analysis of 16 Surprisingly,BACcloneEB0_41B09containedtwotan-
large insert clones, recovered from naturally occurring demly arranged PR-like open reading frames (ORFs)
bacterioplankton populations, that possess PR-like (Fig.2).WhileEB0_41B09_PR1groupedunambiguously
sequences. within the large marine bacterial PR clade,
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
4 J.McCarrenandE.F.DeLong
Planctomycete fosmid (HF10_49E08)
Betaproteobacterial fosmid (EB0_41B09)
Alphaproteobacterial fosmid (EB0_39F01)
Archaeal fosmid (HF10_29C11)
proteorhodopsin alphaproteobacteria planktomycete
retinal biosynthesis betaproteobacteria bacteroides
archaea deltaproteobacteria chloroflexi
eukarya gammaproteobacteria cyanobacteria
e-value < e-5 = 2kb
Fig.2. Geneorganizationoffourenvironmentalfosmidclonescontainingretinalbiosynthesisgenesand,inthreeinstances,proteorhodopsin
genes.Homologousregionsareindicatedbyshadedregions.Genesflankinghomologousregionsarecolourcodedaccordingtophylogenetic
affiliationofbestBLASThit.
EB0_41B09_PR2 is quite divergent (Fig.1). whileseveralcanbeassignedmorespecificallytoeither
EB0_41B09_PR2 belonged to a well-supported clade thegamma-oralphaproteobacterialclasses.
containingotherPR-likesequencesfromthegenomesof OneBACclone,EB80_02D08,containedaSSUrRNA
S.ruber (Mongodin etal., 2005) and Gloeobacter viola- gene that was nearly identical to the original
ceus(Nakamuraetal.,2003)andageneidentifiedfroma proteorhodopsin-containing clone, EBAC_31A08, having
Pyrocystis lunula cDNA microarray analysis (Okamoto only a single-base mismatch placing this clone within of
and Hastings, 2003). The PR-like protein from S.ruber, the SAR82-II cluster of Gammaproteobacteria (Suzuki
named xanthorhodopsin, has been shown to act as a etal.2001).IncloneswithoutrRNAgenes,itwassome-
light-activatedprotonpump(Balashovetal.,2005).Most times possible to use ribosomal protein sequences to
unusually, EB0_41B09_PR2 did not contain the nearly assigntaxonomicorigins(delaTorreetal.,2003;Teeling
universally conserved (Spudich etal., 2000) lysine and Gloeckner, 2006). Clones HF10_12C08 and
residue,withwhichretinalformsacovalentSchiffbaseto EB0_41B09 each contained multiple ribosomal protein
form the functional photoprotein. Consequently, whether genes, and concatenated ribosomal protein sequence
EB0_41B09_PR2 can bind retinal remains to be analyses yielded well-supported trees, allowing prelimi-
determined. naryassignmentofeachofthesePRstoaspecifictaxon
(FigsS1 and S2 in Supplementary material).Aphyloge-
netictreebasedonfourribosomalproteins(L1,L10,L11
Phylogeneticdistributionofproteorhodopsingeneson
and L7/L12) present on clone HF10_12C08 confirmed
largegenomefragments
thatthisclonewasderivedfromanalphaproteobacterium.
As an initial step, top scoring BLASTP returns for each The closest homologues of these four proteins are from
predicted ORF in a given clone were tabulated. Con- another PR-containing alphaproteobacterial clone,
served, phylogenetically informative genes in clones HOT2C01 (de la Torre etal. 2003). HF10_12C08 and
(wheretheywereavailable)werethenalignedtoasetof HOT2C01 share an extensive region of homology
homologues and phylogenetic analyses performed (see (approximately88%DNAidentityover28kb),whichalso
below). In each case where phylogenetic marker genes contained other phylogenetically informative genes
were present and analysed, there was good agreement including translation elongation factor Tu and RNApoly-
withtheputativetaxonomicassignmentsbasedoncumu- merase beta subunit (rpoB) in addition to the ribosomal
lativeBLASTPreturns(Nesboetal.,2005).Basedonthese proteins listed. Clone EB0_41B09 also contained genes
data, 11 clones are of putatively proteobacterial origin, foranumberofribosomalproteins(S4,S6,S18andL9).
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
Proteorhodopsinphotosystemgeneclusters 5
Concatenated ribosomal protein alignments of the three Gammaproteobacteria (Photobacterium sp. strain
ribosomal proteins S6, S18 and L9 produced a well- SKA34, Vibrio angustum strain S14, and marine gam-
supported phylogenetic tree that indicates a betaproteo- maproteobacterial strain HTCC2207), as well as several
bacterial origin for clone EB0_41B09, a novel taxonomic Bacteroidetes isolates, can now be added to the list of
origin for a proteorhodopsin. The most similar homo- cultured PR-containing marine microbes, based on data
logues for nearly two-thirds of the ORFs encoded on from the recent Moore Foundation Marine Microbial
EB0_41B09 are of betaproteobacterial origin (Fig.2). Genome Sequencing Project (http://www.moore.org/
Virtually all ORFs that were not most similar to beta- microgenome/). The proteorhodopsins from the
proteobacterial homologues clustered together on the flavobacteria-like Bacteroidetes, including Cellulophaga,
chromosome immediately flanking the PR gene. This Polaribacter, Psychroflexus and Tenacibaculum species,
clusterofgenesisboundedononeendbyatRNAgene, clustertogetherinaseparatedeep-branchingcladecom-
whicharefrequentinsertionsitesforlateralgenetransfer paredwithpreviouslyreportedproteorhodopsins(Fig.1).
(LGT),andmayrepresentaninsertionintoabetaproteo- Unlike most proteorhodopsins identified to date, the
bacterialbackground. BacteroidetesPRsencodeamethionineresidueatposi-
Seventeen of the 26 predicted ORFs from clone tion105,asiteshowntobelargelyresponsiblefortuning
HF10_49E08 shared high similarity to homologues from thewavelengthoflightabsorbedbyPR(Manetal.,2003),
planctomycete-derivedproteinsequences(Fig.2).More- althoughotherresidueswithinthePRmoleculeareputa-
over,sixORFsencodedputativesulfatases,knowntobe tively involved in spectral fine tuning (Bielawski etal.,
highly over-represented in the genome of the plancto- 2004).
mycete Pirellula sp. strain 1 (Glockner etal., 2003).
HF10_49E08alsopossessedaxyuloseisomerase(xylA)
Proteorhodopsin-linkedphotopigmentbiosynthetic
gene, which appeared to be well conserved across a
genes
wide range of bacteria. Phylogenetic analyses of XylA
protein sequences largely recovered the monophyly of Highly similar proteorhodopsins in different large insert
included taxa, and unambiguously grouped the clones often were flanked by similar sets of genes. The
HF10_49E08 XylA with the other two planctomycetes gene content and order around closely related proteor-
(Fig.S3 in Supplementary material). Taken together, hodopsin sequences in large insert clones frequently
these data indicate that members of the Planctomyc- shared extensive regions of DNA homology, suggesting
etales,likeotherplanktonicmarinebacteria,containpro- that these clones originated from closely related organ-
teorhodopsin photosystems. isms.ComparisonsofPRflankingregionsfrommoredis-
Wesearchedfortheretinalbiosyntheticgenesinplank- tantly related proteorhodopsins generally (but not
tonic marine Euryarchaea, as they did not appear to be always, see below) revealed few commonly linked
linkedtoarchaealPRsinapreviousstudy(Frigaardetal., genes. Aside from the PR itself, no genes were found
2006).CloneHF10_29C11(whichdoesnotcontainaPR common to all or even most of these clones. However,
gene,butencodesseveralretinalbiosynthesisgenes,see one trend was observed in searching for additional com-
below)containedacompleteRNApolymeraseA″subunit monalities between these clones. Nearly one-third of the
gene (rpoA2), which are of some utility in archaeal phy- large insert PR-containing DNA clones from marine
logeneticanalysis(Brochieretal.,2004).TherpoA2from surface water microbes (Table1) encoded all the genes
HF10_29C11 is clearly of euryarchaeal origin based on required for retinal biosynthesis from isoprenoid precur-
our phylogenetic analysis, although its exact position sors.Thefivegenesinvolvedinretinalbiosynthesiswere
withinthegroupispoorlyresolved(Fig.S4inSupplemen- encoded on the same DNAstrand as the PR gene, and
tary material). HF10_29C11 also contained a partial were located immediately downstream, similar to that
rpoA1geneandtwogenesencodingribosomalproteins, shown for a few other PR-containing environmental
all of which support a euryarchaeal origin of this clone. clones (Sabehi etal., 2005). Furthermore, the clones
Further supporting its archaeal origins, HF10_29C11 containing linked retinal biosynthesis genes spanned a
encodesaconservedhypotheticalORFthatispresentin broad phylogenetic range, including Alpha-, Beta- and
nearly two-thirds of all Archaea sequenced to date and Gammaproteobacteria, and Planctomycetales. Marine
hasyettobefoundoutsideofthearchaealdomain. planktonic euryarchaeal-associated retinal biosynthetic
Genome sequencing of marine bacterioplankton iso- genes were also identified, but these were not closely
lateshasrevealedthepresenceofPR-likesequencesin linked to PR.
anumberofisolatescurrentlyinculture.TwoPelagibacter The carotenoid biosynthesis genes crtE, crtB, crtI and
ubiquestrainsoftheabundantSAR11cladeofAlphapro- crtY encode the enzymes necessary for the synthesis of
teobacteria possess a PR (Giovannoni etal., 2005b) b-carotene (Sandmann, 1994), and blh cleaves this
(http://www.moore.org/microgenome/). Three cultivated carotenoid to yield retinal (Sabehi etal., 2005). Gener-
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
6 J.McCarrenandE.F.DeLong
ally, these genes are arrayed downstream of the PR tolyases was present. At the opposite end of the PR
gene in the putative operonal arrangement crtEIBY, blh photosystemgenecluster,andencodedontheopposing
(Fig.2, Table1). Clones HF10_19P19 and HF10_25F10 strand,isagenebearinghighsimilaritytobacterialcryp-
possess an additional downstream ORF similar to IPP tochromes (Hitomi etal., 2000). DNA photolyases and
isomerase (idi) to give the arrangement crtEIBY, blh, idi. cryptochromes share significant sequence homology
IPP isomerases catalyse the interconversion of the iso- (Cashmore etal., 1999), which suggests potentially
prenoid precursors required for b-carotene, and have similar mechanisms of light energy activation (Essen,
been shown to enhance carotenoid production (Kajiwara 2006). The exact function of bacterial cryptochromes,
etal., 1997; Sedkova etal., 2005). Notably crtI and crtB however,remainstobedetermined.
incloneHF10_49E08appeartohaveundergoneagene
fusion and are present as a single ORF. Finally the
Discussion
archaeal clone HF10_29C11 lacks a crtE gene, and the
crtI gene is encoded on the opposite strand compared Proteorhodopsins, a class of retinylidene proteins only
with its other retinal biosynthesis gene homologues recently discovered, are rapidly becoming recognized as
(Fig.2). widespread,diverseandabundantphotoproteinsfoundin
Phylogenetic trees were constructed for each gene diverse ecological and genomic contexts. A variety of
involvedinretinalbiosynthesis(Figs3and4,andFig.S5 approacheshavehelpedrevealtheirdistributionandvari-
in Supplementary material). For each tree, several ability including environmental PCR surveys (Man etal.,
strongly supported clades were present, which generally 2003;Sabehietal.,2003),directedsequencingofBACof
corresponded to the clone’s presumptive overall taxo- fosmid clones (Beja etal., 2000a; Sabehi etal., 2003;
nomicorigin.Therewereinterestingexceptions,however, 2004;2005;delaTorreetal.,2003;Frigaardetal.,2006;
where chromophore biosynthetic gene relationships dis- A. Martinez and E.F. DeLong, unpublished), whole
agreedwithtaxonomicorigins.Theclearestillustrationof genomesequencing(Giovannonietal.,2005b;Mongodin
this could be seen in carotenoid biosynthetic genes etal., 2005; http://www.moore.org/microgenome/), and
derived from the planctomycete and euryarchaeal large random shotgun sequencing of environmental DNA
insert DNAclones.The phylogenetic trees for CrtE, CrtB (Venteretal.,2004;DeLongetal.,2006).Thisstudypro-
and CrtI all show a strongly supported planctomycete videscomparativeperspectiveonthephylogeneticdistri-
clade (Blastopirellula and Rhodopirellula homologues), butionofPRgenes,theirgeneticlinkageandevolutionary
while the planctomycete clone HF10_49E08 carotenoid relatedness, and the likely co-transfer, retention and
biosynthetic genes group with the clade of marine co-evolution of the proteorhodopsin photosystem gene
PR-containing microbes, representing mostly Proteo- cluster.
bacteria.Likewise,CrtBandCrtYtreesstronglysupported Forseveralyearsproteorhodopsinswereonlyreported
specific archaeal clades distinct from bacterial inuncultivatedmembersoftheGamma-andAlphaproteo-
homologues. The exception was the archaeal clone bacteria(Manetal.,2003;Sabehietal.,2003;2004;dela
HF10_29C11, whose CrtB and CrtY were affiliated with Torreetal.,2003).Proteorhodopsinsarenowknowntobe
the marine PR-containing clade. This is similar to the much more widely distributed. For example, in the North
relationshipobservedfortheeuryarchaeal-associatedPR Pacific Subtropcial Gyre, proteorhodopsins were appar-
that is phylogenetically closely related to the PRs of entlypresentinmostoftheeuryarchaealpopulationinthe
marine planktonic bacteria (Frigaard etal., 2006). euphoticzone(Frigaardetal.,2006).Additionally,proteor-
In contrast, CrtE, CrtB and CrtI from PR-containing hodopsins are present within the genome of both
Bacteroidetesgenomesformedanindependentclade.In sequenced strains of the abundant SAR11 clade of
onlyonecase(CrtY)didBacteroidetessequencescluster Alphaproteobacteria (Giovannoni etal., 2005a,b) (http://
with the PR-linked proteobacterial clade (Fig.4B). Phy- www.moore.org/microgenome/).Whilethemajorityofthe
logeniesforCrtB,CrtYandBlhalsorevealthatthehalo- clones we report here do appear to originate from either
philic Bacteroidetes bacterium S.ruber possesses Gamma- or Alphaproteobacteria, two clones are derived
severalretinalbiosynthesisenzymesmoresimilartohalo- frommarinebacterioplanktontaxanotpreviouslyreported
philicarchaeathantootherBacteroidetes. to harbour proteorhodopsins. These include putative
In addition to two tandem proteorhodopsin genes and membersoftheBetaproteobacteriaandthePlanctomyc-
the adjacent retinal biosynthetic pathway genes, the etales,bothgroupsknowntobewellrepresentedinmarine
betaproteobacterial clone EBO_41B09 also contained bacterioplankton, especially in the coastal regions from
additional ORFs with homology to other light-activated which these genome fragments originate (Suzuki etal.,
genes.Approximately300basesupstreamfromthestart 2004). Shotgun sequencing had earlier hinted (Venter
site of EB0_41B09_PR1, and encoded on the opposite etal.,2004),andfullgenomesequencingofisolatesnow
DNAstrand an ORF bearing high similarity to DNApho- hasconfirmed(http://www.moore.org/microgenome/),that
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
Proteorhodopsinphotosystemgeneclusters 7
A)
Misc.
HF10_49E08
99.5/78.0 Flavobacterium
Paracoccus Pantoea marine gamma Marine PR
Bradyrhizobium HTCC2207
Cyano- HF10_25F10* containing
bacteria MED46A06 99.9/85.4
Synechococcus RED17H08
100/99.8
Crocosphaera EB0_39F01
Synechocystis
Plancto- Rhodo- MED66A03
pirellula
mycete
Blasto-
100/52.8 pirellula HF10_19P19*
Polaribacter
EBO_41B09
Tenacibaculum
Leeuwenhoekiella Vibrio
Photobacterium
Bacteroides
Cellulophaga
MED82F10
100/100
Psychroflexus Pelagibacter
Haloarcula Pyrococcus MED13K09 Marine PR
Halobacterium
Methanococcus containing
Sulfolobus
99.8/81.3
Archaea
83.8/51.3
B) Misc.
Cyano- 50.1/--
Rhodo- Pantoea
bacteria Yellowstone bacter Paracoccus
cyano Salinibacter
100/96.9 Oryza sativa
MED82F10
Zea mays EBO_41B09
Synechococcus
MED13K09 Marine PR
Plancto-
mycete Blasto- Pelagibacter containing
pirellula 80.3/82.3
100/92.0
marine gamma
Rhodo- HTCC2207
pirellula EB0_39F01
Psychroflexus
RED17H08
Leeuwenhoekiella
MED46A06
HF10_25F10*
Cellulophaga
Bacteroides
HF10_29C11
100/100 Tenacibaculum
MED66A03
Polaribacter
HF10_19P19*
Halobacterium Photobacterium
Vibrio
Haloarcula HF10_49E08
Natrono- Sulfolobus
Archaea monas
Picrophilus
100/99.9 Methano-
Archaea
thermobacter
0.1 changes
100/99.8
Fig.3. Phylogenetictreesofcarotenoidbiosynthesisenzymes(A)CrtEand(B)CrtBbasedoncomparisonsof336and317parsimony
informativecharactersrespectively.Branchesofmajorcladesaredifferentiatedbycolourwithbootstrapvaluesof1000pseudoreplicatesfor
thesecladesgiven(neighbour-joining/maximumparsimony;--,unresolvednode).Newlydepositedsequencesmarkedinbold[theasterisk(*)
indicatessequencesreportedbyA.MartinezandE.F.DeLong,inpreparation].Colouredboxesaroundindividualsequencenameshighlight
thosethatgroupoutsideoftheirtaxonomicclade.
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
8 J.McCarrenandE.F.DeLong
Cyanobacteria
A) Misc.
100/100
100/99.5
Anabaena Synechococcus
Rhodo- Nostoc Prochlorococcus Rhodobacter Misc.
pirellula 97.2/60.1
Rhodospirillum
Paracoccus
Plankto- Salinibacter
Blasto-
Pantoea
mycete pirellula
Bradyrhizobium
99.9/83.8
HF10_29C11
Rhodo-
HF10_49E08
pirellula
marine gamma HTCC2207
Blasto- MED66A03
pirellula
RED17H08
Psychroflexus
MED46A06
Tenacibaculum HF_25F10*
EB0_39F01
Bacteroides Polaribacter HF10_19P19*
MED82F10
100/100 Leeuwenhoekiella EBO_41B09
Cellulophaga Photobacterium
Vibrio
Marine PR
Pelagibacter
Natrono-
monas RED22EM0E4D13K09 containing
Haloarcula 98.6/97.7
Archaea Salinibacter Picrophilus
Sulfolobus
100/100 Archaea
100/99.8
B) Misc.
100/99.5 Erythrobacter
Paracoccus
Vibrio
Cyano- Pantoea Photobacterium Marine PR
Bradyrhizobium MED82F10
bacteria containing
Prochlorococcus ε MED66A03
100/99.9 100/99.0
Prochlorococcus β EB0_39F01
HF10_29C11
Synechococcus β
H10_25F10*
MED46A06
Arabidopsis
RED17H08
HF10_19P19*
HF10_49E08
marine gamma HTCC2207
Haloarcula EBO_41B09
Polaribacter
Salinibacter
Leeuwenhoekiella
Archaea
Cellulophaga
100/99.7 Natrono-
monas Tenacibaculum
Picrophilus Pelagibacter Psychroflexus
MED13K09
0.1 changes Sulfolobus Sulfolobus
Fig.4. Phylogenetictreesofcarotenoidbiosynthesisenzymes(A)CrtIand(B)CrtYbasedoncomparisonsof519and409parsimony
informativecharactersrespectively.Branchesofmajorcladesaredifferentiatedbycolourwithbootstrapvaluesof1000pseudoreplicatesfor
thesecladesgiven(neighbour-joining/maximumparsimony).Newlydepositedsequencesmarkedinbold[theasterisk(*)indicatessequences
reportedbyA.MartinezandE.F.DeLong,inpreparation].Colouredboxesaroundindividualsequencenameshighlightthosethatgroup
outsideoftheirtaxonomicclade.
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
Proteorhodopsinphotosystemgeneclusters 9
many marine Bacteroidetes (flavobacteria-related) also gene order across distantly related taxa is strongly sug-
encodePR-likegenes.Interestingly,evencommonlycul- gestive of a laterally transferring operon (Koonin etal.,
tivated taxa, such as marine Vibrio or Photobacterium 2001).Phylogeneticanalysisoftheb-carotenebiosynthe-
species, also encompass strains that have acquired this sis crt genes from these clones also indicates that these
photoprotein. Proteorhodopsins are widely distributed geneswereacquiredlaterally.Forexample,anumberof
acrosstheuniversaltreeoflife[e.g.proteorhodopsin-like crt genes (from a planctomycete and an archaeon) were
sequences are present in the eukaryote P.lunula not associated with homologues of their parent taxa, but
(Okamoto and Hastings, 2003), in marine group II eur- rather clustered with those of other PR-containing
yarchaeotes(Frigaardetal.,2006)andwidelydistributed bacteria.Aparallel example is also evident in the retinal
among the eubacteria]. It is likely that the list of taxa biosynthesis gene phylogenies of crtI and crtY from the
containing PRs will continue to expand with continued bacterium S.ruber that grouped with halophilic archaea
explorationofgenomesandcultivars. and not with the other Bacteriodetes, similar to previous
Multiple lines of evidence suggest that proteorhodop- observationsforblh(Mongodinetal.,2005).
sins are relatively frequently transferred between differ- Nearly a third of the PR-containing environmental
ent lineages. For example, proteorhodopsins are found genome fragments from diverse sources reported here
widely dispersed across diverse organisms, and prote- contained PR-linked retinal biosynthetic genes.Although
orhodopsin phylogenies do not generally correspond to notallPRswerelinkedtoretinalbiosynthesisgenes,itis
phylogenies for highly conserved genes. While similar probable these genes reside elsewhere within the
PR sequences are found in quite different organisms, genome. Among the sequenced marine microbial
the converse is also true with closely related organisms genomes that contain a single copy of proteorhodopsin,
possessing disparate PR sequences (Sabehi etal., somehavethecrtgenesimmediatelydownstreamofPR,
2004). Additionally, and similar to the case for the whileothersdonot.Phylogenetictreesfortheb-carotene
archaeal bacteriorhodopsin family (Baliga etal., 2004), biosynthetic genes also provide clues about differential
there are now several instances where a member of the genelinkageandtransferthatappear,inpart,dependent
Bacteria has been shown to possess multiple, divergent on the recipient’s genetic context. Marine Flavobacteria,
rhodopsin sequences. Similarly, clone HF10_41B09 has for example, are capable of producing b-carotene (Ber-
two very different PR sequences arrayed in tandem. It nardet etal., 2002) and their b-carotene biosynthesis
appears that rhodopsin gene duplication and diversifica- enzymesaregenerallyquitedistinctfromtheothermarine
tion, as well as lateral transfer, is a common phenom- proteorhodopsin-containingorganisms.However,thecrtY
ena. The different varieties of PR generated by these generepresentsanexception,andhighlightsthefactthat
processes appear to be maintained by considerable theindividualgeneswithinthispathwaydonotnecessar-
positive selection for their retention and utilization in the ily share a common evolutionary history (Sandmann,
photic zone. 2002).Inthegeneticcontextofab-carotene-synthesizing
Withouttheabilitytoproducethechromophoreretinal, microorganism,onlytwogenes(popandblh)arerequired
lateral acquisition of a PR gene is functionally irrelevant. to assemble a functional PR photosystem. Presumably
To gain selective advantage from PR, a recipient cell the marine Flavobacteria would only need to acquire a
needstoalreadypossessretinalbiosynthesiscapabilities, single gene to synthesize retinal. Consistent with this, in
or to acquire this capacity along with the PR gene. The each of the PR-containing marine Flavobacteria
acquisitionofonlysixgenesisrequiredtogaintheability genomes,whilethecrtgenesarenotlinkedtothePR,a
to harvest biochemical energy from light. This modest blh homologue is present immediately flanking the PR
requirement,forsuchabioenergeticadvantage,seemsto gene. Further, a blh homologue is absent from the
have favoured the widespread acquisition and utilization genome of the marine flavobacterium Leeuwenhoekiella
of the PR photosystem.Aprevious report (Sabehi etal., blandensisstr.MED217,whichpossessesb-carotenebio-
2005)showedthatseveralenvironmentalcloneshadthe synthesisgenessimilartotheothermarineFlavobacteria
genes necessary to biosynthesize retinal, immediately (Figs3and4)yetlacksaproteorhodopsin.
downstream of proteorhodopsin. Many of the environ- Proteorhodopsins are broadly distributed across
mental clones we report here, although from disparate diverse marine microbial taxa. While this distribution is
sources, have a similar chromosomal arrangement, with partly a consequence of LGT, it also probably reflects
the PR gene immediately upstream of the b-carotene a substantial fitness advantage conferred by the
biosynthetic pathway genes and the gene encoding its photosystem.As the enzymatic machinery for synthesiz-
cleavage to retinal. The close linkage of these genes ingacompleteproteorhodopsin-basedphotosystemisso
suggests that they may all have been acquired together simple,requiringonlysixlinkedgenes,thereisareason-
(Wolf etal., 2001).Although mosaic operons have been ablyhighprobabilityoftransferringafullyfunctionalgene
observed (Omelchenko etal., 2003), the conservation of complement. In a single LGT event, a microorganism
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
10 J.McCarrenandE.F.DeLong
might easily acquire the ability to derive energy from transposon insertion library for each clone using the Tn5
sunlight.Asaresult,therelatednessofcoupledrhodopsin vector EZ-Tn5<KAN> (Epicentre, Madison WI, USA). Tn5
andretinalbiosynthesisgenesseemsmorecloselyasso- insertion clones were sequenced in both directions using
KAN-2 FP-1 and KAN-2 RP-1 primers (Epicentre, Madison,
ciated with function and environmental context, than to
WI, USA) using BigDye v3.1 cycle sequencing chemistry
the particular taxonomic relationships of the organisms
and run on anABI Prism 3700 DNAanalyser (Applied Bio-
fromwhichtheyarederived.Asimilartrendisobservedin
systems, Forest City, CA, USA). Sequences were
hypersaline environments where the bacterium S.ruber assembled with Sequencher v4.5 (Gene Codes,AnnArbor,
and halophilic Archaea appear to have exchanged MI, USA) and annotated with both FGENESB (Softberry,
homologous, ‘archaeal’ elements of the same Mount Kisco, NY, USA) and Artemis release 7 (Rutherford
photosystem.Genetransferamongdistantlyrelatedtaxa etal., 2000).
inhabiting the same environment has been proposed as
oneofthesalienttrendsofLGT(Beikoetal.,2005).This Libraryprobing
appearstobethecaseforPR-basedphotosystems,and
North Pacific Subtropical Gyre fosmid library clones from 0,
islikelydrivenbyastrongselectionforenergyacquisition
70 and 130m depths were arrayed on Performa II filters
usinglight,innutrientlimitedenvironments.Atleast10%
(Genetix, Boston, MA, USA) and hybridized with a probe
of all Bacteria and Archaea in oceanic surface waters
targetingacarotenoidbiosyntheticgene.Theprobe,a509bp
appear to contain this photosystem (Sabehi etal., 2005; fragment of a crtY gene, was PCR amplified from clone
Frigaardetal.,2006).Inthecaseofmarineeuryarchaeal HF10_31E03,gelpurifiedandAlkPhoslabelled(Amersham,
genomes,thephotoproteingeneisabsentinEuryarchaea Piscataway, NJ, USA). Following overnight hybridization at
livingbelowtheeuphoticzone(Frigaardetal.,2006).The 60°C, membranes were incubated with ECF substrate
(Amersham)andvisualizedonaFujifilmFLA5100scanner.
proteorhodopsin photosystem appears to represent an
importantfunctionalcomponentinthemarinephoticzone
‘habitat genome’ (defined as the pool of genes that are Sequenceanalyses
beneficial for adaptation to a given set of environmental
BLASTanalyses(Altschuletal.,1997)wereusedtocompare
constraints;Mongodinetal.,2005).Moreover,theidenti-
complete clone sequences with results visualized using the
fication of additional PR-linked genes encoding putative customperlscriptblastView3(J.Chapman©2005)(BLASTN
‘photoactive’ products (DNA photolyase and crypto- settingsX=150,q=-1,F=F)andtheArtemisComparison
chrome from EB0_41B09) suggests potential co-transfer Tool (Carver etal., 2005). Ribosomal proteins sequences
of other light-related genes within the marine euphotic wereextractedfromtheRibAlignDBdatabaseconcatenated
andtheiraminoacidsequenceswerealignedusingRibAlign
zone. Proteorhodopsins are found broadly distributed
software(TeelingandGloeckner,2006).Allotheralignments
among diverse taxa in the microbial plankton, and are
were generated from inferred amino acid translations using
dynamicelementsingeneecology.Itwillbeimportantto
CLUSTALXv1.83.1.Phylogenetictreeswereconstructedusing
definetheirvariousfunctionalrolesandbioenergeticcon- bothneighbour-joiningdistancemethodsandmaximumpar-
tributionstobetterunderstandtheecologicalsignificance simony heuristics, performed with PAUP v4.0b10 (Sinauer
of proteorhodopsin photophysiology in the sea, and Assoc., Sunderland, MA, USA). Tree topologies for both
elsewhere. maximumparsimonyandneighbour-joiningtreesweretested
with bootstrap analyses of 1000 pseudoreplicates using the
TBRbranch-swappingalgorithm.
Experimentalprocedures
Accessionnumbers
FosmidandBAClibraryconstructioncloning
AnnotatedBACandfosmidsequenceshavebeendeposited
Thesamplingandconstructionofthefosmid(DeLongetal.,
to the GenBank/EMBL/DDJB databases under Acces-
2006)andBAC(Bejaetal.,2000b)librarieshavebeenpre-
sion No. EF089397–EF089402, EF107099–EF107106,
viouslydescribed.LargeinsertDNAclonescontainingthePR
EF100190andEF100191.
genes were identified by colony hybridization macroarrays,
as previously described (de la Torre etal., 2003; DeLong
etal.,2006). Acknowledgements
WethankLynneChristiansonforassistancewithmacroarray
Sequencingandannotation hybridization experiments. Special thanks toAsuncion Mar-
tinezforsharingunpublishedsequencedata.Thanksalsoto
Sequencing was performed by shotgun cloning of and the JGI staff for assistance in sequencing. This work was
sequencing of large insert clones, as previously described supported by a grant from the Gordon and Betty Moore
(Hallam etal., 2004; 2006), with the exception of two Foundation (E.F.D.), an NSF Microbial Observatory award
clones: HF10_49E08 and HF10_29C11. These two fosmid (MCB-0348001) and a US Department of Energy Microbial
clones were sequenced by first constructing a random GenomesProgramaward(E.F.D.).
©2007TheAuthors
Journalcompilation©2007SocietyforAppliedMicrobiologyandBlackwellPublishingLtd,EnvironmentalMicrobiology
Description:Proteor- hodopsins (PR) are another diverse clade of type I rhodopsins first identified in an .. including translation elongation factor Tu and RNA poly-.