Table Of ContentAbortive Initiation as a Bottleneck for Transcription in the Early
Drosophila Embryo
Alexander S. Serov∗1,2, Alexander J. Levine1, and Madhav Mani3
1Department of Chemistry And Biochemistry, University of California, Los Angeles, CA 90095,
USA
2Decision and Bayesian Computation Group, Institut Pasteur, 25 Rue du Docteur Roux, Paris,
75015, France
3Engineering Sciences and Applied Mathematics, Northwestern University, 2145 Sheridan Road,
7
Evanston, IL 60208, USA
1
0
2
January 24, 2017
n
a
J
Abstract
1 ical constraints on the kinetics of transcription and re-
2 evaluatesthevalidityofthestandardtotallyasymmetric
Gene transcription is a critical step in gene expression. simple exclusion process model of transcription. Several
h] The currently accepted physical model of transcription ideas on experimental validation of our hypothesis are
p of MacDonald, Gibbs & Pipkin (1969) predicts the ex- also suggested.
- istence of a physical limit on the maximal rate of tran-
o
scription,whichdoesnotdependonthetranscribedgene. Keywords: transcription, Drosophila fly embryonic
i
b Thislimitappearsasaresultofpolymerase“trafficjams” development, pattern formation, k-TASEP, extended
s. forming in the bulk of the 1D DNA chain at high poly- particles
c meraseconcentrations. Recentexperimentshave, forthe
i
s first time, allowed one to access live gene expression dy-
y
namics in the Drosophila fly embryo in vivo under the
h 1 Introduction
conditions of heavy polymerase load and test the predic-
p
[ tions of the model (Garcia et al., 2013). Our analysis
Pattern formation in the developing organism relies on
of the data shows that the maximal rate of transcription
1 precisespatiotemporalcontrolofgeneexpressionGilbert
is indeed the same for the Hunchback, Snail and Knirps
v
(2013). Geneexpressionoccursastheresultofacomplex
9 gap genes, and modified gene constructs in nuclear cy-
dynamical process, the first step of which is transcrip-
7 cles 13, 14, but the experimentally observed value of the
0 maximal transcription rate corresponds to only 40% of tion Alberts et al. (2002). During transcription, a multi-
6 protein polymerase complex moves sequentially down a
the one predicted by this model. We argue that such
0 strand of DNA Alberts et al. (2002) and produces an
a decrease must be due to a slower polymerase elonga-
.
1 tion rate in the vicinity of the promoter region. This mRNA molecule, which, after various downstream pro-
0 cesses, produces a particular protein. While it is clear
effectively shifts the bottleneck of transcription from the
7 in some situations that the observed rate of protein pro-
bulk to the promoter region of the gene. We suggest a
1
duction is selected by the developmental program, that
: quantitative explanation of the difference by taking into
v rate is also subject to physical limits resulting from its
account abortive transcription initiation. Our calcula-
i molecular mechanisms Bialek (2012).
X tions based on the independently measured abortive ini-
Recently, techniques have been developed that allow
r tiationconstantin vitro confirmthishypothesisandfind
a us to track mRNA levels over time and across nuclear
quantitative agreement with MS2 fluorescence live imag-
cycles of a fruit fly early development at the individual
ing data in the early fruit fly embryo. If our explanation
celllevelGarciaetal.(2013),Bothmaetal.(2014,2015).
iscorrect,thenthetranscriptionratecannotbeincreased
Consequently, we have have access to at least parts of
byreplacing“slowcodons”inthebulkwithsynonymous
the fly’s developmental program (as monitored via the
codons, and experimental efforts must be focused on the
transcription rates of a handful of specific genes) from
promoter region instead. This study extends our under-
the first cell division of the fertilized egg through the
standing of transcriptional regulation, re-examines phys-
formation of a complex and spatially patterned embryo.
∗Correspondingauthor. E-mail: [email protected] These data provide a novel window on the unfolding
1
of the fly’s developmental pattern. It is tempting to in- oretical considerations, to be described below. A likely
terpret the spatiotemporal changes in the rate of mRNA explanation of this result is that we have misidentified
transcript production solely in terms of evolutionary op- the rate-limiting step in the mRNA production. One
timization of embryonic development. The development cannot discount the existence of other bottlenecks in
process, however, relies on physical processes, which im- the system, and we note here that abortive initiation of
posetheirownconstraints. Inthismanuscript,weexam- transcription has been independently observed in poly-
ine the mRNA production data for three genes, known merase dynamics in the proximity of the promoter in
to be a part of the head-to-tail patterning systems ac- in vitro setups Revyakin et al. (2006), Margeat et al.
tiveduringearlyflydevelopmentGilbert(2013),withan (2006),Kapanidisetal.(2006). Ineffect,itwasreported
eye towards determining whether the physical dynamics that the polymerase complex gets stuck for some time
of transcription impose limitations on the developmen- at the start of a gene. When we take into account this
tal dynamics. To put this another way: we ask whether aspect of transcription and perform calculations based
theobserveddevelopmentalprogramrunsinadynamical on an independently measured abortive initiation time
regime where it saturates the physical constraints of the constant Revyakin et al. (2006), we find that the traffic
underlying biochemical machinery. jams located at the start of the gene further limit the
There is a distinct reason to imagine that the gene ex- polymerasethroughputandbringthepredictedmaximal
pression machinery does, in fact, saturate some dynam- rate inline with the observed one.
ical upper bounds. On the one hand, there may exist The remainder of this manuscript is organized as fol-
selection pressure on the speed of embryo growth. On lows. First we briefly review the basic physics of traffic
the other hand, there are speed limits to transcription jams in one-dimensional models (see also Schmittmann
inherent in the mechanism itself. In particular, the gene and Zia, 1995, Zia et al., 2011, Chou et al., 2011). We
is a linear array of base pairs that must be read sequen- then introduce a modified model based on our knowl-
tially; the polymerase complex is doing the reading by edge of abortive transcription initiation, and discuss our
stochasticallymakingstepsalongthegene,andwhiledo- calculations and simulations. The modified model is a
ing so, it takes up a finite-length footprint on it (Fig. 1). special case of what is known in the literature as an
A number of such polymerase complexes may simulta- inhomogeneous TASEP model with slow (or defective)
neously run along the gene Garcia et al. (2013), that sites Zia et al. (2011), Dong et al. (2009), Chou and
in a first approximation can be assumed to interact only Lakatos (2004). In the next section we present the data
throughstericrepulsion(excludedvolume). Duringmax- and show that these cannot be satisfactory explained by
imal protein production, the number of such complexes theoriginalmodel,butarequantitativelyconsistentwith
is large enough (up to 100, Garcia et al. (2013)) that a transcriptional bottleneck associated with the initia-
their combined footprint takes up a significant fraction tion of the transcription process observed in the modi-
of the gene’s length (up to 80%, Garcia et al. (2013)). fiedmodel. Intheconclusionswediscusstheimplication
At such high coverages, one might well expect that the ofthedemonstratedeffectfordevelopmentalbiologyand
steric interactions between stochastically moving poly- forthestatisticalphysicsofnon-equilibriumsystems,and
merasecomplexessetamaximalrateforgeneexpression. propose several directions for future research.
These basic dynamics are reminiscent of traffic jams ob-
servedoncongestedroadways,andhasbeenwellstudied
2 The statistical mechanics of
in a variety of contexts Schu¨tz (2001), Schmittmann and
Zia (1995), as we describe below. traffic jams
Here we use earlier developed models to determine
from theoretical considerations alone what that maxi- Recognizing that the collective dynamics of many poly-
mal mRNA production rate should be and ask whether merases walking on DNA can be thought of as that
this rate is indeed observed in the data. It is reason- of processive stochastic walkers with simple steric (ex-
able to imagine that Drosophila embryonic development cluded volume) interactions, we briefly review this long-
is precisely the right place to look for nature to saturate standing model of non-equilibrium statistical physics.
this limiting value of protein production, since optimal The physics of this one-dimensional model provides per-
production likely leads to the most efficient method for haps the simplest example of non-equilibrium steady
rapidly making a fly. Our findings are, at first, perplex- states in a strongly interacting many-body system. This
ing. We indeed find that for three different genes that model, originally proposed by MacDonald and collabo-
are transcribed during development, the rates of tran- rators MacDonald et al. (1968), MacDonald and Gibbs
scriptional initiation production reach the same maxi- (1969) and commonly referred to as a Totally Asymmet-
mal value (within 15% of each other), suggesting the ric Simple Exclusion Process (TASEP), has been stud-
presence of a physically-imposed production speed limit ied as part of a general interest in the physics of driven
that developmental biology saturates, at least in these diffusive systems Schmittmann and Zia (1995) and as a
extreme circumstances. The observed speed limit, how- modelforbiologicallyrelevantdynamicsZiaetal.(2011),
ever, is only around 40% that predicted by basic the- Chowdhury et al. (2005), Frey et al. (2004), with the ap-
2
Abortive transcription initiation makes particles Jump suppressed if Without steric hindrance particles Incremental exit with
remain on the first site for an average time τ next site is occupied jump 1 site at a time with rate k no steric hindrance
Particle of length ℓ
Complete
entry rule k β=k
α
Figure 1: Schematic representation of the abortive k-TASEP model with extended particles of length (cid:96). The abortive
k-TASEPmodelfeaturesanadditionalslowfirstsite(showninred)comparedtothestandardk-TASEPmodel. From
the biological point of view, this model represents sequential elongation of polymerase complexes with footprint (cid:96)
along a gene of length L
plicationstoribosomemotionalongmRNA.Wereferthe steadystatesfromthebetterunderstoodthermodynamic
interested reader to Refs. Schmittmann and Zia (1995), phases of equilibrium states. A mean-field analysis of
Ziaetal.(2011),Chouetal.(2011)forextensivereviews the steady states leads to the prediction of three dis-
of the literature on TASEP. tinct phases Schu¨tz (2001). For α,β sufficiently small,
In brief, it is essential to recall that the basic prob- one finds a stable shock front where the particle den-
lem is defined by a particle injection rate α at the start sity jumps from a bulk value defined by the bulk jump
(e.g., the left end of the one dimensional track on which rate k to the one controlled by the boundary extrac-
the particles move) and an extraction rate at the right tion/injection rates α, β. The position of this stable
end (Fig. 1). The injection rate α may be interpreted shock within the system depends on the relative values
as the attempt frequency to add a particle to the left, of α and β. For sufficiently high values of both α and
which is successful only when that space is unoccupied. β, one observes a different phase, the maximal current
Theextractionrateβ maybesimilarlyinterpreted. Once regime. For α > β (β > α) there is also a high (low)
injected, particles move stochastically to the right. We density phase in which the current through the system
will consider only the case of Poisson walkers taking sin- isbelowthemaximalvalue. Thesemean-fieldresultsare
gle steps on a lattice so that, without interparticle in- supported by exact solutions to the steady-state density
teractions, a single stepping rate k determines both the profileobtainedbyLiggettLiggett(1975)andbynumer-
mean velocity of the walker and all its velocity fluctua- ical simulations.
tions van Kampen (1992). We also restrict our analysis Inthecaseofextendedparticles(thek-TASEPmodel),
to the case in which particles cannot be added or re- theconclusionsofthemean-fieldapproachstayvalid,but
moved in the bulk (i.e. no Langmuir kinetics), although additional factors appear in now approximate formulas
novel dynamics have been studied in this case Parmeg- for the mean elongation rate s and particle density ρ.
giani et al. (2003). This restriction is demanded by the For instance, in the low-density (LD) phase for low in-
experimentally observed processivity of the polymerase jection attempt rate α and high exit attempt rate β one
complexontheDNA.Themodelcontainstwomorecon- obtainsMacDonaldandGibbs(1969),Shawetal.(2003):
trol parameters: the length of the track L (i.e., the total
α α(k−α)
number of base pairs in the gene) and the length of the
ρ (α)= , s (α)= .
walking particles (cid:96) (i.e., the number of lattice sites taken LD k+α((cid:96)−1) LD k+α((cid:96)−1)
up by one walker). In most TASEP studies the walkers (1)
are assumed to be “point particles” (taking up one site)
When the injection attempt rate α increases to its crit-
on a lattice constant in length, but, as we will see be- √
ical value χ = k/(1+ l), while β stays high (β (cid:62) χ),
low, the finite length of the polymerase complex plays a
thesystemexperiencesaphasetransitiontothemaximal
significant role in understanding the collective dynamics
current phase that is characterized by the maximal par-
of transcription in the developing embryo and cannot be
ticle density ρ and the maximal particle elongation
ignored. max
rate s :
The main phenomenology of TASEP dynamics of max
point-like particles can be characterized by a phase di-
1 k
agram defined by the injection and extraction rates α ρmax = √ √ , smax = √ . (2)
(cid:96)(1+ (cid:96)) (1+ l)2
and β respectively Schu¨tz (2001), Zia et al. (2011). It is
remarkable that this one-dimensional system with short- For more details on the phase diagram of the k-TASEP
ranged interactions can develop long-range order such model, the reader is referred to Refs Shaw et al. (2003),
that the bulk properties of the system can be controlled Schu¨tz (2001). Assuming the footprint of a polymerase
by these boundary conditions at the endpoints. This is complex on the DNA is (cid:96) = 45 ± 5 base pairs (bp)
one of the many features distinguishing non-equilibrium long (as determined by electron crystallography, (?Selby
3
et al., 1997, Poglitsch et al., 1999)), and its bulk jump and numerical results have been obtained for point par-
rate on the DNA is around k ≈ 26 ± 2bp/s (Garcia ticles (see Cook et al., 2013, Poker et al., 2015, Dhi-
et al., 2013), in the maximal current regime, the theory man and Gupta, 2016, Xiao et al., 2016, and references
predicts the maximal polymerase injection rate s ≈ therein), as well as for extended particles Shaw et al.
max
0.44±0.05pol/s≈26±3pol/min. Newpolymerasecom- (2003, 2004b,a), Dong et al. (2007), Klumpp and Hwa
plexes cannot be recruited on the given gene faster than (2008), Dong et al. (2009), Zia et al. (2011), Brackley
this rate, which is defined by how fast the injected parti- et al. (2011). The standard way to treat a problem with
cleclearstheinjectionsite(thefirst(cid:96)sitesofthelattice). an individual slow site like ours would consist in using
On a gene L = 5400bp base pairs long (like Hunchback mean-field approximations for the sublattices to the left
gene,(Garciaetal.,2013)),themaximalnumberofpoly- andtotherightoftheslowsiteandstitchingthetwoso-
merases is N ≡Lρ ≈104±11pol. lutionstogetherbythefluxconservationequationonthe
max max
slow site Kolomeisky (1998), Shaw et al. (2004a). How-
ever, when the only slow site is located at the beginning
3 Abortive k-TASEP Model of the lattice, the problem can be easily reduced to the
standard k-TASEP model (a similar problem with point
In this paper we suggest that abortive initiation of tran- particles was recently considered by Xiao et al. (2016)).
scription can be the rate-limiting step of transcription Indeed, the slow site can be considered as a boundary
and should be added to the classical k-TASEP model, condition for the remaining L − 1 sites of the lattice
when used to explain experimental data in early embryo that can be treated as an ordinary (L−1)-site k-TASEP
development. One can expect that in such model the model. Ignoring entry-rule-related boundary effects, one
bottleneck of the transcription process will be located in can then, in the first approximation, define the injection
the promoter region at the 5’ end of the gene, rather attemptrateαoftherestofthelatticethroughthepause
than in its bulk as in the original model. Abortive ini- time τ at the first site: α = 1/τ. For long genes L (cid:29) 1,
tiation is an effect wherein polymerase complexes loaded onecansafelyassumeL−1≈LandusingEq.(1),(2)get
on a promoter region of a gene will several times en- the following expressions for the normalized polymerase
gage in transcription cycles, but will abort and release number in the steady state and normalized polymerase
shortRNAproductsreturningtotheinitialposition(Ka- elongation rate:
panidisetal.,2006,Revyakinetal.,2006,Margeatetal., √ √
2006). However once a product of 9 to 11 nucleotides is (cid:96)(1+ (cid:96))
N /N = ,
synthesized, the polymerase will break its interactions ss max kτ +(cid:96)−1
√ (3)
with the promoter and will proceed with mRNA synthe- (kτ −1)(1+ (cid:96))2
s /s = .
sis. It is currently accepted that the transition from the LD max kτ(kτ +(cid:96)−1)
abortive initiation state to processive RNA synthesis oc-
curs through a “DNA scrunching” mechanism, when the These dependencies are shown in Figs. 2a,b as solid
mechanical stress stored in the “scrunched” DNA dur- red lines for the biologically relevant range of values
ingabortivetranscriptionallowsthepolymerasetobreak ofτ =1–10s. Notethatthenormalizedpolymerasenum-
interactions with the promoter. However other mecha- ber N and the normalized injection rate s decrease
ss LD
nisms (namely, “transient excursions”, “inchworming”) from N /N ≈ 0.74 and s/s ≈ 0.82 for τ = 1s
ss max max
have been proposed as well (Kapanidis et al., 2006). to N /N ≈0.17 and s/s ≈0.19 for τ =10s. For
ss max max
To assess the effect of abortive initiation on the tran- theexperimentallyreportedvalueofτ ≈5±1sRevyakin
scription rate, we modify the standard k-TASEP model et al. (2006), the corresponding ratios are N /N ≈
ss max
by adding a slow site at the beginning of the lattice. 0.30 and s/s ≈0.34.
max
This modification will cause the polymerase to stochas- Note that while N /N describes the steady-state
ss max
tically pause at the first site imitating the effect ob- of the system, the expression for s /s describes
LD max
served in the experiment. Experimental studies have the current related to a shock wave of density propa-
revealed that the time spent by a polymerase in the gating in the system before the steady state is estab-
abortive initiation stage in the promoter region is an lished. The shock-wave description Schu¨tz (2001) is pos-
exponentially-distributedrandomvariablewiththeaver- sible thanks to the fact that the interface separating
agepausetime τ ≈5±1s(Revyakinetal.,2006), which two distinct phases in the TASEP model is microscopic,
we will use in the simulations below. In what follows, i.e. the transition occurs on a scale of several lattice
we will call this modification of the k-TASEP model an sites Janowsky and Lebowitz (1992, 1994). We verify
abortive k-TASEP model. that the shock-wave approach and the approximate for-
The proposed abortive k-TASEP model is a special mulas (3) ignoring boundary-layer effects correctly de-
caseofaninhomogeneousk-TASEPmodelwithonlyone scribeourabortivek-TASEPmodelbyperformingMonte
slow site located at the first site of the lattice. The Carlo simulations with the numerical values of parame-
inhomogeneous TASEP model has attracted significant ters (cid:96), L and k provided above. We use the “complete
interest in recent years, and some important analytical entry, incremental exit” rule implying that polymerases
4
80
0.8
0.7 = 1.0 s
= 3.0 s
0.7 = 4.0 s
0.6 60 = 5.0 s
N = 7.0 s
0.6 ss = 9.0 s
0.5 = 10.0 s
0.5 40
0.4 max
s
0.4
0.3
20
0.3
0.2
0.2
0
2 4 6 8 10 2 4 6 8 10 0 2 4 6
(s) (s) t (min)
(a) (b) (c)
Figure 2: Numerical simulations of the maximal polymerase number in the steady state N (a), the polymerase
ss
injection rate s (b), and the evolution of polymerase number N (c) as functions of the mean time τ spent in
pol
the abortive initiation regime. Each circle in (a),(b) and each curve in (c) represent the average of 20 simulations;
the dashed black line marks the maximal loading rate s ≈ 26±3pol/min predicted by the original k-TASEP
max
model(Eq.(2)). Independentexperimentsreportthatthemeandurationoftheabortiveinitiationphaseisaroundτ ≈
5±1s (Revyakin et al., 2006), which gives N /N ≈ 0.30 and s/s ≈ 0.34, or N ≈ 31 and s ≈ 8.8pol/min in
ss max max ss
theabsolutevalues. Blackarrowsin(c)indicatetheestablishedsteadystateofthesimulationwhereN isevaluated.
ss
s corresponds to the slope of the curves in (c), with s labeled in the plot. Eq. (3) and plots (a) and (b) can be
max
used to assess the mean time τ polymerases spend in the abortive initiation regime based on experimental data for
each gene, nc and construct. For instance, for the Hunchback bac construct (Fig. 3), one gets τ ≈ 2.0s in nc 13
and τ ≈ 2.8s in nc 14 from the mean N , and τ ≈ 2.9s in nc 13 and τ ≈ 4.3s in nc 14 from the mean injection
ss
rate s (dashed green lines).
4 Comparison to Experimental
are loaded on the gene only if the first (cid:96) sites are free,
while they leave the gene one site at a time with the Data
bulk rate k (Fig. 1) (Lakatos and Chou, 2003). To fo-
cus on the effect of the first slow site, we set the poly-
merase injection rate p to the value p = 100s−1 much
In the present section we compare the predictions of the
greater than the inverse time a polymerase spends on
abortivek-TASEPmodeltoexperimentalresultsinferred
the first site of the lattice p (cid:29) 1/τ for all τ in the
fromMS2fluorescencedataafterFISHcalibration,asde-
range 1–10s. The slow site is modeled as a stochastic
scribedin(Garciaetal.,2013). Eachdataentry(fluores-
process, in which at each time step ∆t a polymerase can
cence trace) contains the number of RNA polymerase II
eitherremainatthefirst(cid:96)sitesorcontinuetranscription
complexes N(t) currently loaded (transcribing) on a se-
with a certain finite probability. The escape probabil-
lectedgeneasafunctionoftimetrecordedinonenucleus
ity p =∆t/τ (the probability to leave the slow site) is
esc ofoneDrosophilaflyembryo. Thewholeobservationpe-
calculated based on the chosen values of ∆t and τ. The
riod is divided into nuclear cycles (nc), at the end of
simulationswereperformedwiththerejection-freekinetic
each of which the number of cells in the embryo dou-
MonteCarloalgorithm(KMC,anevent-basedalgorithm,
bles, and all polymerase complexes are unloaded from
alsoknownasresidence-time orBortz-Kalos-Lebowitz al-
the gene. We define the time point t = 0 to correspond
gorithm, see (Bortz et al., 1975)). The theoretical pre-
to the start of nc 10 (Garcia et al., 2013). In this anal-
diction (3) is found to perfectly describe simulation re-
ysis we will consider only the nuclear cycles 13 (starting
sults(shownasbluecirclesinFigs2a,b),whichvalidates
around t≈35min) and 14 (starting around t≈50min),
our analytical approach. Fig. 2c completes the simula-
since the genes we analyze are mainly expressed in these
tionresultsillustratingthemeanevolutionofthesystem
nuclear cycles.
from the non-steady state to the steady state.
The following three genes were considered in these
study: Hunchback (Hb), Snail (Sn) and Knirps (Kn).
All considered genes have one primary and one shadow
enhancer that influence the spatio-temporal dynamics of
5
Table1: Numberoffluorescencetracesanalyzedforeach
gene-enhancer construct combination. The number of nc 13 nc 14
embryos is given in the parentheses
0.03 0.025
0.02
Hunchback Snail Knirps FD0.02 FD0.015
P P
0.01
Bac 1965 (11) 547 (5) 326 (5) 0.01
0.005
No primary 979 (9) 607 (4) 84 (2)
0 0
No shadow 1279 (8) 159 (1) 60 (1) 0 50 100 0 50 100
Max. polymerase number N Max. polymerase number N
ss ss
(a) (b)
expression of these genes. The data presented below nc 13 nc 14
were obtained for each of these genes in three modifi- 0.03 0.025
cations: wild-type (labeled bac), a gene with no primary 0.02
enhancer (no primary) and a gene with no shadow en- FD0.02 FD0.015
P P
hancer (no shadow). Table 1 summarizes the number of 0.01
0.01
embryos and fluorescence traces analyzed for each gene- 0.005
enhancer construct combination. The time resolution of 0 0
0 50 100 0 50 100
the measurements is 37s. Max. polymerase number N Max. polymerase number N
ss ss
Figs 3c,d show the distribution of polymerase loading
(c) (d)
rates s as measured in our experiments for bac Hb in
nc 13 and 14. The maximal theoretical polymerase in- Figure 3: (a),(b): Probability density function (PDF) of
jection rate smax ≈26±3pol/min predicted by the orig- thesteady-statenumberofpolymerasesNssloadedonthe
inal k-TASEP model is marked by dashed black lines. HunchBack bac gene in nuclear cycles 13 (a) and 14 (b)
One can see that s appears to envelope all observed (allAPpositionscombined). Alldataentriesaregrouped
max
experimental values, suggesting that Hb transcription in bins of width 5, and the y-axis is proportional to the
in nc 13 and 14 cannot occur faster than s . The number of fluorescence traces falling in the current bin.
max
steady-state polymerase number Nss for the Hunchback The theoretically predicted upper limit Nmax ≈ 104±
bacgeneisshowninFigs3a,b,withthek-TASEPpredic- 11pol is shown by a dashed black line. The mean values
tion Nmax ≈ 104±11pol shown by a dotted black line. of the distributions in the figure are: Nss = 56±12pol
Once again we see that Nmax provides a correct over- (nc 13) and Nss = 46 ± 13pol (nc 14). (c),(d): PDF
all limit for all observed experimental values, suggesting of initial polymerase injection rates s for the HunchBack
that the original k-TASEP model indeed defines the up- bacenhancerconstructinnuclearcycles13(c)and14(d)
per boundary on the number of polymerase complexes (all AP positions combined). All slopes are grouped in
loaded on the gene in nc 13 and 14. Moreover, Figs 4 bins of width 1, and the y-axis is proportional to the
and 5 demonstrate that the actual mean transcription number of fluorescence traces falling in the current bin.
ratesdoesnotdependonaparticulargene,constructor The dashed vertical line shows the maximal polymerase
location of the nucleus on the anterior-to-posterior (AP) injectionratesmax ≈26pol/minaspredictedbyEq.(2).
axis of the embryo, suggesting gene-independent nature The mean values of the shown distributions are: s =
of the transcription rate constraint. 13pol/min (nc 13) and s = 10pol/min (nc 14). In all
It is however clear that the mean values of the dis- figuresthetotalnumberoftracesshownis682fornc13,
tributions in Fig. 3 lie well below the predictions of and1283fornc14. OnecanseethatbothNmaxandsmax
the unmodified k-TASEP model MacDonald and Gibbs correctly predict the maximal observed values, although
(1969), Shaw et al. (2003). The histograms in Fig. 3 theaveragevaluesarelower. Thefewoutliersabovesmax
clearlydemonstrateadownshiftfromthepredictednum- and Nss are likely due to noisy experimental data
bers, indicating that on average the 1D DNA chain is
used significantly lower than its maximum polymerase
flow capacity for a homogeneous lattice (with, for in- of τ ≈ 5±1s (Revyakin et al., 2006) predicts a simi-
stance, N /N ≈ 0.48 and s/s ≈ 0.44 for Hb lar decrease: N /N ≈ 0.30 and s/s ≈ 0.34. The
ss max max ss max max
bac, Fig. 3). In this paper we interpret this decrease slight difference between the predicted and the observed
as that on average no traffic jams are observed in the valuesofN /N ands/s maybe(besidesotherrea-
ss max max
bulk of the DNA chain for the selected genes and con- sonssuchasbiologicalvariability)attributedtoaslightly
structs, and that for them abortive transcription initia- different value of τ for Drosophila embryos in vivo com-
tionistherealbottleneckoftranscription(otherinterpre- pared to previous in vitro experiments in Ref. Revyakin
tationsarealsopossible,andwereturntothisdiscussion et al. (2006). If one assumes the validity of the abortive
later). Our speculations are supported by the abortive k-TASEP model, the distributions in Fig. 3 can be ex-
k-TASEP model that for independently measured value plained by the following values of τ: τ ≈ 2.0s for nc 13
6
Hb
60
0.7
50
0.6
40
0.5
N30 x
a0.4
m
20 s
/s 0.3
10
0.2 nc 13, bac
nc 13, no pr
0 nc 13, no sh
0.1 nc 14, bac
0 5 10 15 nc 14, no pr
Time since the start of a nuc. cyc. (min) nc 14, no sh
0
0.2 0.3 0.4 0.5
Figure 4: Mean evolution of active polymerase number AP position
ontheHunchback(blue),Snail(yellow)andKnirps(red)
genes. Each curve corresponds to a specific con- Figure 5: Initial injection rate s as a function of the
struct (bac, no primary or no shadow) of a particular AP position for different constructs (bac, no primary,
gene (Hunchback, Snail or Knirps) in one of the nu- no shadow) of the Hunchback gene. The error bars
clear cycles (13 or 14) and is averaged over the anterior- show 95% confidence intervals as estimated by boot-
posterior (AP) position in the embryo. The curves are strapping (Efron, 1979). Each curve was slightly shifted
combined in the same plot to demonstrate that the ini- sidewaystoallowallerrorbarstobevisible. Thedashed
tial injection rate s does not depend on any of these line shows the injection rate predicted by the abortive
parameters (some noise is present). The thick black TASEP model (s/smax ≈ 0.34) for τ = 5s. This predic-
curve shows the average over all the parameters. The tion fits well the experimental data lying slightly below.
dashedstraightlineshowsthemaximaltheoreticalinjec- Narrow AP windows and low number of data sets avail-
tionrates ≈26pol/minaspredictedbythestandard able to us for the Snail and Knirps genes (Table 1) did
max
k-TASEP model (Eq. (2)). The solid straight line shows not allow us to draw any reliable conclusions on the de-
the initial injection rate s/smax ≈ 0.34 as predicted by pendence of s/smax on AP for those genes
the abortive k-TASEP model for τ ≈5s. Eqs (3) can be
used to adjust the τ parameter and fit the initial slope
measuredvalues. Itcanbeseenthatexperimentaldistri-
more accurately (see main text). All fluorescence traces
butionsareabout3timeswiderthanthecalculatedones,
used throughout this article required synchronization of
which suggests that the noise sources included into the
thebeginningofnuclearcyclesbetweendifferentembryos
numerical abortive k-TASEP model (namely, stochastic
and nuclei. The desynchronization is due to recording
elongation and stochastic abortive initiation duration)
started at different moments in different embryos and to
are responsible for only about one third of the variabil-
the limited temporal resolution of the data. All traces
ityoftheexperimentalvalues. Othernoisesources(such
shorter than 10mins were also removed, since we could
as stochastic promoter activation, heterogeneous or dy-
notreliablysynchronizethemwiththeothertraces. The
namic elongation rates, instrumental noise or slope fit-
steadystateofthesystemseemstobeevolvingwithtime,
ting)mustaccountfortherestofthewidthofthedistri-
showingagradualdecreaseintheactivepolymerasenum-
butions(forfurtherdiscussionofnoisesourcesinprotein
ber. Under these conditions, we have taken the peak
synthesis see (Tkacik et al., 2008)).
number of polymerases as an estimate of N in all pre-
ss
sented data
5 Discussion
andτ ≈2.8sfornc14fromthepolymerasenumberN , Our analysis of transcriptional dynamics in the
ss
and τ ≈ 2.9s for nc 13 and τ ≈ 4.3s for nc 14 from the Drosophilaflyembryoexhibitsacommonvalueofsteady-
meaninjectionrates. Thesevaluesareshownbydashed state polymerase number N and polymerase injection
ss
green lines in Figs 2a,b. rate s in the beginning of nuclear cycles 13 and 14 for
To finalize the comparison between the simulations several analyzed genes and constructs. Although such
and the experimental data, it is interesting to compare universal character can be explained by the standard k-
the widths of the calculated distributions for N and s TASEP model, the observed values of N and s are sig-
ss ss
to the widths of experimental distributions in Fig. 3. nificantly lower than the theoretically predicted ones. In
Fig.6showsstandarddeviations(STD)ofN /N and the present article we have demonstrated that this dif-
ss max
s/s calculated for different values of mean abortive ference in slopes can be explained by the abortive ini-
max
initiation time τ and plotted along the experimentally tiation process that effectively decreases the polymerase
7
0.2
0.12
0.15
0.1
0.08 Simulation Simulation
Data nc 13 0.1 Data nc 13
0.06 Data nc 14 Data nc 14
0.04
0.05
0.02
0 0
2 4 6 8 10 2 4 6 8 10
Figure 6: Relative standard deviation of the steady state number of polymerases (δN /N ) and normalized initial
ss max
injection slope (δs/s ) calculated numerically as a function of mean abortive transcription initiation duration τ.
max
ExperimentallycalculatedvaluesfortheHunchbackbacconstructareshownbydashedhorizontallines: δN /N ≈
ss max
0.115 (nc 13), δN /N ≈ 125 (nc 14), δs/s ≈ 0.16 (nc 13), δs/s ≈ 0.16 (nc 14). The experimental lines
ss max max max
show a constant value, since τ dependence was not available from the experiment. Note that the noise level of the
experimental data is much higher than the noise predicted by the numerical model, which suggests the existence of
additional noise sources in experimental data (see main text)
elongation rate in the promoter region of the gene. We thus modulating the elongation speed. It is interesting
have modified the k-TASEP model adding a slow first to mention here that, for example, for some of the pro-
site, andprovidedsimpleformulasandperformedMonte teinsofEscherichiacoli,theslowcodonswerereportedto
Carlo simulations to estimate N and s in it. Our re- be preferentially located within the first 25 codons Chen
ss
sults for independently measured abortive transcription and Inouye (1990). Regardless of the exact distribution
initiation duration τ ≈5s give values much closer to the of slow sites, an important practical conclusion of this
experimentalones, andsuggestthatthepromoterregion study is that if the bottleneck of transcription is indeed
may be the real bottleneck of the transcription process located in the promoter region, the overall transcription
ratherthanpolymerase“trafficjams”downstreamofthe rate cannot be increased by modifications in the bulk of
gene. the gene, such as replacement of slow codons with syn-
It is however important to mention that our hypothe- onymousonesShawetal.(2004b),Ziaetal.(2011),Shaw
sis is not the only possible explanation for the observed et al. (2003). Experimental research of higher transcrip-
phenomenon. It seems to be impossible to explain the tion rates must then focus on acceleration of the initi-
observed decrease in s/s and N /N without slow ation dynamic or on substitution of slow codons in the
max ss max
sites near the start of the gene, but the distribution of promoter region of the gene.
slow sites may take different forms. We have only ana- From the physical point of view, it seems to be impos-
lyzedthesimplestoptionofonlyonefirstslowsite. Other sible to distinguish different configurations of slow sites
options may include several distributed slow sites, clus- having only access to the steady-state dynamics of the
ters of slow sites Chou and Lakatos (2004), or inhomo- experimental system. However, one can probably search
geneousjumprates(quencheddisorder)nearthestartof for the answer in the non-steady state dynamics, or the
thegeneShawetal.(2004b),Ziaetal.(2011),Shawetal. slowevolutionofthesteady-statesseeninFig.4. Forin-
(2003). Interestingly, the effect of quenched disorder on stance, ifslowsitesaredistributedalongthewholegene,
theparticlecurrentintheTASEPmodelhasbeenshown the propagating density wave must slow down each time
tomainlyreducetotheeffectofseveralslowestsitesShaw a new, slower site is discovered. Surprisingly, the curves
etal.(2004b,2003). Althoughthephysicaldescriptionof in Fig. 4 don’t seem to exhibit this gradual or stepwise
systemswithdifferentdistributionsofslowsiteswouldbe slow-down,suggestingagainthattheslowestsitesarelo-
similar,thebiologicalprocessesbehindtheslowsitesmay cated near the promoter region. This feature, as well as
be completely different. For instance, the quenched dis- a local decrease of transcription speed around 1–2mins
order model describes the fact that different nucleotides into nc 13 (Fig. 4, middle pannel), or the gradual de-
forthebuildingmRNAchainmayhavedifferentconcen- creaseofpolymerasedensityafterthemaximumhasbeen
trations in the cell, so that the polymerase complex has reached (Fig. 4), cannot be explained by the abortive k-
to wait longer for some nucleotides than for the others TASEP model, and require further investigation.
8
Fromthepointofviewofmolecularbiology, onecould ofevestripe2expressionrevealstranscriptionalbursts
conceive additional experiments aimed at identification in living Drosophila embryos. Proc. Natl. Acad. Sci.
of the role of the abortive transcription process in the U. S. A., 111(29):10598–603. http://dx.doi.org/
decrease of the transcription rate. On the one hand, 10.1073/pnas.1410022111.
if one could alter the molecular mechanisms behind the
Bothma, J. P., Garcia, H. G., Ng, S., Perry, M. W., Gre-
abortive initiation and slow it down, the corresponding
gor,T.,andLevine,M.,2015. Enhanceradditivityand
decrease in the transcription rate and steady-state poly-
non-additivityaredeterminedbyenhancerstrengthin
merase number could be observed experimentally and
the Drosophila embryo. Elife, 4(AUGUST2015):1–14.
compared to the predictions of Eq. (3) for the new val-
http://dx.doi.org/10.7554/eLife.07956.001.
ues of τ. If on the other hand, the abortive initiation
could be accelerated, at some point one can expect that
Brackley, C. A., Romano, M. C., and Thiel, M., 2011.
the transcription bottleneck may shift to other involved
TheDynamicsofSupplyandDemandinmRNATrans-
mechanisms, so that the transcription rate may be de-
lation. PLoS Comput. Biol., 7(10):e1002203. http:
termined by polymerase “traffic jams” in the bulk of the
//dx.doi.org/10.1371/journal.pcbi.1002203.
gene or, for instance, promoter activation kinetics. In
either case, the curves in Fig. 4 must reflect the changes. Chen, G. F. T. and Inouye, M., 1990. Suppression of
We have also compared the width of the distributions the negative effect of minor arginine codons on gene
of Nss/Nmax and s/smax in our simulations and in the expression; preferential usage of minor codons within
experimental data. Much wider experimental distribu- the first 25 codons of the Escherichia coli genes. Nu-
tions indicate that the noise sources taken into account cleic Acids Res., 18(6):1465–1473. http://dx.doi.
in our abortive k-TASEP model cannot satisfactory ex- org/10.1093/nar/18.6.1465.
plain the full variability of experimental data, and that
additional noise sources should be investigated in future Chou, T. and Lakatos, G., 2004. Clustered bottlenecks
studies. As an interesting development, one could try to in mRNA translation and protein synthesis. Phys.
extract additional information about the transcriptional Rev. Lett.,93(19):1–4. http://dx.doi.org/10.1103/
bottlenecks and noise sources of the system by analyz- PhysRevLett.93.198101.
ing the noise, identifying noise signatures seen in the
Chou, T., Mallick, K., and Zia, R. K. P., 2011. Non-
experimental data and shapes of the experimental dis-
equilibrium statistical mechanics: from a paradig-
tributions (i.e., in Fig. 3) and comparing them to those
matic model to biological transport. Reports Prog.
predictedbytheabortivek-TASEPmodelorothernoise
Phys., 74(11):116601. http://dx.doi.org/10.1088/
sources.
0034-4885/74/11/116601.
Finally, the question of whether the system of poly-
merases on the DNA actually reaches a steady non- Chowdhury, D., Schadschneider, A., and Nishinari, K.,
equilibrium state under physiological conditions of nc 13 2005. Physics of transport and traffic phenomena in
and 14 also requires additional investigation. For in- biology: From molecular motors and cells to organ-
stance, Fig. 4 is inconclusive about whether the tran- isms. Phys. Life Rev., 2(4):318–352. http://dx.doi.
scriptionratestaysconstantforsometimeafterreaching org/10.1016/j.plrev.2005.09.001.
thepeakvalueandbeforestartingtodecrease. Ifitisnot
thecase,thevalidityofthesteady-statek-TASEPmodel Cook, L. J., Dong, J. J., and Lafleur, A., 2013. In-
asamodeloftranscriptionunderphysiologicalconditions terplay between finite resources and a local defect in
should be critically re-evaluated. an asymmetric simple exclusion process. Phys. Rev.
E - Stat. Nonlinear, Soft Matter Phys., 88(4):1–8.
http://dx.doi.org/10.1103/PhysRevE.88.042127.
References
Dhiman, I. and Gupta, A. K., 2016. Origin and dynam-
Alberts,B.,Johnson,A.,Lewis,J.,Raff,M.,Roberts,K., ics of a bottleneck-induced shock in a two-channel ex-
and Walter, P., 2002. Molecular Biology of the Cell. clusion process. Phys. Lett. Sect. A Gen. At. Solid
Garland Science, New York, 4 edition. State Phys.,380(24):2038–2044. http://dx.doi.org/
10.1016/j.physleta.2016.04.031.
Bialek, W., 2012. Biophysics: Searching for Principles.
Princeton University Press, Princeton, Oxford.
Dong, J. J., Schmittmann, B., and Zia, R. K. P., 2007.
Bortz, A., Kalos, M., and Lebowitz, J. L., 1975. A Inhomogeneous exclusion processes with extended ob-
new algorithm for Monte Carlo simulation of Ising jects: The effect of defect locations. Phys. Rev. E -
spin systems. J. Comput. Phys., 17(1):10–18. http: Stat.Nonlinear,SoftMatterPhys.,76(5):31–33.http:
//dx.doi.org/10.1016/0021-9991(75)90060-1. //dx.doi.org/10.1103/PhysRevE.76.051113.
Bothma, J. P., Garcia, H. G., Esposito, E., Schlissel, G., Dong, J. J., Zia, R. K. P., and Schmittmann, B.,
Gregor, T., and Levine, M., 2014. Dynamic regulation 2009. Understanding the edge effect in TASEP with
9
mean-field theoretic approaches. J. Phys. A Math. MacDonald, C. T. and Gibbs, J. H., 1969. Concerning
Theor., 42(1):015002. http://dx.doi.org/10.1088/ thekineticsofpolypeptidesynthesisonpolyribosomes.
1751-8113/42/1/015002. Biopolymers, 7(5):707–725. http://dx.doi.org/10.
1002/bip.1969.360070508.
Efron, B., 1979. Bootstrap Methods: Another Look at
the Jackknife. Ann. Stat., 7(1):1–26. http://dx.doi. MacDonald, C. T., Gibbs, J. H., and Pipkin, A. C.,
org/10.1214/aos/1176344552. 1968. Kinetics of biopolymerization on nucleic acid
templates. Biopolymers, 6(1):1–25. http://dx.doi.
Frey, E., Parmeggiani, A., and Franosch, T., 2004. Col- org/10.1002/bip.1968.360060102.
lective phenomena in intracellular processes. Genome
informatics, 15(1):46–55. http://dx.doi.org/10. Margeat, E., Kapanidis, A. N., Tinnefeld, P., Wang, Y.,
11234/gi1990.15.46. Mukhopadhyay, J., Ebright, R. H., and Weiss, S.,
2006. Direct observation of abortive initiation and
Garcia, H. G., Tikhonov, M., Lin, A., and Gregor, T.,
promoter escape within single immobilized transcrip-
2013. Quantitative imaging of transcription in living
tion complexes. Biophys. J., 90(4):1419–1431. http:
Drosophila embryos links polymerase activity to pat-
//dx.doi.org/10.1529/biophysj.105.069252.
terning. Curr. Biol., 23(21):2140–5. http://dx.doi.
org/10.1016/j.cub.2013.08.054. Parmeggiani,A.,Franosch,T.,andFrey,E.,2003. Phase
coexistenceindrivenone-dimensionaltransport. Phys.
Gilbert, S. F., 2013. Developmental Biology. Sinauer
Rev. Lett., 90(February):086601. http://dx.doi.
Associates, Inc., Sunderland, MA, 10 edition.
org/10.1103/PhysRevLett.90.086601.
Janowsky, S. A. and Lebowitz, J. L., 1994. Exact re-
Poglitsch, C. L., Meredith, G. D., Gnatt, a. L.,
sults for the asymmetric simple exclusion process with
Jensen, G. J., Chang, W. H., Fu, J., and Korn-
a blockage. J. Stat. Phys., 77(1-2):35–51. http:
berg,R.D.,1999.ElectroncrystalstructureofanRNA
//dx.doi.org/10.1007/BF02186831.
polymerase II transcription elongation complex. Cell,
98(6):791–798. http://dx.doi.org/S0092-8674(00)
Janowsky, S. A. and Lebowitz, J. L., 1992. Finite-
81513-5[pii].
size effects and shock fluctuations in the asymmetric
simple-exclusionprocess. Phys.Rev.A,45(2):618–625.
Poker, G., Margaliot, M., and Tuller, T., 2015. Sen-
http://dx.doi.org/10.1103/PhysRevA.45.618.
sitivity of mRNA Translation. Sci. Rep., 5:12795.
http://dx.doi.org/10.1038/srep12795.
Kapanidis, A. N., Margeat, E., Ho, S. O., Kortkhon-
jia, E., Weiss, S., and Ebright, R. H., 2006. Initial
Revyakin, A., Liu, C., Ebright, R. H., and Strick, T. R.,
transcription by RNA polymerase proceeds through
2006. AbortiveInitiation and ProductiveInitiationby
a DNA-scrunching mechanism. Science (80-. ).,
RNA Polymerase Involve DNA Scrunching. Science
314(5802):1144–1147. http://dx.doi.org/10.1126/
(80-. )., 314(5802):1139–43. http://dx.doi.org/10.
science.1131399.
1126/science.1131398.
Klumpp, S. and Hwa, T., 2008. Stochasticity and traffic
Schmittmann, B. and Zia, R. Statistical Mechan-
jamsinthetranscriptionofribosomalRNA:Intriguing
ics of Driven Diffusive Systems. In Domb, C.
role of termination and antitermination. Proc. Natl.
and Lebowitz, J. L., editors, Phase Transitions
Acad. Sci. U. S. A., 105(47):18159–18164. http://
Crit. Phenomena. Vol. 17, pages 3–214. Academic
dx.doi.org/10.1073/pnas.0806084105.
Press, London, 1995. http://dx.doi.org/10.1016/
Kolomeisky, A. B., 1998. Asymmetric simple exclu- S1062-7901(06)80014-5.
sion model with local inhomogeneity. J. Phys. A.
Math. Gen., 31(4):1153–1164. http://dx.doi.org/ Schu¨tz, G. M. Exactly Solvable Models for Many-
10.1088/0305-4470/31/4/006. Body Systems Far from Equilibrium. In Domb, C.
and Lebowitz, J. L., editors, Phase Transitions Crit.
Lakatos, G. and Chou, T., 2003. Totally asymmetric Phenom., chapter 1, pages 3–255. Academic Press,
exclusion processes with particles of arbitrary size. J. San Diego, 2001. http://dx.doi.org/10.1016/
Phys. A. Math. Gen., 36(8):2027–2041. http://dx. S1062-7901(01)80015-X.
doi.org/10.1088/0305-4470/36/8/302.
Selby, C. P., Drapkin, R., Reinberg, D., and Sancar, A.,
Liggett, T. M., 1975. Ergodic theorems for the 1997. RNA polymerase II stalled at a thymine dimer:
asymmetric simple exclusion process. Trans. Am. footprint and effect on excision repair. Nucleic Acids
Math. Soc., 213:237–61. http://dx.doi.org/10. Res., 25(4):787–793. http://dx.doi.org/10.1093/
1090/S0002-9947-1975-0410986-7. nar/25.4.787.
10