Table Of ContentJComputAidedMolDes(2017)31:29–44
DOI10.1007/s10822-016-9956-6
A combined treatment of hydration and dynamical effects
for the modeling of host–guest binding thermodynamics:
the SAMPL5 blinded challenge
Rajat Kumar Pal1,5 • Kamran Haider2 • Divya Kaur6 • William Flynn4,7 •
Junchao Xia4 • Ronald M Levy4 • Tetiana Taran3 • Lauren Wickstrom3 •
Tom Kurtzman2,5,6 • Emilio Gallicchio1,5,6
Received:17June2016/Accepted:25August2016/Publishedonline:30September2016
(cid:2)SpringerInternationalPublishingSwitzerland2016
Abstract As part of the SAMPL5 blinded experiment, we charged guests. Binding free energies obtained with
computed the absolute binding free energies of 22 host– Debye–Hu¨ckel treatment of salt effects were in good
guestcomplexesemployinganovelapproachbasedonthe agreement with experimental measurements. Water dis-
BEDAM single-decoupling alchemical free energy proto- placement effects contributed favorably and very signifi-
colwithparallelreplicaexchangeconformationalsampling cantly to the observed binding affinities; without it, the
and the AGBNP2 implicit solvation model specifically modeling predictions would have grossly underestimated
customized to treat the effect of water displacement as binding. The work validates the implicit/explicit solvation
modeled by the Hydration Site Analysis method with approach employed here and it shows that comprehensive
explicit solvation. Initial predictions were affected by the physical models can be effective at predicting binding
lack of treatment of ionic charge screening, which is very affinities of molecular complexes requiring accurate treat-
significant for these highly charged hosts, and resulted in ment of conformational dynamics and hydration.
poor relative ranking of negatively versus positively
Keywords SAMPL5 (cid:2) Hydration Site Analysis (HSA) (cid:2)
Debye–Hu¨ckel (cid:2) Salt effects (cid:2) AGBNP2 (cid:2) BEDAM
Theoriginalversionofthisarticlewasrevised:Correctionsdonein
theoriginalarticlehasbeenpublishedintheerratum.
Introduction
& EmilioGallicchio
[email protected]
Accurate modeling of hydration is important for under-
1 DepartmentofChemistry,BrooklynCollege,2900Bedford standing the thermodynamics of binding and molecular
Avenue,Brooklyn,NewYork11210,USA
recognition[1–9].Bindingfreeenergymodelswithexplicit
2 DepartmentofChemistry,LehmanCollege,TheCity solvation are considered as the ‘‘gold standard’’ for mod-
UniversityofNewYork,250BedfordParkBlvd.West,
eling receptor-ligand complexation because of the critical
Bronx,NewYork,NY10468,USA
role of desolvation, the hydrophobic effect, and bridging
3 BoroughofManhattanCommunityCollege,Departmentof
waters. In recent years, there has been significant progress
Science,TheCityUniversityofNewYork,199Chambers
in the development of computational tools to quantify the
Street,NewYork,NY10007,USA
structure and thermodynamics of water molecules on host
4 CenterforBiophysicsandComputationalBiology,Institute
and protein surfaces [10–15]. Many of these approaches
ofComputationalMolecularScienceandDepartmentof
Chemistry,TempleUniversity,Philadelphia,PA,USA have highlighted the importance of displacing water
5 Ph.D.PrograminBiochemistry,TheGraduateCenterofthe molecules from enclosed regions of protein active sites in
CityUniversityofNewYork,NewYork,NY10016,USA boosting the binding affinity [16–20].
6 Ph.D.PrograminChemistry,TheGraduateCenteroftheCity Inthiswork,weemployHydrationSiteAnalysis(HSA)
UniversityofNewYork,NewYork,NY10016,USA [10, 18, 20], a methodology based on Inhomogeneous
7 DepartmentofPhysicsandAstronomy,RutgersUniversity, SolvationTheory(IST)[21]tomapoutsolvationstructural
Piscataway,NJ08854,USA and thermodynamic features in protein binding sites. High
123
30 JComputAidedMolDes(2017)31:29–44
density regions of water, termed hydration sites, are clas- AGBNP hydration site spheres to mimic the thermody-
sified based on a number of energetic and structural mea- namicsofenclosedwatermolecules[24,32].Theresulting
sures such as the water-protein interaction energy, water hybrid explicit/implicit model was then employed to pre-
entropy and the strength of local water-water and water- dict the binding affinities of the SAMPL5 datasets. The
protein hydrogen bonding. These measures are used to SAMPL5 blinded experiment, based on the octa-acid, [33]
identifyregionsfromwhichthedisplacementofwaterleads tetra-endo-methyl octa-acid [34] and CB-clip host–guest
to either significant gains or significant penalties to the systems [35], offers a unique opportunity to validate the
predicted binding affinity. Ligands that displace high free approach outlinedhere.The tetra-endo-methylocta-acid is
energy water molecules are morelikelyto bind strongly. referred to as methyl octa-acid throughout the article. We
Moleculesareinherentlyflexible.Hence,computational have studied the octa-acid system before as part of the
models ofmolecular recognition can not ignore dynamical previous SAMPL challenge [36], where we trained the
aspects of binding. In this work, we employ the Binding AGBNP water sites empirically to reproduce available
Energy Distribution Analysis Method (BEDAM) [22–24], experimental data. Here instead, we intend to train the
a well established alchemical protocol for the quantitative model systematically using data from the HSA method,
prediction of absolute binding free energies. BEDAM has grounded in well established hydration theories. Further-
been specifically designed to capture conformational reor- more, the SAMPL5 datasets give us the opportunity to
ganization and entropic contributions to the binding free validate our approach in an unbiased manner on hosts for
energies. To overcome the slow time-scales of conforma- which experimental data is not available.
tional reorganization, the method employs parallel multi-
dimensional replica exchange conformational sampling
algorithms[25,26]coupledwithanimplicitrepresentation Methods
of the solvent [27–29].
The implicit solvent representation enables the formu- Hydration Site Analysis of the host binding cavity
lationofbindingintermsofthedirecttransferoftheligand
from the implicit solvent continuum to the receptor site We investigated the hydration properties of the three host
[22]. This single-decoupling feature circumvents some of molecules using Hydration Site Analysis (HSA) approach
the convergence difficulties encountered with double-de- [10,18].HSAutilizesanexplicitsolventMDsimulationof
coupling approaches with explicit solvation [30, 31]. Key the solute molecule to identify spherical sites where water
to the method is the sampling of the effective binding is present at high density. The thermodynamic quantities
energy function u(x), which represents the ligand-receptor forthesesites,suchasenthalpy,first-orderentropyandfree
interaction energy of a specific conformation x of the energy are calculated using inhomogeneous solvation the-
complex averaged over the conformations of the solvent. ory (IST) [21]. A detailed description of the simulation
The benefits of the single-decoupling approach is particu- setup used for HSA is given in the Computational Details
larly significant for large and charged ligands, as in the section. In summary, the explicit solvent simulation for
present application. The same approach is simply not fea- each of the host system in a restrained configuration is
sible withexplicitsolvation since itwouldentailalengthy processed to obtain hydration site locations where water
potential of mean force calculation for each evaluation of molecules are present with at least twice the bulk density.
the effective binding energy function [30]. The clustering procedure to obtain hydration site locations
The advantages of implicit solvation modeling in terms is described in detail elsewhere [20]. Briefly, for each
of conformational sampling and the single decoupling hydration site, the total energy of the water, E , is cal-
total
pathway are, however, counterbalanced by the lack of a culatedasthesumofitsmeanwater-waterE ,andsolute-
ww
detailed representation of hydration effects. A way around water interaction energies E , where E is one half the
sw ww
these conflicting requirements is to train the implicit sol- mean interaction energy of the water in the site with all
ventenergyfunctionbasedonexplicitsolvationdataaswe other waters, and E is one half the mean interaction
sw
set out to do in this work. BEDAM employs the Analytic energy of the water in the site with the solute. The factors
Generalized Born plus Non-Polar (AGBNP) model which of one half follow the convention that half of the interac-
has been specifically tailored for protein-drug binding tion energy in a pairwise interaction is assigned to each
applications[29].Oneimportantaspectofthemodelisthe molecule of the pair. The locations and average solvation
treatment of short ranged water-solute interactions by energies of the hydration sites obtained for each host
means of hydration site spheres [24]. molecule are shown in Fig. 1 and Table 1. The solvation
In this work, structural and thermodynamic data energies were compared to TIP3P neat liquid simulations
obtained from explicit solvent Hydration Site Analysis are todeterminewhichhydrationsitescontainedfavorableand
used to inform the placement and energetic strength of unfavorable water molecules relative to bulk solvent. The
123
JComputAidedMolDes(2017)31:29–44 31
Fig.1 Locationofhydrationsiteswithinthebindingcavityofthehostsasidentifiedfromhydrationsiteanalysis.aOcta-acid,bmethylocta-
acid,cCB-clip
Table1 SummaryofHydrationSiteAnalysis(HSA)resultsforthreehostsystemsinvestigatedinthisstudy
Siteid Location Occupancy(p) Esw Eww Etotal ½Etotal(cid:3)Ebulk(cid:4)p Nnbrs fenc
A. Octa-acid
0 Bottom 0.62 -4.57 -2.42 (cid:3)6:99 1.58 1.27 0.76
1 Top 0.40 -0.64 -7.59 (cid:3)8:23 0.52 3.06 0.42
2 Top 0.41 -0.61 -7.41 (cid:3)8:02 0.62 2.89 0.45
3 Top 0.40 -0.58 -7.62 (cid:3)8:19 0.53 3.12 0.41
4 Top 0.40 -0.58 -7.55 (cid:3)8:13 0.56 2.96 0.44
5 Middle 0.29 -1.48 -7.50 (cid:3)8:98 0.16 2.89 0.45
B. Methylocta-acid
0 Bottom 0.60 -4.52 -2.14 (cid:3)6:66 1.72 1.32 0.75
1 Top 0.36 -0.48 (cid:3)7:00 (cid:3)7:48 0.74 2.27 0.57
2 Top 0.31 -0.73 -6.84 (cid:3)7:57 0.61 2.54 0.52
3 Rim 0.31 -0.02 (cid:3)8:49 (cid:3)8:51 0.31 3.19 0.39
4 Top 0.32 -0.85 -6.83 (cid:3)7:68 0.60 2.50 0.52
5 Top 0.30 -0.62 -6.68 (cid:3)7:30 0.66 2.24 0.57
6 Rim 0.27 0.60 -9.67 (cid:3)9:06 0.66 3.64 0.31
C. CB-clip
0 Center 0.92 -5.76 -2.27 (cid:3)8:03 1.38 1.75 0.67
3 Naphthalenerings 0.42 -1.55 (cid:3)6:97 (cid:3)8:52 0.42 3.43 0.35
Allenergeticquantitiesareexpressedinkcal/mol.ForCB-clip,atotalofninehydrationsitewereobtainedinthecavity,thetwositeusedfor
AGBNP2parametrizationarereportedhere.E =-9.53kcal/mol
bulk
relative solvation energies and positions of the explicit below.TheHSAscores(listedinTable 1)aregivenbythe
hydration sites were used to position and parameterize the expression ½E ðiÞ(cid:3)E (cid:4)pðiÞ where i is the index of the
total bulk
strength of the AGBNP2 hydration site adjustments which HSAhydrationsites,p(i)isthewateroccupancyofthesite,
account for enclosed hydration effects. We chose to only E ðiÞ is the total energy of the site and E is the cor-
total bulk
useHSAenergiestoparameterizethestrengthofhydrogen responding reference value obtained from TIP3P neat
bond correction parameters because HSA entropies were water.Twowatermoleculesareconsideredtobefirst-shell
previouslyshowntobenon-predictiveinascoringfunction neighbors in a given MD frame if their oxygen atoms are
usedtopredictrelativebindingaffinitiesofligandsbinding separatedbylessthan3.5A˚ Thisdistancecriterionisbased
to Factor Xa [20]. For each system, we derive scores for onthe location ofthe minima ofthe oxygen-oxygen radial
hydration sites and apply them as corrections to AGBNP2 distributionfunctionforbulkwater[37].Themeannumber
sites for the corresponding system (Fig. 2a) as described of these first shell neighbors for water in each hydration
123
32 JComputAidedMolDes(2017)31:29–44
Fig. 2 a Position of the hydration spheres added to AGBNP2 between two naphthalene rings; the center of masses of carbons
parameters for each hosts, b the carbon atoms used to position the markedingreenwereusedtopositionwater-sitesinthemiddlecavity
hydration sphere in all three hosts; the carbon atoms marked in of the octa-acids; carbons marked in orange were used to model a
magentaareusedforpositioningthewatersitesonthetopcavityof watersiteatthebottomcavityoftheocta-acids;inCB-clip,awater
octa-acids;inCB-clip,thosecarbonsareusedtopositionwatersitein siteispositionedatthecenterofmassofthosefourcarbons
site is termed Nnbrs, and this first-shell neighbor count is DG ¼DG þDG þDG ð2Þ
h elec np hs
used to define the fractional enclosure of a hydration site,
Theelectrostaticcomponentofthehydrationfreeenergyis
byreferencingthenumberofitsfirst-shellneighborstothe
represented by a pair-wise descreening implementation of
mean number of first-shell neighbors of a TIP3P water
molecule in bulk, Nnbrs: the Generalized Born model [28, 40, 41]. The non-polar
bulk
component of the hydration free energy is modeled as the
Nnbrs sum of cavity and solute-solvent dispersions interactions
f ¼1(cid:3) ð1Þ
enc Nnbrs [28]. AGBNP2 includes algorithms to place and score
bulk
hydration spheres on the surface of the solute to describe
This quantity indicates the degree to which the water in a short-range solute-solvent interactions such as hydrogen
hydration site is blocked from contact with other water bonding [29]. The functional form of the short-ranged
molecules. The value of Nnbrs is 5.25 as calculated from hydrogen bonding component of the solvation free energy
bulk
TIP3P pure water. in Eq. (2) above is
X
DG ¼ h Sðw Þ
hs s s ð3Þ
The AGBNP2 implicit solvation model s
wherew isthewateroccupancyfactorofthesitemeasured
s
Here we employed the OPLS-AA/AGBNP2 effective as the fraction of the volume of the hydration site sphere
potential,whereOPLS-AA[38,39]representsthecovalent actually accessible to water defined as
and non-bonded inter-atomic interactions and solvation is
Vfree
modeled implicitly by the Analytic Generalized Born plus w ¼ s ð4Þ
s V
non-polar (AGBNP2) model [29]. The hydration free s
energyDGh iscomputedasthesumofelectrostaticDGelec, whereVs isthevolumeofthewatersphereandVsfree isthe
non-polar DGnp, and short-range solute-water hydrogen free volume of the water site, that is the volume not
bonding interaction DGhs: occludedbysoluteatoms,obtainedbysummingallthetwo
123
JComputAidedMolDes(2017)31:29–44 33
body,threebody,etc.overlapvolumesofthewatersphere sodium phosphate solution used for the measurements of
with the solute atoms i, j, k, etc. as [29] CB-clip guest binding, was used throughout.
X X X
Vfree ¼V (cid:3) V þ V (cid:3) V
s s si sij sijk ð5Þ
i i\j i\j\k
Binding free energy protocol
In Eq. (3), S is a switching function, and the h are
s
adjustable parameters which measure the strength of the Binding free energies were computed using the Binding
hydrogen bonding energy (more precisely the portion of it EnergyDistributionAnalysisMethod (BEDAM)[22].The
not captured by the Generalized Born model) [29]. The methodusesasingle-decouplingalchemicalapproachwith
specification of the solute atoms carrying such hydration implicit solvation. An alchemical progress parameter k is
spheresisaccomplishedbymeansofSMARTSpatterns.A introducedrangingfrom0,correspondingtotheuncoupled
parameterdatabaseisusedtomapSMARTSpatternstothe state of the complex, to 1, corresponding to the coupled
energeticparametersandplacementgeometryofhydration state of the complex. The k-dependent effective soft-core
spheres [29]. The h parameters are normally negative and hybrid potential energy function is:
s
adjusted to describe favorable solute-solvent hydrogen
U ðrÞ¼U ðrÞþkuðrÞ ð7Þ
k 0
bonding interactions. In this context, the DG term disfa-
hs
vors binding by increasing the ligand and receptor desol- where r ¼ðr ;r Þ are the atomic coordinates of the
A B
vationpenalties.Inaddition,inthiswork,weusethesame receptor-ligand complex and r and r denote the coordi-
A B
functionalformtomodelenclosedwatermolecules,which nates of the receptor and ligand, respectively. U ðrÞ¼
0
contribute favorably to binding, when displaced by the Uðr ÞþUðr Þisthepotentialenergyofthecomplexwhen
A B
ligand.Thekeydistinctionbetweenhydrogenbondingsites the receptor and ligand are uncoupled, i.e. as if they are
andenclosedhydrationsitesisthesignoftheh parameter, separatedatinfinitedistancefromeachother.Thequantity
s
whichispositiveforthelatterandnegativefortheformer. u(r) is the binding energy and is defined as the change in
The specific values of h assigned to enclosed hydration the effective potential energy of the complex for bringing
s
sites are derived from explicit solvent HSA analysis and the receptor and ligand from infinite separation to the
are given in the Results section below. conformation r of the complex:
uðrÞ¼Uðr ;r Þ(cid:3)ðUðr ÞþUðr ÞÞ ð8Þ
A B A B
Incorporation of Debye–Hu¨ckel parameters In this work we adopted a soft-core form of the binding
in the implicit solvation model energy mentioned above and as described [23, 43].
UWHAM analysis of the binding energy samples was
In this work the Generalized Born components of the carried as described [43] to obtain binding free-energy
AGBNP2 model included modifications based on the estimates and their corresponding statistical uncertainties.
Debye–Hu¨ckel screening aimed at modeling the effects of Average interaction energies, DE were obtained by aver-
b
electrolytes present in the aqueous solution. Specifically, aging the binding energy values collected at k=1 (the
we adopted the following functional form for the GB pair bound state). The uncertainty on interaction energies were
potential [42] measured as the standard error of the mean. The entropic
u ði;jÞ¼(cid:3)(cid:2) 1 (cid:3)e(cid:3)jrij(cid:3) qiqj contributiontothefreeenergy,theeffectofreorganization
GB (cid:2) (cid:2) rffihffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiiffiffi energy and reorganization free energy to binding were
out in ri2jþBiBje(cid:3)ðri2j=4BiBjÞ derived [26, 44] from binding energies and other thermo-
dynamic parameters as described in the computational
ð6Þ
details section. The uncertainties of all these quantities
where r is the distance between atoms i and j, (cid:2) ¼80 were computed by error propagation.
ij out
and (cid:2) ¼1 are the dielectric constants of the solvent and In this work, BEDAM conformational sampling is
in
the interior of the solute respectively; q and q are the acceleratedbymeansoftwo-dimensionalreplicaexchange
i j
partialchargesofatomsiandj.ThetermsB andB arethe along alchemical and temperature parameters [26, 45]. k
i j
Born radii of atoms i and j. The Debye–Hu¨ckel screening exchanges facilitate the mixing of intermolecular degrees
p
parameter is defined as j¼ ffi8ffiffipffiffiffikffiffiffiffiIffiffi where I is the ionic of freedom, while temperature exchanges activate
B
strength of the solution and k is the Bjerrum length intramolecular degrees of freedom thus allowing confor-
B
[40, 42]. For water at room temperature, k =7.0 A˚. An mational transitions to explore the conformational space
B
ionic strength of 0.12 M, corresponding to the 20 mM more efficiently [46].
123
34 JComputAidedMolDes(2017)31:29–44
Computational details the water molecules in those locations. For each host,
190, 000 frames of the trajectory were processed with
Explicit solvent calculations and Hydration Site Analysis HSA analysis. Hydration site locations were identified
(HSA) using a subset of 10,000 frames which were evenly
spaced in time. High density spherical regions (hydration
For each host molecule, a molecular dynamics simulation sites) of 1A˚ radius were identified using a clustering
was performed using the OPLS 2005 force field (Schro- procedure [10] on the water molecules that were found
dinger Inc.) in a TIP3P [47] cubic water box, using the within 5 A˚ of the host molecule. The resulting hydration
DESMOND package [48]. The initial startingstructure for sites were each populated by retrieving all water mole-
eachsimulationwasarepresentativeholostructurewithout cules, which had oxygen atoms within 1.0 A˚ from the
the ligand. These representative structures were obtained
corresponding hydration site center. The hydration sites
from BEDAM simulations of the octa-acid and methyl
were then enumerated according to their occupancies,
octa-acid hosts with guest 4 and the CB-Clip host with
with the highest populated site given the index 0. The
guest1.Eachwaterboxwasbuiltwithaminimumdistance
average solvation energies were calculated based on
of 10 A˚ between the solute and the edge of the box. The
energies of the water molecules that were within 1.0 A˚
default Desmond equilibration protocol was used to mini-
from the coordinates of each hydration site center in the
mize and thermalize the systems for the production run.
original 190,000 frames.
The system was further relaxed for 25 ns under NPT
conditions at 1 atm and 300 K. An NVT production run
was started from a configuration in which the box volume
Systempreparationforthebindingfreeenergycalculations
was close to the average of the previous NPT simulation.
The production MD simulations were run for 100 ns and
Thestructures ofthe octa-acid and CB-clip hostsandtheir
snapshotswerecollectedevery0.5ps.Thefirst1nsofthe
guests were prepared using Maestro (Schrodinger Inc.)
production run was discarded. In the octa-acid and methyl
starting from structure files provided by the SAMPL5
octa-acidsimulations,positionalrestraintswereusedonall
organizers. Bond orders and formal charges were adjusted
oftheheavyatomswithaforceconstantof5.0kcal/mol/A˚2 asappropriate.Alltheeightcarboxylategroupsoftheocta-
which allowed for free rotation of hydrogens. In the CB- acids were modeled as deprotonated resulting in a net
clip simulations, position restraints were used on all of the negativechargeof(cid:3)8.Similarly,thefoursulfonategroups
heavy atoms of the host with a force constant of 5.0 CB-cliphostweremodeledasallionized,resultinginanet
kcal/mol/A˚2, excluding the sulfonate groups. The use of overall charge of (cid:3)4. Alternate protonation and tau-
one rigid receptor should not have a significant effect on tomerization states of the CB-clip guests were generated
determining the hydration sites and the following free using the LigPrep facility (Schrodinger Inc.) using Epik
energycalculationsduetothenatureofthescalingfunction [49] with a pH range of 7±2. Ionization and tautomeriza-
used to calculate enclosure energies and the rigidity of the tion free energy penalties were recorded and added to the
SAMPL5 hosts. If the host structures are unrestrained BEDAM binding free energy estimates to compute the
duringHSAanalysis,theoccupanciesofthehydrationsites predictedbindingfreeenergies.LigPreppredictedmultiple
are expected to decrease due to the effect of the confor- protonation and tautomerization states for the CB-clip
mationalfluctuationsofthehostonthesolvent.Incontrast, guests. Multiple protonation states of guests 1, 4, 5, 6, 8
the energies of these hydration sites may become more and 9 were tested individually. Guests 3 and 5 which
favorable due to the ability of the solvent to optimize its contain alkylammonium groups were modeled as posi-
interactionswitheithertheneighboringwatermoleculesor tively charged and the guests containing carboxyl groups,
nearby solute atoms. These changes to the HSA energies guests 1, 2, and 4 were modeled as deprotonated each
and occupancies will have an impact on the global having a single negative charge. Guest 6, which is
parameter c and should not impact the enclosure energies nitrobenzoic acid, holds an overall single negative charge
usedinthefreeenergycalculations.Ontheotherhand,this contributed by the carboxyl moiety. To match the experi-
treatmentisinappropriateformoreflexiblehoststhatadopt mental pH, the acidic guests of the octa-acid set were
multiple bound states because the hydration site locations modeledasdeprotonatedandionizationpenalties werenot
will vary for each bound representative structure. SHAKE applied.
wasusedtoconstrainanybondsinvolvinghydrogenatoms. All individual guests were manually docked to the
Hydration Site Analysis involved two steps: (1) identi- bindingcavityofthehostsintwooftheocta-acidsandthe
fying regions of high water density around each host sur- in-betweenspaceofthetwonaphthaleneringsonCB-clip.
face and (2) calculating the average solvation energies of Joint Hamiltonian and temperature 2-D Asynchronous
123
JComputAidedMolDes(2017)31:29–44 35
Replica-exchange Molecular dynamics simulations [26] Results
employed18intermediate kstepsasfollows: k=0,0.002,
0.004, 0.008, 0.01, 0.02, 0.04, 0.07, 0.1, 0.17, 0.25, 0.35, The locations of hydration sites obtained for each host
0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 and 8 temperatures were moleculeareshowninFig. 1.Fortheocta-acidhost,atotal
distributed between 300 K and 534 K (T = 300, 328, 358 of six hydration sites are identified, one in the bottom
,390,424,458,494,534).Atotalof144replicaswereused cavity,oneinthemiddlecavityandfourintheuppercavity
corresponding to all possible pairing of temperature and k (Fig. 1,left).Forthemethylocta-acidhost,atotalofseven
values. The replica-exchange (AsyncRE) simulations were hydrationsitesareidentified,oneinthebottomcavity,four
started from energy-minimized and thermalized structures in the upper cavity and two at the level of the rim of the
fromthemanuallydockedmodels.Aflat-bottomharmonic host (Fig. 1, middle). The introduction of the methyl sub-
restraint with a tolerance of 5 A˚ between the centers of stituentsmainlycausespartialorderingofwatermolecules
massofthehostandtheguestwasapplied.Eachcycleofa near the rim of the octa-ocid host, which are otherwise
replica lastedfor100picosecondswith 1fstime-step. The bulk-like. In the particular conformation examined, two
average sampling time for a replica was approximately 10 partially ordered sites are found at the center of the mouth
ns. Calculations were performed on the campus computa- of the cavity(Fig. 1, middle). For the CB-clip host, a total
tional grids at Brooklyn College and at Temple University of nine hydration sites were identified, one of which is
and on the XSEDE SuperMIC cluster, the latter using 3 located at the center of the host and the remaining eight
compute nodes utilizing all CPU’s and both MIC devices. were symmetrically distributed, spanning the region from
The binding energies obtained from all replicas were ana- naphthalene rings to the outer edges of the cavity. In the
lyzed using UWHAM [43] method and the R-statistical bindingfreeenergycalculations(seebelow)weconsidered
package to compute the binding free energy DG(cid:5) and thecentralsiteplusthesitesandwichedinbetweenthetwo
b
average interaction energies DE of individual complexes. naphthalene rings (Fig. 1, right).
b
Energy values from the first 1 ns of each replica were In this work, as in the previous SAMPL4 host–guest
excluded from the free energy calculation. Calculation at edition [36] , we used custom AGBNP2 hydration spheres
multiple temperature allows estimation of the conforma- to model the effect of enclosed water molecules displaced
tional entropy of binding DS(cid:5) from the temperature upon ligand binding (see Methods). The idea is that these
conf
derivative of the binding free energy (here we used a sites have large water occupancy ðws ’1Þ in the unbound
simple finite difference estimator using adjacent binding state and small water occupancy ðws ’0Þ in the presence
free energy values). Reorganization free energies DG(cid:5) oftheboundligand.Hencepositivevaluesofthewatersite
reorg
were measured as the difference between the binding free energy parameter ðhsÞ can mimic the favorable effect of
energy and the average interaction energy at the bound water expulsion towards binding. The approach we fol-
state as lowed here for SAMPL5 is to place and score enclosed
hydration sites which are distinct from the empirical
DG(cid:5)reorg ¼DG(cid:5)b(cid:3)DEb ð9Þ approach adopted for the octa-acid host in SAMPL4. In
SAMPL4 hydration site parameterization was conducted
Finally, the reorganization energy for each complex was
essentially by trial and error in an attempt to reproduce
measured by subtracting the entropic component from the
known experimental data. In this work instead the place-
reorganization free energy:
ment and scoring of AGBNP2 hydration sites is based on
HSA analysis of explicit solvent data. Furthermore,
DE(cid:5) ¼DG(cid:5) þTDS(cid:5) ð10Þ
reorg reorg conf because of limitations of the AGBNP2 hydration site
Statistical uncertainties for the reorganization free energy, placement algorithm, in SAMPL4 some of the effects of
conformational entropy and reorganization energy were water expulsion were captured by empirical modifications
obtained from error propagation of the uncertainties of the of surface tension and solute-solvent van der Waals
binding free energy and the average binding energy. The parameters.Forthiswork,wedevelopednovelgeometrical
data submitted to SAMPL5 challenge were obtained from algorithms to place hydration sites based on geometrical
the 2D replica exchange binding free energy calculations centersofarbitrarygroupsofatomsandmoresignificantly,
without inclusion of salt effects. Uncertainties were thanks to HSA analysis, we were able to reduce substan-
reported as twice the standard error. The binding free tially, the number of empirically-derived parameters as
energy predictions were submitted to the SAMPL5 host– described next.
guest challenge on January 29, 2016. For our submissions, The locations of enclosed hydration sites identified by
the octa-acid, methyl octa-acid and CB-clip were assigned HSA analysis were reproduced as best as possible using
prediction IDs # 18, 20 and 10 respectively. AGBNP2 water spheres and the anchoring methods
123
36 JComputAidedMolDes(2017)31:29–44
available in AGBNP2 [29] and those developed for this octa-acid set [36]. Deviations from the experiments sug-
work. Solute anchoring points correspond to hydrogen gested that the HSA analysis over-emphasized the benefit
bondingsitessuchasalongpolarhydrogenbondsandlone of displacing water from the bottom site. To correct this
pairs in various geometries [29], in addition to anchoring deficiency we decided to assign h parameters by dis-
s
points based on geometrical centers of groups of atoms. tributing the total enclosure energy equally over the six
Waterspherelocationsdefinedinthiswaynaturallyfollow sites. This led to results in reasonable agreement with
solutemovements.Thefourtopsitesoftheocta-acidhost, experimental affinities with a scaling factor of c¼2:88
sites (1, 2, 3, and 4) in Table 1, were modeled with eight withoutionic screeningandc¼1:85withionicscreening.
AGBNP2 hydration spheres placed perpendicularly to The predictions for the octa-acid and methyl octa-acid
inner phenyl rings of the cavity (Fig. 2b, left). The middle hosts were produced using this scheme. For the CB-clip
hydration site of the octa-acid host (site 5, Table 1A) was host, lacking experimental evidence supporting one strat-
modeled by an AGBNP2 site placed at the geometrical egy over another, we employed the more physically rea-
center of a set of aromatic carbons as shown in Fig. 2b, sonableapproachofderivingh parametersfromindividual
s
left.Thebottomwatersite(site0inTable 1A)wasplaced enclosure energies as described above. The same scaling
similarly.Thesix hydration sitesidentifiedforthe methyl- factors derived for the octa-acid system were used for the
octa-acid host (Table 1B), were modeled using three CB-clip system.
AGBNP2hydrationspheresasindicatedinFig 2a,middle. The blinded binding free energy predictions DG(cid:5)
calc
The one at the rim ofthe cavity corresponds to sites 3 and submitted to the SAMPL5 challenge, obtained without the
6,thesecondinthetopregionofthecavitycorrespondsto inclusionofsalteffects,areshowninTables 2, 3and 4for
sites 1, 2, 4 and 5, and finally the bottom site. All these the octa-acid, methyl octa-acid and CB-clip sets, respec-
individual water sites were placed at the center of mass of tively and Fig. 3, together with the corresponding experi-
specific set of carbons surroundingthe three regionsin the mentalvalues.Thesepredictionsdidnotreproducewellthe
pocket (Fig. 2b, middle). For the CB-clip host, one water experimental measurements, with the most noticeable
sitewasplacedatthecenterofmassof4aromaticcarbons shortcoming being the significant overestimation of the
and4 alkyl carbonsinthe center ofthe cavity as shownin affinitiesofpositivelychargedguests(guests3and5ofthe
(Fig. 2b, right). The second water site was placed in octa-acid guests, and guests 6 and 10 of the CB-clip host).
betweenthetwonaphthaleneringsatthecenterofmassof Theresultswiththeinclusionofionicchargescreeningare
4 aromatic carbon atoms indicated (Fig. 2b, right). We inmuchbetteragreementwiththeexperiments.Rootmean
have not attempted to model hydration sites located above squareerrors(RMSE)relativetoexperiments(Tables 2, 3;
and below the CB-clip host cavity partly because of their Fig. 3a, b) are reduced by more than a factor of 2 for the
polar character already captured by AGBNP2 hydrogen- octa-acidhostandbymorethanafactorof3forthemethyl
bonding sites [29]. octa-acid host. The RMSE for the CB-clip set (Table 4;
The solvation parameters h associated with each Fig. 3c) is reduced from 4.7 to 3.6 kcal/mol. Introduction
s
AGBNP2 water site of the octa-acid host were derived of salt effects also improved the correlation between
from HSA analysis using the water enclosure energies as computational predictions and the experiments. Pearson
correlationcoefficients (r)improvedfrom0.66to0.89and
E ðiÞ¼c½E ðiÞ(cid:3)E (cid:4)pðiÞ ð11Þ
enclosure total bulk
from0.67to0.90forthemethylocta-acidandCB-clipsets,
wherep(i)isthewateroccupancyofthesiteand½E ðiÞ(cid:3) respectively.
total
E (cid:4)istheaverageenergyofthesiterelativetotheenergy The correlation coefficient for the octa-acid set
bulk
of bulk water (Table 1). Enclosure energies were rescaled improvedsignificantlywiththeinclusionofsalteffectsbut
by a global parameter, c, to obtain AGBNP2 hydration remained low. This can be considered an acceptable result
parameters for enclosed water spheres. The global scaling given that, with the exception of guest 4, the range of
parameterwasadjustedsoastoreproducetheexperimental experimental affinities of the octa-acid set is small (less
binding free energies of the SAMPL4 octa-acid set as than1.8kcal/mol).Thestrongestbinderinthissetisguest
described below. Scaled enclosure energies of hydration 4. The calculations do not reproduce this feature well.
sites corresponding to a single AGBNP2 hydration sphere Instead,guest3,apositivelychargedligand,ispredictedto
were aggregated into the h parameter of that sphere. bindthestrongestwith-8.8kcal/molwithionicscreening.
s
Conversely, the h parameters for AGBNP2 sites corre- The trend of overprediction of the affinity of positively
s
spondingtomorethanonehydrationsiteswereobtainedby charged ligands, while significantly reduced, remain even
distributing enclosure energies equally. after inclusion of ionic screening. The origin of the rela-
Using the scheme above, we were unable to reproduce tivelylargedeviationbetweenexperimentalandcalculated
accurately the experimental affinities for the SAMPL4 affinities for guest 4 is not clearly evident. This complex
123
JComputAidedMolDes(2017)31:29–44 37
Table2 Calculatedandexperimentalbindingfreeenergieswithandwithoutsalteffectsforocta-acidsetandthermodynamicdecompositionof
bindingfreeenergieswiththeinclusionofsalteffects
Guests Structure DG(cid:5)a Withoutsalteffects Withsalteffects
exp
DG(cid:5)b DG(cid:5)c DEd DG(cid:5)e DEf (cid:3)TDS(cid:5)g
calc calcDH b reorg reorg conf
G1 O (cid:3)5:4 (cid:3)2:2(cid:6)0:06 (cid:3)3:8(cid:6)0:08 (cid:3)15:5(cid:6)0:01 11.7±0.09 1.0±1.4 10.7±1.4
O–
G2 O (cid:3)4:7 (cid:3)6:3(cid:6)0:06 (cid:3)6:6(cid:6)0:09 (cid:3)15:9(cid:6)0:01 9.2±0.09 0.9±1.6 8.4±1.5
O–
N
G3 (cid:3)4:5 (cid:3)15:4(cid:6)0:06 (cid:3)8:8(cid:6)0:09 (cid:3)20:4(cid:6)0:01 11.6±0.09 -0.9±1.5 12.5±1.5
N+
G4 Br (cid:3)9:4 (cid:3)6:3(cid:6)0:08 (cid:3)6:8(cid:6)0:2 (cid:3)20:0(cid:6)0:02 13.2±0.1 1.3±2.6 11.9±2.6
O
–O
G5 (cid:3)3:7 (cid:3)11:3(cid:6)0:06 (cid:3)6:3(cid:6)0:1 (cid:3)16:6(cid:6)0:02 10.3±0.1 1.7±1.7 8.6±1.7
N+
G6 O O (cid:3)5:3 (cid:3)7:4(cid:6)0:06 (cid:3)7:0(cid:6)0:09 (cid:3)18:8(cid:6)0:02 11.8±0.1 0.2±1.6 11.6±1.6
N+
–O O–
RMSE 5.8 2.6
Correlationcoefficient(r) (cid:3)0:39 (cid:3)0:04
Allvaluesinkcal/mol.aExperimentalITCbindingfreeenergy[35].bPredictedbindingfreeenergywithoutadditionofioniceffects.cBinding
freeenergypredictionwiththeadditionofsalteffects.dAverageinteractionenergyoftheboundcomplexat=1.Theuncertaintyisestimatedas
thestandard errorofthemean. eReorganization free energyascalculatedusingEq.9, fReorganization energyorintra-molecular strainas
calculatedusingEq.10, gBindingentropycomputedbyfinitedifferencemethodandevaluatedat300K;theuncertaintyiscomputedbyerror
propagationforallthedecompositions
has the largest reorganization penalty in the set, which is negatively-charged counterpart, guest 7, is correctly pre-
surprising, given the rigidity of the guest. We ascribe this dicted to be a relatively weak binder. Guest 2 is correctly
effect to the side-ways orientation of the bromine atom predictedastheweakestbinderoftheset.Thebindingfree
relative to the carboxylate substituent. This causes con- energies of the guests in the middle of the pack are
formationalstraininthe hostandtiltingofthe carboxylate reproduced reasonably well and within the expected
to accomodate the bromine atom. We speculate that our accuracy limits of the model. The inclusion of ionic
model may be over-estimating the steric hindrance expe- screening has the general effect of weakening binding and
rienced by the bromine substituent. shifting the predictions closer to the experimental values.
Binding trends in the methyl octa-acid are better repro- For some guests the shift is very substantial (as much as 9
duced, partlythankstolargervariationsinaffinities inthis kcal/molforguest3),whiletheaffinitiesofotherguests(5,
set. Focusing on the more accurate results with ionic 8and9forexample)arebarelyaffectedbyionicscreening.
screening, guest 4, the strongest binding for the octa-acid These differences appear to be related to the number of
host, is here the weakest binder both experimentally and charged groups and the degree of solvent screening affor-
computationally.Thecalculationsreproducethelargelossof ded by alkyl substituents.
affinityofguest4uponmethylationofthehostreasonably In Tables 2, 3 and 4, we also report the results of free
well. Furthermore, in agreement with the experiments, the energy decomposition analyses [44] (values without ionic
calculationspredictthatthestrongestbinderisguest3. screeningarenotshown).AveragebindingenergiesðDE Þ,
b
The two strongest binders of the CB-clip host, guests 6 obtainedfrom theaverage ofthe binding energyvalues,u,
and 10, are correctly ranked the best by the computational in the bound state (k¼1), reflect the strength of ligand-
model, although their relative rankings are reversed. The host interactions, including desolvation. Subtracting these
complexwithguest6ispredictedtohaveanunreasonably from the binding free energies yields reorganization bind-
favorable binding free energy (-16.8 kcal/mol). Its ing free energies (DG(cid:5) ). Using data from higher
reorg
123
38 JComputAidedMolDes(2017)31:29–44
Table 3 Experimental and computed binding free energies with and without salt effects for methyl octa-acid hosts and thermodynamic
decompositionsofbindingfreeenergiescalculatedwithinclusionofsalteffects
Guests Structure DG(cid:5)a Withoutsalteffects Withsalteffects
exp
DG(cid:5)b DG(cid:5)c DEd DG(cid:5)e DEf (cid:3)TDS(cid:5)g
calc calcDH b reorg reorg conf
G1 O (cid:3)5:3 (cid:3)7:5(cid:6)0:06 -4.8±0.08 -17.2±0.01 12.4±0.08 0.8±1.4 11.5±1.4
O–
G2 O (cid:3)5:1 (cid:3)10:6(cid:6)0:07 -7.7±0.09 -17.6±0.01 9.9±0.1 1.4±1.7 8.5±1.6
O–
N
G3 (cid:3)6 (cid:3)16:2(cid:6)0:07 -9.4±0.1 -21.4±0.01 12±0.1 0.7±1.7 11.3±1.7
N+
G4 Br (cid:3)2:4 (cid:3)2:4(cid:6)0:1 -0.7±0.2 -21.1±0.02 20.4±0.2 6.7±2.7 13.7±2.7
O
–O
G5 (cid:3)3:9 (cid:3)14:5(cid:6)0:07 -5.7±0.1 -16.5±0.02 10.8±0.1 1.2±1.8 9.6±1.8
N+
G6 O O (cid:3)4:5 (cid:3)8:9(cid:6)0:06 -5.8±0.09 -18.0±0.02 12.2±0.1 1.4±1.6 10.8±1.6
N+
–O O–
RMSE 6.7 2.1
Correlationcoefficient(r) 0.66 0.89
Allvaluesinkcal/mol.aExperimentalITCbindingfreeenergy,exceptforG4andG5forwhichonlyNMRmeasurementsareavailable[35].
bPredictedbindingfreeenergywithoutadditionofioniceffects. cBindingfreeenergypredictionwiththeadditionofsalteffects. dAverage
interactionenergyoftheboundcomplexat=1.Theuncertaintyisestimatedasthestandarderrorofthemean. eReorganizationfreeenergyas
calculated using Eq.9, fReorganization energy or intra-molecular strain as calculated using Eq.10, gBinding entropy computed by finite
differencemethodandevaluatedat300K;theuncertaintyiscomputedbyerrorpropagationforallthedecompositions
simulated temperatures, the latter is further decomposed predictors of affinity. For example, guest 3, which is the
intoconformationalentropy((cid:3)TDS(cid:5) )andreorganization second weakest binder in the set, is predicted to have the
conf
energy (DE ) components. The conformational entropy strongest interactions with the host (-47.2 kcal/mol).
reorg
component measures the entropy loss for the formation of Conversely, guest 5, which is one of the top binders, has
the complex. The reorganization energy measures the relatively weak interactions with the host (-26.2 kcal/-
intramolecular energy change of the guest and the host as mol). Clearly, as observed previously [32, 36], reorgani-
theychangeconformationstobindtoeachother.Thelatter zation(entropiclossandintramolecularenergystrain)isan
is commonly referred as intramolecular energy strain important determinant of binding.
because it often (but not always) opposes binding. As
further discussed below, decompositions of binding free
energiesprovideusefulphysicalinterpretationsofobserved Discussion
bindingaffinitytrends.Asoftenobserved[32],thebinding
free energy is the result of a large compensation between Water molecules solvating hydrophobic enclosures exhibit
theaveragebindingenergyandreorganizationfreeenergy. unique thermodynamic and structural signatures. Because
Forexample,thepooraffinityofguest4tothemethylocta- they are in concave regions, they have fewer proximal
acid host can be ascribed primarily to a large and unfa- water neighbors with which they can form favorable
vorable reorganization energy (20.4 kcal/mol) rather than hydrogen bonding interactions. Simultaneously, due to the
the strength of the host–guest interaction energy (-21.1 hydrophobicityofthesolute,theyareunabletocompensate
kcal/mol), which is among the most favorable in the set. for these lost interactions by forming favorable hydrogen
The average binding energies for the CB-clip complexes bond contacts with the solute surface. As a result, these
have a particular wide range, from -20.2 kcal/mol to watermoleculesareenergeticallyunfavorablewithrespect
-47.2 kcal/mol. Interestingly, these are imperfect to bulk and their displacement from the surface into the
123
Description:Abstract As part of the SAMPL5 blinded experiment, we computed the absolute binding free energies of 22 host– guest complexes employing a novel approach based on the. BEDAM single-decoupling alchemical free energy proto- col with parallel replica exchange conformational sampling and the