DEER-PREdict: Software for efficient calculation of spin-labeling EPR and NMR data from conformational ensembles

Giulio Tesei, João M. Martins, Micha B. A. Kunze, Yong Wang, Ramon Crehuet, Kresten Lindorff-Larsen

The authors have declared that no competing interests exist.

https://doi.org/10.1371/journal.pcbi.1008551, Volume: 17, Issue: 1, Pages: 1-18

Article Type: Research Article

Publisher: Public Library of Science


Table of Contents

Introduction
Design and implementation
Results
Conclusion
Availability and future directions
Supporting information

Abstract

Owing to their plasticity, intrinsically disordered and multidomain proteins require descriptions based on multiple conformations, thus calling for techniques and analysis tools that are capable of dealing with conformational ensembles rather than a single protein structure. Here, we introduce DEER-PREdict, a software program to predict Double Electron-Electron Resonance distance distributions as well as Paramagnetic Relaxation Enhancement rates from ensembles of protein conformations. DEER-PREdict uses an established rotamer library approach to describe the paramagnetic probes which are bound covalently to the protein. DEER-PREdict has been designed to operate efficiently on large conformational ensembles, such as those generated by molecular dynamics simulation, to facilitate the validation or refinement of molecular models as well as the interpretation of experimental data. The performance and accuracy of the software are demonstrated with experimentally characterized protein systems: HIV-1 protease, T4 Lysozyme and Acyl-CoA-binding protein. DEER-PREdict is open source (GPLv3) and available at github.com/KULL-Centre/DEERpredict and as a Python PyPI package pypi.org/project/DEERPREdict.

The accurate description of the structure of a protein is pivotal to fully understand its biological function. A large fraction of eukaryotic proteins is intrinsically disordered or consists of multiple folded domains connected by disordered regions. The structure of these proteins is highly flexible and can only be described by large ensembles of conformations. The characterization of these ensembles can be achieved by integrating in silico molecular modelling and simulations with experiments. Here, we present DEER-PREdict, an open-source software program to conveniently and efficiently calculate the observables of two biophysical methods, namely double electron-electron resonance (DEER) and paramagnetic relaxation enhancement (PRE) nuclear magnetic resonance. Both techniques provide distance information for highly dynamic systems and involve labelling proteins at one or more sites with flexible probe molecules. The DEER-PREdict package combines previously developed and validated methods for placing multiple conformations of a nitroxide molecule at the protein sites with the rapid calculation of DEER and PRE observables from large ensembles of protein structures. Through examples, we illustrate the use of DEER-PREdict as a tool for interpreting experimental results, validating molecular models of flexible proteins as well as designing experiments.


This is a PLOS Computational Biology Software paper.

Introduction

A detailed understanding of protein function often requires an accurate description of the structure and dynamics of a protein. The characterization of protein complexes as well as multi-domain and disordered proteins is typically achieved by combining experimental techniques of distinct spatial resolution [1]. Among the many different experimental techniques that may be used, we focus here on (i) a pulsed electron paramagnetic resonance (EPR) technique called double electron-electron resonance (DEER) and (ii) a nuclear magnetic resonance (NMR) method called paramagnetic relaxation enhancement (PRE). While the two methods differ substantially in their physics and applications, they have in common that they generally involve adding so-called spin-labels to the protein of interest.

DEER, also sometimes known as pulsed electron-electron double resonance (PELDOR) [2–6], relies on probing magnetic dipole-dipole interactions that are sensitive to distributions of residue-residue distances ranging from ∼1.8 nm to ∼8 nm, and up to 16 nm in deuterated soluble proteins [7–10]. For proteins, DEER generally requires site-directed spin labeling (SDSL) to functionalize a pair of selected residues with paramagnetic probes, e.g. 1-Oxyl-2,2,5,5-tetramethylpyrroline-3-methyl methanethiosulfonate (MTSSL) [4].

PRE NMR also makes use of SDSL to provide information on the average proximity of protein backbone nuclei up to ∼3.5 nm away from the unpaired electron of the paramagnetic probe [11]. The dependence of the rate of relaxation enhancement on the electron-proton distance, r, scales as 〈r⁻⁶〉, making the measurement particularly sensitive to contributions from different probe conformations [11].

Since spin labels are conformationally dynamic, both protein and paramagnetic probes need to be described by conformational ensembles to obtain accurate predictions of DEER and PRE observables from molecular models [12–14]. Molecular dynamics (MD) simulations are one approach to obtain conformational ensembles that model the structure and dynamics of spin-labels for the calculation of EPR and NMR data [15–18]. While such analyses can provide unique insight into the motions of and interactions between protein and spin-label [19], they may be relatively expensive computationally. Further, many studies integrate results from multiple probe positions, or pairs thereof, which may be difficult to represent in a single MD simulation with explicit representations of the probes.

Another approach is to use conformational analysis of the spin-label combined with modelling of the dynamics [20–23]. Such analyses suggest that the conformational variation of spin-labelled sites is rotameric, i.e. it can be relatively well described by a finite number of defined structures. Thus, in the calculation of DEER data, rapid modeling of dynamic paramagnetic probes was made possible with the introduction of the rotamer library approach (RLA) applied to the MTSSL probe by Polyhach et al. [24].

Here, building and expanding on earlier work [3, 24–27], we developed a software tool for fast predictions of DEER and PRE observables from large conformational ensembles using the RLA. We present our implementation, distributed as the DEER-PREdict software, and test it against experimental data on HIV-1 Protease, T4 Lysozyme and the Acyl-CoA-Binding Protein. This software has been previously used for the calculation of both intra- and intermolecular DEER and PRE NMR data [28, 29], and has some overlap with the features in RotamerConvolveMD [25] (github.com/MDAnalysis/RotamerConvolveMD). DEER-PREdict is open-source, documented (deerpredict.readthedocs.io) and open to contributions from the community.

Design and implementation

DEER-PREdict is written in Python and is available as a Python API, which facilitates its integration within larger data pipelines. Predictions of DEER and PRE data are carried out via the DEERpredict and PREpredict classes. Both classes are initialized with protein structures (provided as MDAnalysis [30] Universe objects) and spin-labeled positions (residue numbers and chain IDs). As shown in the Results section, the calculations are triggered by the run function, which also sets additional attributes such as the paths of input and output files as well as experiment-specific parameters. Per-frame data is saved in compressed binary files (HDF5 and pickle files) to allow for fast calculations of ensemble averages in reweighting schemes.

For the presented software, we adopt a procedure of rotamer placement and evaluation of labeled sites which is analogous to the RLA of Polyhach et al. [24], and we build on this previous work to implement fast calculations of DEER and PRE observables from large structural ensembles, such as MD trajectories.

Rotamer library approach

Rotamer libraries have a long history in protein structural analysis [31], with an early application being to study side-chain packing [32]. Several other applications of this approach were later employed, e.g. in homology modeling and protein design [33, 34]. In our implementation, the RLA is used to insert the rotamer conformations of a paramagnetic probe at the spin-labeled site and to calculate the Boltzmann weight of each conformer. By default, we use the MTSSL 175 K rotamer library by Polyhach et al. [24], which was filtered to include only the χ₁ χ₂ conformations that are most commonly found in crystal structures of T4 Lysozyme [35]. As shown by Klose et al. [26], this selection criterion increases the accuracy of the calculated electron-electron distance distributions. The code is, however, general and it is possible to add new rotamer libraries by providing a text file containing the Boltzmann weights of each rotamer state, $p_{i}^{\mathrm{int}}$, a topology file (PDB format) and a trajectory file (DCD format) where rotamers are aligned with respect to the plane defined by the Cα atom and the C–N peptide bond. These files should be included in the lib folder and listed in the yaml file DEERPREdict/lib/libraries.yml. The default MTSSL 298 K MC/UFF CαSδ rotamer libraries of the Matlab-based MMM modeling toolbox [13] are also provided in the DEER-PREdict package.

Following the alignment of the rotamer to the protein backbone (Cα, C and N atoms), the calculation of the Boltzmann weights is based on the sum of internal, $\epsilon_{i}^{\mathrm{int}}$, and external, $\epsilon_{i}^{\mathrm{ext}}$, energy contributions. The internal contribution is taken from Polyhach et al. [24] and results from the clustering of representative dihedral combinations from MD simulations. The normalized frequency of each cluster throughout the MD trajectory was used to determine the Boltzmann probability, $p_{i}^{\mathrm{int}}$, of a given ith state, which can readily be converted into an internal energy contribution, $\epsilon_{i}^{\mathrm{int}}$, via Boltzmann inversion. On the other hand, the external energy contribution is calculated on the fly as the dispersion interaction energy between heavy atoms of rotamer and protein residues within a 1-nm cutoff, using the pairwise 6-12 Lennard-Jones potential of the CHARMM36 force field, with atom sizes scaled by the input parameter sigma_scaling, which defaults to 0.5 as in the MMM modeling toolbox (http://www.epr.ethz.ch/software) [13].
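The two energy terms described above can be illustrated with a short sketch: a rotamer population is converted into an internal energy by Boltzmann inversion, and a CHARMM-style 6-12 Lennard-Jones pair energy is summed within a cutoff. The function names, the way `sigma_scaling` enters (here it multiplies the combined R_min) and the atom tuples are assumptions made for illustration only; they are not taken from the DEER-PREdict source code.

```python
import math

KB = 0.0083145  # Boltzmann constant in kJ/(mol K)

def boltzmann_inversion(p_int, temperature=298.0):
    """Internal energy (kJ/mol) of a rotamer state with population p_int."""
    return -KB * temperature * math.log(p_int)

def lj_energy(r, eps_i, eps_j, rmin_half_i, rmin_half_j, sigma_scaling=0.5):
    """CHARMM-style 6-12 pair energy (kJ/mol) at separation r (nm).

    eps_* are positive well depths (kJ/mol), rmin_half_* are Rmin/2 values (nm).
    Assumption: sigma_scaling multiplies the combined Rmin of the pair.
    """
    eps = math.sqrt(eps_i * eps_j)                      # geometric mean of well depths
    rmin = sigma_scaling * (rmin_half_i + rmin_half_j)  # scaled sum of Rmin/2
    x6 = (rmin / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)

def external_energy(rotamer_atoms, protein_atoms, cutoff=1.0, sigma_scaling=0.5):
    """Sum pair energies between rotamer and protein heavy atoms within cutoff (nm).

    Each atom is a hypothetical tuple (x, y, z, eps, rmin_half).
    """
    total = 0.0
    for xa, ya, za, ea, ra in rotamer_atoms:
        for xb, yb, zb, eb, rb in protein_atoms:
            r = math.sqrt((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2)
            if r < cutoff:
                total += lj_energy(r, ea, eb, ra, rb, sigma_scaling)
    return total
```

With this functional form the pair energy has its minimum, of depth equal to the combined well depth, at the scaled R_min, so halving the atom sizes lets the probe approach the protein more closely before a clash is penalized.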

The overall probability of the ith rotamer state is then calculated as

$$p_{i} = \frac{p_{i}^{\mathrm{int}} \exp\left(-\epsilon_{i}^{\mathrm{ext}} / k_{B}T\right)}{Z}$$

where

$$Z = \sum_{i} p_{i}^{\mathrm{int}} \exp\left(-\epsilon_{i}^{\mathrm{ext}} / k_{B}T\right)$$

is the steric partition function quantifying the fit of the rotamer in the embedding protein conformation. Low values of Z result from large probe-protein van der Waals interaction energies, suggesting a tight placement of the spin label either due to a displacement of the rotamers or indicative of a wild-type conformation made inaccessible by the presence of the MTSSL probe. Especially in folded proteins, probes located in closely packed regions are likely to induce changes in the ensemble of the spin-labeled protein compared to the native form, and should be avoided in designing SDSL experiments. Therefore, in the calculation of DEER or PRE NMR observables, frames with Z < 0.05 are discarded to preclude spurious conformers from contributing to the ensemble average. For the MTSSL 175 K rotamer library, a Z cutoff of 0.05 is compatible with distributions of $\epsilon_{i}^{\mathrm{ext}}$ values where at most one of the 46 rotamers has $\epsilon_{i}^{\mathrm{ext}} \approx 3\,k_{B}T$ while the rest have $\epsilon_{i}^{\mathrm{ext}} \leq 7\,k_{B}T$. We observed that the results shown in this paper are virtually insensitive to the choice of the Z cutoff between 0.05 and 0.5. In DEER-PREdict, the default Z cutoff can be conveniently replaced by a user-provided value.
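The reweighting of the rotamer populations and the Z-based frame selection can be sketched in a few lines. This is a minimal illustration under the definitions given in this section (library populations multiplied by the Boltzmann factor of the external energy and normalized by Z); the function name and return convention are hypothetical and do not reflect the package internals.

```python
import math

def rotamer_weights(p_int, eps_ext, kT=1.0, z_cutoff=0.05):
    """Return (weights, Z) for one labeled site in one protein conformation.

    p_int   : intrinsic rotamer populations from the library (sum to 1)
    eps_ext : external probe-protein energies, in the same units as kT
    weights is None when Z falls below the cutoff, i.e. the frame is discarded.
    """
    boltz = [p * math.exp(-e / kT) for p, e in zip(p_int, eps_ext)]
    z = sum(boltz)
    if z < z_cutoff:
        return None, z
    return [b / z for b in boltz], z

# A site where one rotamer clashes strongly with the protein
weights, z = rotamer_weights([0.5, 0.3, 0.2], [0.0, 1.0, 10.0])
```

In the absence of any probe-protein interaction, Z equals 1 and the weights reduce to the library populations; a site where every rotamer clashes yields a Z close to zero and the frame is skipped.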

Predicting the DEER signal from structural ensembles

Electron-electron distance distributions extracted from DEER experiments, e.g. using the DeerLab package [36], have routinely been compared with distributions predicted using the RLA implemented in the Matlab-based MMM modeling toolbox (http://www.epr.ethz.ch/software) [13]. Since MMM intrinsically operates on single structures, we and others had to resort to wrapper scripts to compute distance distributions of large ensembles, such as MD trajectories [3, 25, 37]. With the program presented herein, we provide a tool to conveniently predict DEER distance distributions from large conformational ensembles, which can be easily integrated in reweighting schemes such as the Bayesian/maximum entropy procedure [1, 14, 38, 39].

For each trajectory frame or conformation of a given ensemble, the rotamers from the library are placed at the spin-labeled position (Fig 1A) and the distances between all pair combinations of N-O paramagnetic centers are calculated. The resulting matrix of pair-wise distances is then used to compute the distance distribution weighted by the combined probability of each probe conformation, p_i × p_j, with p_i and p_j being the conformation probabilities of rotamers i and j. After averaging over all the frames, a low-pass filter is applied to the distance distribution for noise reduction [40],

$$P(r) = \mathcal{F}^{-1}\left\{ \mathcal{F}\left[p(r)\right] \exp\left(-2\pi^{2}\sigma^{2}\nu^{2}\right) \right\}$$

where p(r) is the unfiltered frame-averaged distribution, ν is the frequency conjugate to r, $\mathcal{F}$ and $\mathcal{F}^{-1}$ are the Fourier transform and inverse Fourier transform operators, respectively, whereas σ is the standard deviation of the low-pass filter. The resulting P(r) is a smooth curve even for the analysis of a single protein conformation (Fig 1B). The standard deviation of the low-pass filter can readily be provided by the user through the option filter_stdev of the run function in the DEERpredict class, overriding the default value of 0.5 Å. The average over the trajectory frames can be weighted by a user-specified list of weights e.g. to remove the bias from enhanced sampling simulations.
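The per-frame calculation described above can be sketched as a weighted histogram of all rotamer-rotamer distances followed by Gaussian smoothing. The sketch below is an illustration, not the package code: the function names and the bin width are assumptions, and the smoothing is done as a direct real-space convolution with a Gaussian, which is the real-space counterpart of a Gaussian low-pass filter applied in Fourier space.

```python
import math

def distance_distribution(pos_a, w_a, pos_b, w_b, r_max=12.0, dr=0.05):
    """Weighted histogram of N-O to N-O distances (nm) for one frame.

    pos_a, pos_b : paramagnetic-center positions of the rotamers at the two sites
    w_a, w_b     : Boltzmann weights of the rotamers (each list sums to 1)
    """
    n_bins = int(round(r_max / dr))
    hist = [0.0] * n_bins
    for pa, wa in zip(pos_a, w_a):
        for pb, wb in zip(pos_b, w_b):
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
            k = int(d / dr)
            if k < n_bins:
                hist[k] += wa * wb  # combined probability p_i * p_j
    return hist

def gaussian_smooth(hist, dr=0.05, stdev=0.05):
    """Convolve a histogram with a normalized Gaussian of width stdev (nm)."""
    half = int(math.ceil(4 * stdev / dr))
    kernel = [math.exp(-0.5 * (k * dr / stdev) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel)
    kernel = [g / norm for g in kernel]
    out = [0.0] * len(hist)
    for i, h in enumerate(hist):
        if h == 0.0:
            continue
        for k, g in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(out):
                out[j] += h * g
    return out
```

Because the kernel is normalized, the smoothing preserves the area of the distribution as long as the populated bins lie away from the edges of the distance grid.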

Fig 1

Probe placement scheme and comparison to DEER data.

(A) A pool of 46 conformations of the MTSSL probe from the rotamer library are aligned to the backbone of residues K55 and K55’ of HIV-1 protease. The color code represents the Boltzmann weights of each rotamer, increasing from blue to red. (B) Electron-electron distance distribution for HIV-1 protease spin labeled at residues K55 and K55’. The blue line is the experimental data from Torbeev et al. [44] whereas the red line is the prediction using DEER-PREdict and a crystal structure of HIV-1 protease (PDB code 3BVB).

The dipolar modulation signal can be back-calculated from the distance distribution, P(r), via the following integral [41]

$$S(t) = \int_{0}^{r_{max}} K(r, t)\, P(r)\, dr \tag{3}$$

K(r, t) is the DEER kernel

$$K(r, t) = \sqrt{\frac{\pi}{6 \omega t}} \left[ \cos(\omega t)\, \mathrm{FrC}\left(\sqrt{\frac{6 \omega t}{\pi}}\right) + \sin(\omega t)\, \mathrm{FrS}\left(\sqrt{\frac{6 \omega t}{\pi}}\right) \right] \tag{4}$$

where FrC and FrS are Fresnel cosine and sine functions, and ω is the dipolar frequency

$$\omega = \frac{\mu_{0} \mu_{B}^{2} g^{2}}{4 \pi \hbar r^{3}} \tag{5}$$

where μ₀ is the permeability of free space, μ_B is the Bohr magneton, g is the electron g-factor and ħ is the reduced Planck constant. The ranges of inter-probe distance and time are [0, r_max] and [t_min, t_max] with increments dr = 0.05 nm and dt, respectively. The default values r_max = 12 nm, t_min = 0.01 μs, t_max = 5.5 μs and dt = 0.01 μs can be overridden by the user. Following the correction of the experimental DEER time trace for the intermolecular background, the resulting form factor can directly be compared with

$$F(t) = 1 - \lambda \left[ 1 - S(t) \right] \tag{6}$$

where 0.02 ≤ λ ≤ 0.5 is the modulation depth of the experimental signal, quantifying the efficiency of the DEER pump pulse.
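The back-calculation of the form factor from a distance distribution can be sketched numerically as follows. This is a minimal illustration rather than the DEER-PREdict implementation: it assumes the standard orientation-averaged dipolar kernel, K(r, t) = ∫₀¹ cos[(3x² − 1)ωt] dx with x the cosine of the angle between the inter-spin vector and the magnetic field, evaluates it with a midpoint rule instead of Fresnel functions, and assumes g = 2.0023. All function names are hypothetical.

```python
import math

MU_0 = 1.25663706212e-6   # vacuum permeability, N A^-2
MU_B = 9.2740100783e-24   # Bohr magneton, J T^-1
HBAR = 1.054571817e-34    # reduced Planck constant, J s
G_E = 2.0023              # free-electron g-factor (assumed for the nitroxide)

def dipolar_frequency(r_nm):
    """Dipolar angular frequency in rad/us for an inter-spin distance in nm."""
    r_m = r_nm * 1e-9
    omega = MU_0 * MU_B ** 2 * G_E ** 2 / (4 * math.pi * HBAR * r_m ** 3)  # rad/s
    return omega * 1e-6

def kernel(r_nm, t_us, n=2000):
    """Orientation-averaged kernel K(r, t), midpoint rule over x = cos(theta)."""
    wt = dipolar_frequency(r_nm) * t_us
    return sum(math.cos((3 * ((k + 0.5) / n) ** 2 - 1) * wt) for k in range(n)) / n

def form_factor(r_grid, p_r, t_grid, mod_depth):
    """F(t) = 1 - lambda * (1 - S(t)), with S(t) the P(r)-weighted kernel."""
    dr = r_grid[1] - r_grid[0]
    norm = sum(p_r) * dr  # normalize P(r) to unit area
    out = []
    for t in t_grid:
        s = sum(kernel(r, t) * p for r, p in zip(r_grid, p_r)) * dr / norm
        out.append(1.0 - mod_depth * (1.0 - s))
    return out
```

At t = 0 the kernel equals 1 and the form factor starts at 1; at long times the kernel decays and the form factor approaches 1 − λ.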

Prediction of PRE rates and intensity ratios

In analogy to the calculations of electron-electron distances to predict DEER distributions, we extended the use of the RLA to electron-proton separations to improve the accuracy of PRE predictions. We focus here on PRE NMR experiments that probe the increase in transverse relaxation rates of any backbone proton due to the dipolar interaction with the unpaired electron of the paramagnetic probe:

$$\Gamma_{2} = R_{2}^{ox} - R_{2}^{red}$$

where $R_{2}^{ox}$ and $R_{2}^{red}$ are the transverse relaxation rates in the presence of the spin label in the oxidized or reduced (diamagnetic) state, respectively. We note that it is also possible to measure PREs on other atoms and to probe longitudinal relaxation enhancement, and it would be possible to include such measurements in future versions of DEER-PREdict.

A description of the enhancement of the transverse relaxation due to dipole-dipole interactions in paramagnetic solutions was first proposed by Solomon and Bloembergen [45, 46]

$$\Gamma_{2} = \frac{1}{15} \left( \frac{\mu_{0}}{4 \pi} \right)^{2} \gamma_{I}^{2} g^{2} \mu_{B}^{2} s_{e} (s_{e} + 1) \left[ 4 J(0) + 3 J(\omega_{I}) \right]$$

where γ_I and ω_I are the gyromagnetic ratio and the Larmor frequency of the proton, respectively, whereas s_e is the electron spin quantum number, equal to 1/2 for nitroxide probe systems. The spectral density function J(ω_I) can be described using a model-free formalism, which takes into account the overall molecular tumbling in the external magnetic field as well as the internal motion of the spin label:

$$J(\omega_{I}) = \langle r^{-6} \rangle \left[ \frac{S^{2} \tau_{c}}{1 + \omega_{I}^{2} \tau_{c}^{2}} + \frac{(1 - S^{2}) \tau_{t}}{1 + \omega_{I}^{2} \tau_{t}^{2}} \right]$$

where

$$\tau_{c} = \left( \tau_{r}^{-1} + \tau_{s}^{-1} \right)^{-1}$$

and

$$\tau_{t} = \left( \tau_{r}^{-1} + \tau_{s}^{-1} + \tau_{i}^{-1} \right)^{-1}$$

τ_r is the rotational correlation time of the protein, τ_s is the effective electron correlation time and τ_i is the correlation time of the internal motion (effective correlation time of the spin label). For MTSSL probes, τ_s ≫ τ_r and τ_c ≈ τ_r. The value of τ_c depends on protein size and structure and is generally of the order of 1–10 ns. For τ_i, values between 100 and 500 ps can be assumed, based on e.g. ¹⁵N spin relaxation rates and MD simulations. In general, τ_c and τ_i can be specified as user input in DEER-PREdict.
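To give a feeling for the magnitudes involved, the sketch below evaluates the Solomon-Bloembergen rate with a model-free spectral density. It assumes the commonly used form J(ω) = ⟨r⁻⁶⟩[S²τ_c/(1 + ω²τ_c²) + (1 − S²)τ_t/(1 + ω²τ_t²)], g = 2.0023 and SI units throughout; the function names and default arguments are illustrative and are not the DEER-PREdict API.

```python
import math

MU0_OVER_4PI = 1e-7        # T m A^-1
GAMMA_H = 2.6752218744e8   # proton gyromagnetic ratio, rad s^-1 T^-1
MU_B = 9.2740100783e-24    # Bohr magneton, J T^-1
G_E = 2.0023               # electron g-factor (assumed)
S_E = 0.5                  # electron spin quantum number of a nitroxide

def spectral_density(omega, r6_avg, s2, tau_c, tau_t):
    """Model-free spectral density; r6_avg is <r^-6> in m^-6, times in s."""
    return r6_avg * (s2 * tau_c / (1 + (omega * tau_c) ** 2)
                     + (1 - s2) * tau_t / (1 + (omega * tau_t) ** 2))

def gamma_2(r6_avg, s2, tau_c=2e-9, tau_t=2e-10, larmor_mhz=750.0):
    """Transverse PRE rate in s^-1 for one proton."""
    omega = 2 * math.pi * larmor_mhz * 1e6  # proton Larmor frequency, rad/s
    prefactor = (MU0_OVER_4PI * GAMMA_H * G_E * MU_B) ** 2 * S_E * (S_E + 1) / 15
    return prefactor * (4 * spectral_density(0.0, r6_avg, s2, tau_c, tau_t)
                        + 3 * spectral_density(omega, r6_avg, s2, tau_c, tau_t))

# Rigid probe (S^2 = 1) at 1.5 nm from the proton
rate = gamma_2((1.5e-9) ** -6, 1.0)
```

For a rigid probe the rate scales strictly as r⁻⁶, so doubling the electron-proton distance reduces the PRE by a factor of 64.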

For the generalized order parameter, S, we use the factorization into contributions from radial and angular internal motions introduced by Brüschweiler et al. [49], $S^{2} = S_{\mathrm{radial}}^{2} S_{\mathrm{angular}}^{2}$. The expressions for $S_{\mathrm{radial}}^{2}$ and $S_{\mathrm{angular}}^{2}$ were derived from a jump model that treats the N conformers of the rotamer library as N discrete states with equal probabilities (1/N) [50]. In reality, the various dihedral angles of the spin label have different free energy barriers, resulting in residence times between jumps ranging from less than 1 to several ns [17]. The radial order parameter is

$$S_{\mathrm{radial}}^{2} = \frac{\langle r^{-3} \rangle^{2}}{\langle r^{-6} \rangle}$$

where r is the proton-electron distance and the brackets denote averages over the conformers weighted by the respective Boltzmann weights, p_i, i.e.

$$\langle r^{-3} \rangle = \sum_{i}^{N} r_{i}^{-3} p_{i}$$

and

$$\langle r^{-6} \rangle = \sum_{i}^{N} r_{i}^{-6} p_{i}$$

The angular order parameter is

$$S_{\mathrm{angular}}^{2} = \sum_{i}^{N} \sum_{j}^{N} p_{i} p_{j} \frac{3 \cos^{2} \Omega_{ij} - 1}{2}$$

where Ω_ij is the angle between the vectors r_i and r_j, connecting a backbone proton with the ith and jth rotamer states, respectively. The relaxation enhancement rate for a single protein structure is calculated using the Solomon-Bloembergen equation, and assuming that the motion of the paramagnetic label is much faster than the protein conformational changes, the ensemble average is estimated as

$$\langle \Gamma_{2} \rangle = \sum_{l=1}^{M} w_{l}\, \Gamma_{2,l}$$

where M is the number of configurations or frames of the simulation trajectory. In the case of unbiased simulations, the statistical weights, w_l, are simply 1/M. Optionally, a list of weights can be provided by the user, e.g. to reweight a biased MD simulation or to incorporate the prediction of the PRE rates into a Bayesian/maximum entropy reweighting scheme.
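The rotamer averages entering the order parameters, and the weighted average over frames, can be sketched as below. The sketch assumes the jump-model expressions S²_radial = ⟨r⁻³⟩²/⟨r⁻⁶⟩ and S²_angular = Σ_i Σ_j p_i p_j (3cos²Ω_ij − 1)/2 with Boltzmann-weighted averages; the function names are hypothetical and positions can be in any consistent length unit.

```python
import math

def order_parameters(proton, electrons, weights):
    """Return (<r^-6>, S2_radial, S2_angular) for one proton and N rotamer states.

    proton    : (x, y, z) of the backbone proton
    electrons : list of (x, y, z) of the unpaired electron in each rotamer state
    weights   : Boltzmann weights of the rotamer states (sum to 1)
    """
    vecs = [[e - h for e, h in zip(pos, proton)] for pos in electrons]
    dists = [math.sqrt(sum(c * c for c in v)) for v in vecs]
    r3 = sum(w * d ** -3 for w, d in zip(weights, dists))
    r6 = sum(w * d ** -6 for w, d in zip(weights, dists))
    s2_radial = r3 ** 2 / r6
    s2_angular = 0.0
    for vi, di, wi in zip(vecs, dists, weights):
        for vj, dj, wj in zip(vecs, dists, weights):
            cos_omega = sum(a * b for a, b in zip(vi, vj)) / (di * dj)
            s2_angular += wi * wj * (1.5 * cos_omega ** 2 - 0.5)
    return r6, s2_radial, s2_angular

def ensemble_average(values, frame_weights=None):
    """Weighted mean over frames; uniform weights (1/M) when none are given."""
    n = len(values)
    if frame_weights is None:
        frame_weights = [1.0 / n] * n
    total = sum(frame_weights)
    return sum(w * v for w, v in zip(frame_weights, values)) / total
```

A single rotamer state gives S² = 1 (rigid probe), whereas two equally populated states at the same distance but in perpendicular directions give S²_radial = 1 and S²_angular = 0.25.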

For samples with particularly high PRE rates it can be infeasible to obtain Γ₂ from multiple time-point measurements [60]. In such and other cases, the PRE is sometimes probed indirectly from the ratio of the peak intensities in ¹H,¹⁵N-HSQC spectra of the spin-labeled protein in the oxidized and reduced state. Assuming that the intensity of the proton magnetization decays exponentially, by transverse relaxation only, during the total INEPT time of the HSQC measurement [61], t_d, the intensity ratio is estimated as

$$\frac{I_{ox}}{I_{red}} = \frac{R_{2} \exp\left(-\Gamma_{2} t_{d}\right)}{R_{2} + \Gamma_{2}}$$

where R₂ is the transverse relaxation rate of the backbone proton in the reduced (diamagnetic) state.
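The conversion from a PRE rate to an HSQC intensity ratio can be sketched as follows, assuming the widely used expression I_ox/I_red = R₂ exp(−Γ₂ t_d)/(R₂ + Γ₂), in which R₂ is the transverse relaxation rate in the diamagnetic state. The function name and default values (taken from the ACBP example parameters, R₂ = 12.6 s⁻¹ and t_d = 10 ms) are illustrative.

```python
import math

def intensity_ratio(gamma_2, r_2=12.6, delay=1e-2):
    """Oxidized-to-reduced peak intensity ratio.

    gamma_2 : PRE rate in s^-1
    r_2     : transverse relaxation rate in the diamagnetic state, s^-1
    delay   : total INEPT time t_d in s
    """
    return r_2 * math.exp(-gamma_2 * delay) / (r_2 + gamma_2)

# Ratios for a series of increasing PRE rates (s^-1)
ratios = [intensity_ratio(g) for g in (0.0, 5.0, 20.0, 100.0)]
```

The ratio equals 1 in the absence of a PRE and decreases monotonically towards 0 as the proton approaches the unpaired electron.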

Requirements and installation

The main requirements are Python 3.6–3.8 and MDAnalysis 1.0 [30, 62]. In an environment with Python 3.6–3.8, DEER-PREdict can readily be installed through the package manager PIP by executing

    pip install DEERPREdict

Package stability

Tests reproducing DEER and PRE data for the protein systems studied in this article, as well as for a nanodisc [29], are performed automatically using Travis CI (travis-ci.com/github/KULL-Centre/DEERpredict) every time the code is modified on the GitHub repository. The same tests can also be run locally using the test running tool pytest.

Results

In the following, we present applications of our tool to the prediction of DEER distance distributions and PRE intensity ratios of three folded proteins.

The code snippets reported in this section pertain to DEER-PREdict version 0.1.7. A Jupyter Notebook to reproduce the results shown below (article.ipynb) can be found in the tests/data folder on the GitHub repository. Up-to-date documentation is available at deerpredict.readthedocs.io.

Case study 1: DEER data for HIV-1 protease

HIV-1 protease (HIV-1PR) is a homodimeric aspartic hydrolase involved in the cleavage of the gag-pol polyprotein complex. The inhibition of this process affects the life cycle of the HIV-1 virus, rendering it noninfectious [63]. The HIV-1PR monomer is composed of 99 residues and presents a structurally stable core region (residues 1-43 and 58-99) and a dynamic region characterized by a β-hairpin turn, called the flap (residues 44-57). The active site is located at the interstice between the core regions of the two monomers, in proximity to the catalytic D25 residues. This cavity is closed off by the dynamic flap regions, which are considered to act as a gate controlling the access to the active site. The dynamics of the flap regions are of utmost importance for the development of inhibitors, and have been extensively studied, both experimentally and in silico [44, 64–69]. Based on the relative position of the flaps, three main conformational states have been proposed. In X-ray crystallography, the closed state is typically observed for the ligand-bound enzyme (e.g. PDB codes 3BVB [70] and 2BPX [71]), the semi-open state is predominant for the apo form (e.g. PDB code 1HHP [72]) whereas the wide-open state has been observed for variants (e.g. PDB codes 1TW7 [73] and 1RPI [74]) [69]. In DEER measurements, these conformational states can be resolved by spin-labeling sites K55 and K55’ (see S1 Text and S2 Fig).

To assess the predictive ability of DEER-PREdict, we generated conformational ensembles of the HIV-1PR homodimer via two different approaches: (a) a single 500-ns unbiased MD simulation, and (b) four independent 125-ns MD simulations restrained with experimental residual dipolar couplings (RDC) data [58, 75] from Roche et al. [65, 66] (see S1 Text for methodological details). The initial configuration of our simulations is the X-ray crystal structure of the active-site mutant D25N bound to the inhibitor Darunavir (PDB code 3BVB).

Fig 2 presents a comparison of experimental DEER distance distributions and echo intensity curves with predictions from simulation trajectories of 1,000 frames sampled every 0.5 ns. The echo intensity curves are calculated using Eq 6, where λ is estimated to be 0.0922 by fitting the experimental dipolar evolution function to the corresponding curve derived from the experimental P(r) via Eq 3. For a single trajectory, the analysis is performed in 13 s on a 1.7 GHz processor by running the following code:

    1 import MDAnalysis
    2 from DEERPREdict.DEER import DEERpredict
    3 u = MDAnalysis.Universe('conf.pdb', 'traj.xtc')
    4 DEER = DEERpredict(u, residues=[55, 55], chains=['A', 'B'], temperature=298)
    5 DEER.run()

The third line generates the MDAnalysis Universe object from an XTC trajectory and a PDB topology. The fourth line initializes the DEERpredict object with the spin-labeled residue numbers and the respective chain IDs. The fifth line runs the calculations and saves per-frame and ensemble-averaged data to res-55-55.hdf5 and res-55-55.dat, respectively, as well as the steric partition functions of sites K55 and K55’ to the file res-Z-55-55.dat.
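The estimate of the modulation depth mentioned above can be obtained in several ways; one simple option, shown here as an assumption rather than as the procedure used for Fig 2, is a linear least-squares fit. Writing the form factor as F(t) = 1 − λ[1 − S(t)], the value of λ that minimizes the squared deviation from the experimental trace has a closed form. The function name is hypothetical.

```python
def fit_modulation_depth(f_exp, s_calc):
    """Least-squares modulation depth for F(t) = 1 - lambda * (1 - S(t)).

    f_exp  : background-corrected experimental form factor
    s_calc : dipolar modulation signal computed from P(r) on the same time grid
    """
    num = sum((1.0 - f) * (1.0 - s) for f, s in zip(f_exp, s_calc))
    den = sum((1.0 - s) ** 2 for s in s_calc)
    return num / den

# Synthetic check: a noise-free trace generated with a known modulation depth
s_calc = [1.0, 0.6, 0.2, 0.1, 0.05]
f_exp = [1.0 - 0.0922 * (1.0 - s) for s in s_calc]
lam = fit_modulation_depth(f_exp, s_calc)
```

Because the model is linear in λ, no iterative optimizer is needed, and the fit can be repeated cheaply for each candidate ensemble.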

Fig 2

Comparing experiments and simulations for HIV-1PR.

DEER distance distributions (A) and echo intensity curves (B) obtained by Torbeev et al. [44] from DEER experiments (blue), and calculated using DEER-PREdict from unbiased (orange) and RDC ensemble-biased MD simulations (red).

In the experimental distance distribution, the main peak at ∼3.3 nm corresponds to the closed state whereas the second peak between 4 and 5 nm is characteristic of the wide-open state. The shoulder peak at ∼2.8 nm has been identified as an open-like state known as the curled/tucked conformation [9, 76, 77]. The results of our unbiased and restrained simulations are in substantial agreement with the findings of Roche et al. [65, 66], indicating that the flaps of the inhibitor-free HIV-1PR are predominantly in closed conformation. Compared to the distance distribution calculated from the starting configuration of PDB code 3BVB (see Fig 1), predictions based on MD trajectories more accurately reproduce the shape of the shoulder and the main peak of the experimental P(r). Moreover, using the RDC data as restraints leads to a significant improvement in the agreement between simulations and experiments, with the RMSD decreasing from 0.07 for the unbiased to 0.03 for the RDC ensemble-biased simulations. However, in the simulations we do not observe the wide-open state. This discrepancy could be due to insufficient sampling or could be attributed to the difference in sequence between the simulated protein and the experimental construct.

Case study 2: DEER data for T4 lysozyme

Lysozyme from the T4 bacteriophage (T4L) has long been used as a model system in the study of protein structure and dynamics [78–83]. Here, we focus on the L99A and the triple L99A-G113A-R119P mutants which are structurally similar and mainly differ in the relative populations of their major conformational states. The L99A variant presents a 150 Å³ hydrophobic pocket capable of binding hydrophobic ligands and has been thoroughly studied to further our understanding of the dynamics and selectivity of the binding pocket [78, 84]. The L99A variant occupies two distinct conformational states: the ground state (G) and the transient excited state (E), accounting for 97% and 3% of the population, respectively. The large-scale motions converting the G into the E state occur on the millisecond time scale and result in the occlusion of the cavity, which is occupied by the side chain of F114 in the E state [82]. The additional G113A and R119P mutations in the triple-mutant variant invert the populations of the conformational states to 4% for the G state and 96% for the E state [82]. Note that, here and in the following, we refer to the G and E states based on their structural similarity to the L99A variant rather than on their relative populations. These conformational equilibria have been studied by DEER for various pairs of spin-labeled sites, which effectively resolve the G and E states as separate peaks of the P(r) [83].

Here, we compare DEER distance distributions calculated with DEER-PREdict for two pairs of probe positions (D89C–T109C and T109C–N140C) with the corresponding experimental data by Lerch et al. [83]. First, we calculate the P(r) of the single states using PDB code 3DMV for the G states and PDB codes 2LCB and 2LC9 for the E states of single and triple mutants, respectively. Second, the P(r)’s are linearly combined based on the experimentally derived ratios of G and E populations (97:3 for L99A and 4:96 for L99A-G113A-R119P) [82]. Additionally, we predict DEER distance distributions from previously reported metadynamics MD simulations of L99A and L99A-G113A-R119P [80] (see S1 Text for methodological details). In these calculations, the average over the trajectory is weighted by exp(F_bias/k_BT), where F_bias is the final static bias for each frame and k_BT is the thermal energy. The analysis of a trajectory of 6,670 frames is performed in 3 min on a 1.7 GHz processor executing the following lines of code:

    1 import MDAnalysis
    2 from DEERPREdict.DEER import DEERpredict
    3 import numpy as np
    4 u = MDAnalysis.Universe('conf.pdb', 'traj.xtc')
    5 for residues in [[89, 109], [109, 140]]:
    6     DEER = DEERpredict(u, residues=residues, temperature=298, z_cutoff=0.1)
    7     DEER.run(weights=np.exp(Fbias/(0.298*8.3145)))  # Fbias: array of per-frame F_bias values, defined beforehand

In line six, we specify the positions of the spin-labels, the temperature at which the metadynamics simulations were performed and a non-default value for the Z cutoff. In line seven, we provide the weights of each trajectory frame, generated from the array of F_bias values.
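The frame weights used for the metadynamics trajectories can be sketched as follows. The sketch assumes that F_bias is given in kJ/mol, so that k_BT = 0.298 × 8.3145 kJ/mol at 298 K; the subtraction of the maximum bias (to avoid overflow in the exponential) and the normalization to unit sum are illustrative choices that leave the weighted average unchanged, and the function name is hypothetical.

```python
import math

def bias_weights(f_bias, temperature=298.0):
    """Normalized frame weights proportional to exp(F_bias / k_B T).

    f_bias : final static bias of each frame, in kJ/mol (assumed unit)
    """
    kT = 0.0083145 * temperature           # thermal energy in kJ/mol
    shift = max(f_bias)                    # subtract the maximum for numerical stability
    w = [math.exp((f - shift) / kT) for f in f_bias]
    total = sum(w)
    return [x / total for x in w]

# Frames visited under a larger bias receive a larger statistical weight
weights = bias_weights([0.0, 2.5, 5.0, 10.0])
```

Frames with equal bias receive equal weights, so an unbiased trajectory reduces to the uniform average over frames.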

Fig 3 shows a comparison between the experimental distance distributions obtained by Lerch et al. [83] and our predictions. In general, the calculated distributions fall within the experimental ranges of inter-probe distances and are particularly accurate for the D89C–T109C spin-labeled pair in metadynamics simulations. The sharper shape of the experimental P(r)’s, relative to the calculated distributions, could be due to the cryogenic temperatures at which DEER experiments are conducted, whereas simulations were performed at room temperature. For the T109C–N140C spin-labeled pair of the triple variant, the discrepancy between predicted and experimental P(r)’s might be explained by considering that distances shorter than 1.5 nm fall below the range probed by DEER experiments. On the other hand, the inaccuracy of the predicted T109C–N140C P(r) for the single (L99A) variant is greater than expected. Such discrepancies may be due to errors in the protein structure or in the DEER calculations. While our results cannot distinguish between these scenarios, we follow previous work [14] by examining whether the discrepancies can be attributed to the error on the Boltzmann probabilities of the rotamer states, $p_{i}^{\mathrm{int}}$. We thus use a Bayesian/maximum entropy (BME) procedure to show that a small change in the original rotamer weights can lead to a substantial improvement of the agreement with the experimental data (see S1 Text and S3 Fig).

Fig 3

Comparing experiments with simulations and structures of T4 lysozyme variants.

DEER distance distributions for probe positions (A) D89C–T109C and (B) T109C–N140C of the single (blue) and the triple variant (red). Solid lines are the experimental data by Lerch et al. [83], dotted lines are calculated from PDB codes and dashed lines are predictions from metadynamics (MTD) simulations by Wang and coworkers [80].

Case study 3: PRE data for Acyl-CoA-binding protein

The RLA is well known in the EPR community and generally favored over e.g. a Cα-based approach as discussed elsewhere [3, 13, 26]. In the presented software, we apply the same improved modeling of the probe flexibility also to the prediction of PRE rates and intensity ratios.

Our test data is the PRE data for the bovine Acyl-coenzyme A Binding Protein (ACBP) reported by Teilum et al. [53]. In this study the structural behavior of ACBP under native and mildly-denaturing conditions was investigated via the SDSL of five positions in the amino acid sequence: T17C, V36C, M46C, S65C and I86C. Here, we focus on the native state of ACBP for which an NMR structure comprising 20 conformers has been refined from residual dipolar couplings (RDC) and deposited in the Protein Data Bank (PDB code 1NTI). Fig 4 shows a comparison between the experimental data and the intensity ratios calculated from the Γ₂ values averaged over the 20 conformations of the PDB entry. A good overall agreement is achieved across the different probe positions. Notably, using the RDC-refined structure, we reproduce most of the structural features observed in the PRE experiments, including the proximity of residues 24, 27, 31 and 34 to the spin-labeled residue 86, which is consistent with a helix-turn-helix motif. The predicted intensity ratios are generated in 1.5 s on a 1.7 GHz processor executing the following code:

    1 import MDAnalysis
    2 from DEERPREdict.PRE import PREpredict
    3 u = MDAnalysis.Universe('1nti.pdb')
    4 for res in [17, 36, 46, 65, 86]:
    5     PRE = PREpredict(u, res, temperature=298, atom_selection='H')
    6     PRE.run(tau_c=2e-09, tau_t=2*1e-10, delay=1e-2, r_2=12.6, wh=750)

At line three, we load PDB code 1NTI as an MDAnalysis Universe object. We then use a for loop to calculate the PRE data from the distances between amide protons and the spin-label N-O group at five different positions along the amino acid sequence. In the last line we specify τ_c = 2 ns, τ_t = 0.2 ns, t_d = 10 ms, R₂ = 12.6 s⁻¹ and ω_I = 2π × 750 MHz. Per-frame and ensemble-averaged PRE data are automatically saved to files named res-*.pkl and res-*.dat, respectively, whereas per-frame steric partition functions are saved to res-Z-*.dat.
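The conversion from transverse PRE rates to the intensity ratios compared in Fig 4 can be illustrated with a minimal sketch. It assumes the commonly used single-exponential expression of Battiste and Wagner, I_para/I_dia = R₂ exp(−Γ₂ t_d)/(R₂ + Γ₂), evaluated with the R₂ and t_d values quoted above. The function name and the Γ₂ values are hypothetical and are not part of the DEER-PREdict API.

```python
import math

def intensity_ratio(gamma_2, r_2=12.6, delay=1e-2):
    """Return I_para/I_dia for a transverse PRE rate gamma_2 (s^-1).

    r_2 is the transverse relaxation rate of the diamagnetic sample (s^-1)
    and delay is the delay t_d (s) during which relaxation is active.
    """
    return r_2 * math.exp(-gamma_2 * delay) / (r_2 + gamma_2)

# Illustrative PRE rates in s^-1 (hypothetical values, not from the paper)
for gamma_2 in (0.0, 5.0, 50.0, 500.0):
    ratio = intensity_ratio(gamma_2)
    print(f"Gamma_2 = {gamma_2:6.1f} s^-1 -> I_para/I_dia = {ratio:.3f}")
```

The ratio equals one in the absence of a PRE and decreases monotonically as Γ₂ grows, which is why residues close to the spin label appear as dips in the profiles.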

Fig 4

Calculated and experimental PRE HSQC intensity ratios for the T17C, V36C, M46C, S65C and I86C mutants of ACBP.

Blue lines represent the experimental data [53], with the associated ±0.1 error shown by the blue shaded areas. Red lines represent intensity ratios calculated from PDB code 1NTI with τ_c = 2 ns, τ_t = 0.2 ns, t_d = 10 ms, R₂ = 12.6 s⁻¹.

As detailed in S1 Text and S4 Fig, the steric partition functions provided by DEER-PREdict can be used to predict whether a position in the sequence is likely to accommodate the paramagnetic probe within the wild-type structure. Besides aiding the interpretation of experimental data, this feature can be instrumental to designing and enhancing the success-rate of time- and labor-intensive SDSL experiments.
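How a steric partition function can flag crowded labeling sites is sketched below. The sketch assumes the usual rotamer-library form, Z = Σ_i p_i^int exp(−E_i^ext/k_BT), where E_i^ext is the probe-protein interaction energy of rotamer i. The intrinsic weights and energies are made-up numbers and the helper name is hypothetical; this is not the DEER-PREdict implementation.

```python
import math

K_B = 0.0083144626  # molar Boltzmann (gas) constant in kJ mol^-1 K^-1

def reweight_rotamers(p_int, e_ext, temperature=298.0):
    """Combine intrinsic rotamer weights with Boltzmann factors.

    p_int: intrinsic rotamer probabilities (assumed to sum to 1)
    e_ext: probe-protein interaction energies in kJ/mol
    Returns the steric partition function Z and the normalized weights.
    """
    beta = 1.0 / (K_B * temperature)
    boltz = [p * math.exp(-beta * e) for p, e in zip(p_int, e_ext)]
    z = sum(boltz)
    return z, [b / z for b in boltz]

# Hypothetical four-rotamer library at two hypothetical sites
p_int = [0.4, 0.3, 0.2, 0.1]
exposed = [0.0, 0.5, 1.0, 0.2]     # weak clashes: most rotamers fit
buried = [40.0, 25.0, 60.0, 30.0]  # strong clashes: few rotamers fit

for label, energies in (("exposed", exposed), ("buried", buried)):
    z, weights = reweight_rotamers(p_int, energies)
    print(f"{label}: Z = {z:.3g}, weights = {[round(w, 3) for w in weights]}")
```

In this toy setting a value of Z close to one indicates that the probe fits with little steric penalty, whereas a value several orders of magnitude smaller suggests that the site is unlikely to accommodate the probe without structural rearrangement.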

As previously discussed, the explicit treatment of the paramagnetic probe may be crucial for the accurate back-calculation of DEER data, and even more so for PRE predictions, due to the 〈r⁻⁶〉-dependence of the PRE. A common way to restrain MD simulations or to back-calculate PRE experimental data without explicitly simulating the paramagnetic probe is to approximate the electron location by the position of the Cβ atom of the spin-labeled residue [85]. The advantages of this approach are that (a) multiple labeling sites can be analyzed in a single simulation and (b) the atom used as a proxy for the electron is explicitly present in the simulation, making the calculation of PREs straightforward. Cβ-based calculations may, however, be prone to over- or underestimating electron-proton distances by several Å, thereby introducing a systematic error. The impact of the Cβ-approximation on the accuracy of PRE predictions is illustrated in S5 and S6 Figs for the case of ACBP (see also S1 Text).
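The sensitivity of the PRE to the electron position can be made concrete with a small numerical sketch. It compares the distance from an amide proton to a single Cβ-like point with the 〈r⁻⁶〉-averaged effective distance over a cloud of electron positions. All coordinates are synthetic, and the 0.3 nm spread and the offset of the cloud from the Cβ atom are arbitrary assumptions rather than properties of any rotamer library.

```python
import math
import random

random.seed(0)  # reproducible synthetic data

proton = (0.0, 0.0, 0.0)        # amide proton position (nm)
c_beta = (1.2, 0.0, 0.0)        # hypothetical C-beta position (nm)
probe_center = (1.5, 0.4, 0.0)  # hypothetical center of the electron cloud (nm)

# Synthetic electron positions scattered around the probe center
electrons = [
    tuple(c + random.gauss(0.0, 0.3) for c in probe_center)
    for _ in range(1000)
]

distances = [math.dist(proton, e) for e in electrons]
mean_r = sum(distances) / len(distances)
# Effective distance entering the PRE: <r^-6>^(-1/6)
r_eff = (sum(r ** -6 for r in distances) / len(distances)) ** (-1.0 / 6.0)
r_cbeta = math.dist(proton, c_beta)

print(f"C-beta distance:            {r_cbeta:.2f} nm")
print(f"mean electron distance:     {mean_r:.2f} nm")
print(f"<r^-6>-effective distance:  {r_eff:.2f} nm")
```

Because short distances dominate the 〈r⁻⁶〉 average, the effective distance is never larger than the arithmetic mean, which is one way in which a single-point approximation can misestimate the PRE.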

Conclusion

We have introduced an open-source software program with a fast implementation of the RLA in tandem with protein ensemble averaging, for the calculation of DEER and PRE data. Using three examples, we have highlighted the capabilities of our implementation: (a) the extension of the RLA for DEER data from a protein ensemble and (b) the calculation of PRE rates and intensity ratios with the same approach.

The structural interpretation of DEER and PRE measurements requires an accurate treatment of the structure and conformational heterogeneity of the spin labels. In the presented software, this is achieved using the RLA and, in the case of the PRE, a model-free approach to describe the dynamics. Relative to simulations of the explicitly spin-labeled mutants, the RLA presents the particular advantage of enabling the prediction for multiple SDSL experiments from a single simulation of the wild type sequence.

Availability and future directions

The software is implemented using the popular trajectory analysis package MDAnalysis, version 1.0 [30] and is available on GitHub at github.com/KULL-Centre/DEERpredict. DEER-PREdict is also distributed as a PyPI package (pypi.org/project/DEERPREdict) and archived on Zenodo (DOI: 10.5281/zenodo.3968394). DEER-PREdict and MDAnalysis are published and distributed under GPL licenses, version 3 and 2, respectively.

DEER-PREdict has a general framework and can be readily extended to encompass non-protein biomolecules as well as additional rotamer libraries of paramagnetic groups. Moreover, the software can be augmented with a module to predict Förster resonance energy transfer data, combining the insertion routines already implemented for MTSSL probes with rotamer libraries for fluorescent dyes.

Acknowledgements

We thank Robert Best for help with RDC-restrained simulations as well as for work on extending DEER-PREdict to the prediction of FRET experiments.

References

Orioli S, Larsen AH, Bottaro S, Lindorff-Larsen K. How to learn from inconsistencies: Integrating molecular simulations with experimental data. In: Computational Approaches for Understanding Dynamical Systems: Protein Folding and Assembly. Elsevier; 2020. p. 123–176.

Pannier M, Veit S, Godt A, Jeschke G, Spiess HW. Dead-Time Free Measurement of Dipole–Dipole Interactions between Electron Spins. Journal of Magnetic Resonance. 2000;142(2):331–340. 10.1006/jmre.1999.1944

Bagnéris C, Rogala KB, Baratchian M, Zamfir V, Kunze MBA, Dagless S, et al. Probing the Solution Structure of IκB Kinase (IKK) Subunit γ and Its Interaction with Kaposi Sarcoma-associated Herpes Virus Flice-interacting Protein and IKK Subunit β by EPR Spectroscopy. Journal of Biological Chemistry. 2015;290(27):16539–16549. 10.1074/jbc.M114.622928

Klare JP, Steinhoff HJ. Spin labeling EPR. Photosynthesis Research. 2009;102(2-3):377–390. 10.1007/s11120-009-9490-7

Phan G, Remaut H, Wang T, Allen WJ, Pirker KF, Lebedev A, et al. Crystal structure of the FimD usher bound to its cognate FimC–FimH substrate. Nature. 2011;474(7349):49–53. 10.1038/nature10109

Klose D, Voskoboynikova N, Orban-Glass I, Rickert C, Engelhard M, Klare JP, et al. Light-induced switching of HAMP domain conformation and dynamics revealed by time-resolved EPR spectroscopy. FEBS Letters. 2014;588(21):3970–3976. 10.1016/j.febslet.2014.09.012

Schmidt T, Wälti MA, Baber JL, Hustedt EJ, Clore GM. Long Distance Measurements up to 160 Å in the GroEL Tetradecamer Using Q-Band DEER EPR Spectroscopy. Angewandte Chemie International Edition. 2016;55(51):15905–15909. 10.1002/anie.201609617

Edwards TH, Stoll S. A Bayesian approach to quantifying uncertainty from experimental noise in DEER spectroscopy. J Magn Reson. 2016;270:87–97. 10.1016/j.jmr.2016.06.021

Casey TM, Fanucci GE. Spin labeling and Double Electron-Electron Resonance (DEER) to Deconstruct Conformational Ensembles of HIV Protease. In: Methods in Enzymology. Elsevier; 2015. p. 153–187.

Jeschke G. DEER Distance Measurements on Proteins. Annual Review of Physical Chemistry. 2012;63(1):419–446. 10.1146/annurev-physchem-032511-143716

Clore GM, Iwahara J. Theory, practice, and applications of paramagnetic relaxation enhancement for the characterization of transient low-population states of biological macromolecules and their complexes. Chem Rev. 2009;109(9):4108–4139. 10.1021/cr900033p

Fajer P, Fajer M, Zawrotny M, Yang W. Simulation of spin label structure and its implication in molecular characterization. Methods in enzymology. 2015;563:623.

Jeschke G. MMM: A toolbox for integrative structure modeling. Protein Science. 2017;27(1):76–85. 10.1002/pro.3269

Reichel K, Stelzl LS, Köfinger J, Hummer G. Precision DEER distances from spin-label ensemble refinement. The journal of physical chemistry letters. 2018;9(19):5748–5752. 10.1021/acs.jpclett.8b02439

Budil DE, Sale KL, Khairy KA, Fajer PG. Calculating slow-motional electron paramagnetic resonance spectra from molecular dynamics using a diffusion operator approach. The Journal of Physical Chemistry A. 2006;110(10):3703–3713. 10.1021/jp054738k

Ding F, Layten M, Simmerling C. Solution structure of HIV-1 protease flaps probed by comparison of molecular dynamics simulation ensembles and EPR experiments. Journal of the American Chemical Society. 2008;130(23):7184–7185. 10.1021/ja800893d

Sezer D, Freed JH, Roux B. Parametrization, molecular dynamics simulation, and calculation of electron spin resonance spectra of a nitroxide spin label on a polyalanine α-helix. The journal of physical chemistry B. 2008;112(18):5755–5767. 10.1021/jp711375x

Xue Y, Skrynnikov NR. Motion of a disordered polypeptide chain as studied by paramagnetic relaxation enhancements, 15N relaxation, and molecular dynamics simulations: how fast is segmental diffusion in denatured ubiquitin? Journal of the American Chemical Society. 2011;133(37):14614–14628. 10.1021/ja201605c

Sasmal S, Lincoff J, Head-Gordon T. Effect of a Paramagnetic Spin Label on the Intrinsically Disordered Peptide Ensemble of Amyloid-β. Biophysical journal. 2017;113(5):1002–1011. 10.1016/j.bpj.2017.06.067

Robinson B, Slutsky L, Auteri F. Direct simulation of continuous wave electron paramagnetic resonance spectra from Brownian dynamics trajectories. The Journal of chemical physics. 1992;96(4):2609–2616. 10.1063/1.462869

Steinhoff HJ, Hubbell WL. Calculation of electron paramagnetic resonance spectra from Brownian dynamics trajectories: application to nitroxide side chains in proteins. Biophysical journal. 1996;71(4):2201–2212. 10.1016/S0006-3495(96)79421-3

Tombolato F, Ferrarini A, Freed JH. Dynamics of the nitroxide side chain in spin-labeled proteins. The Journal of Physical Chemistry B. 2006;110(51):26248–26259. 10.1021/jp0629487

Tombolato F, Ferrarini A, Freed JH. Modeling the effects of structure and dynamics of the nitroxide side chain on the ESR spectra of spin-labeled proteins. The Journal of Physical Chemistry B. 2006;110(51):26260–26271. 10.1021/jp062949z

Polyhach Y, Bordignon E, Jeschke G. Rotamer libraries of spin labelled cysteines for protein studies. Phys Chem Chem Phys. 2011;13(6):2356–2366. 10.1039/C0CP01865A

Stelzl LS, Fowler PW, Sansom MSP, Beckstein O. Flexible Gates Generate Occluded Intermediates in the Transport Cycle of LacY. J Mol Biol. 2014;426(3):735–751. 10.1016/j.jmb.2013.10.024

Klose D, Klare JP, Grohmann D, Kay CWM, Werner F, Steinhoff HJ. Simulation vs. Reality: A Comparison of In Silico Distance Predictions with DEER and FRET Measurements. PloS One. 2012;7(6):e39492. 10.1371/journal.pone.0039492

Salmon L, Nodet G, Ozenne V, Yin G, Jensen MR, Zweckstetter M, et al. NMR Characterization of Long-Range Order in Intrinsically Disordered Proteins. J Am Chem Soc. 2010;132(24):8407–8418. 10.1021/ja101645g

Milkovic NM, Thomasen FE, Cuneo MJ, Grace CR, Martin EW, Nourse A, et al. Interplay of folded domains and the disordered low-complexity domain in mediating hnRNPA1 phase separation. BioRxiv [Preprint]. 2020; bioRxiv 2020.05.15.096966 [cited 2020 Aug 12]. Available from: 10.1101/2020.05.15.096966.

Bengtsen T, Holm VL, Kjølbye LR, Midtgaard SR, Johansen NT, Tesei G, et al. Structure and dynamics of a nanodisc by integrating NMR, SAXS and SANS experiments with molecular dynamics simulations. eLife. 2020;9. 10.7554/eLife.56518

Michaud-Agrawal N, Denning EJ, Woolf TB, Beckstein O. MDAnalysis: A toolkit for the analysis of molecular dynamics simulations. Journal of Computational Chemistry. 2011;32(10):2319–2327. 10.1002/jcc.21787

Dunbrack RL Jr. Rotamer libraries in the 21st century. Current opinion in structural biology. 2002;12(4):431–440. 10.1016/S0959-440X(02)00344-5

Ponder JW, Richards FM. Tertiary templates for proteins. J Mol Biol. 1987;193(4):775–791. 10.1016/0022-2836(87)90358-5

Bower MJ, Cohen FE, Dunbrack RL. Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: A new homology modeling tool. J Mol Biol. 1997;267(5):1268–1282. 10.1006/jmbi.1997.0926

Desjarlais JR, Handel TM. De novo design of the hydrophobic cores of proteins. Protein Sci. 1995;4(10):2006–2018. 10.1002/pro.5560041006

Fleissner MR, Cascio D, Hubbell WL. Structural origin of weakly ordered nitroxide motion in spin-labeled proteins. Protein Sci. 2009;18(5):893–908. 10.1002/pro.96

Ibáñez LF, Jeschke G, Stoll S. DeerLab: A comprehensive toolbox for analyzing dipolar EPR spectroscopy data. Magnetic Resonance Discussions. 2020.

Göddeke H, Timachi MH, Hutter CAJ, Galazzo L, Seeger MA, Karttunen M, et al. Atomistic Mechanism of Large-Scale Conformational Transition in a Heterodimeric ABC Exporter. Journal of the American Chemical Society. 2018;140(13):4543–4551. 10.1021/jacs.7b12944

Köfinger J, Stelzl LS, Reuter K, Allande C, Reichel K, Hummer G. Efficient ensemble refinement by reweighting. Journal of chemical theory and computation. 2019;15(5):3390–3401. 10.1021/acs.jctc.8b01231

Bottaro S, Bengtsen T, Lindorff-Larsen K. Integrating Molecular Simulation and Experimental Data: A Bayesian/Maximum Entropy Reweighting Approach. In: Methods in Molecular Biology. Springer US; 2020. p. 219–240.

Hayes J. Some observations on digital smoothing of electroanalytical data based on the Fourier Transformation. Analytical Chemistry. 1973;45(2):277–284.

Worswick SG, Spencer JA, Jeschke G, Kuprov I. Deep neural network processing of DEER data. Science Advances. 2018;4(8):eaat5218. 10.1126/sciadv.aat5218

Jeschke G, Chechik V, Ionita P, Godt A, Zimmermann H, Banham J, et al. DeerAnalysis2006—a comprehensive software package for analyzing pulsed ELDOR data. Applied Magnetic Resonance. 2006;30(3-4):473–498. 10.1007/BF03166213

del Alamo D, Tessmer MH, Stein RA, Feix JB, Mchaourab HS, Meiler J. Rapid Simulation of Unprocessed DEER Decay Data for Protein Fold Prediction. Biophysical Journal. 2020;118(2):366–375. 10.1016/j.bpj.2019.12.011

Torbeev VY, Raghuraman H, Mandal K, Senapati S, Perozo E, Kent SBH. Dynamics of “Flap” Structures in Three HIV-1 Protease/Inhibitor Complexes Probed by Total Chemical Synthesis and Pulse-EPR Spectroscopy. J Am Chem Soc. 2009;131(3):884–885. 10.1021/ja806526z

Solomon I. Relaxation Processes in a System of Two Spins. Phys Rev. 1955;99(2):559. 10.1103/PhysRev.99.559

Bloembergen N. Proton Relaxation Times in Paramagnetic Solutions. J Chem Phys. 1957;27(2):572–573. 10.1063/1.1743771

Lipari G, Szabo A. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. Journal of the American Chemical Society. 1982;104(17):4546–4559. 10.1021/ja00381a009

Olejniczak ET, Dobson CM, Karplus M, Levy RM. Motional averaging of proton nuclear Overhauser effects in proteins. Predictions from a molecular dynamics simulation of lysozyme. J Am Chem Soc. 1983;106:1923–1930. 10.1021/ja00319a004

Brueschweiler R, Roux B, Blackledge M, Griesinger C, Karplus M, Ernst RR. Influence of rapid intramolecular motion on NMR cross-relaxation rates. A molecular dynamics study of antamanide in solution. J Am Chem Soc. 1991;114:2289–2302. 10.1021/ja00033a002

Iwahara J, Schwieters CD, Clore GM. Ensemble Approach for NMR Structure Refinement against 1H Paramagnetic Relaxation Enhancement Data Arising from a Flexible Paramagnetic Group Attached to a Macromolecule. J Am Chem Soc. 2004;126(18):5879–5896. 10.1021/ja031580d

Clore GM, Iwahara J. Theory, Practice, and Applications of Paramagnetic Relaxation Enhancement for the Characterization of Transient Low-Population States of Biological Macromolecules and Their Complexes. Chemical Reviews. 2009;109(9):4108–4139. 10.1021/cr900033p

Bibow S, Ozenne V, Biernat J, Blackledge M, Mandelkow E, Zweckstetter M. Structural Impact of Proline-Directed Pseudophosphorylation at AT8, AT100, and PHF1 Epitopes on 441-Residue Tau. Journal of the American Chemical Society. 2011;133(40):15842–15845. 10.1021/ja205836j

Teilum K, Kragelund BB, Poulsen FM. Transient Structure Formation in Unfolded Acyl-coenzyme A-binding Protein Observed by Site-directed Spin Labelling. J Mol Biol. 2002;324(2):349–357. 10.1016/S0022-2836(02)01039-2

Liu W, Liu X, Zhu G, Lu L, Yang D. A Method for Determining Structure Ensemble of Large Disordered Protein: Application to a Mechanosensing Protein. Journal of the American Chemical Society. 2018;140(36):11276–11285. 10.1021/jacs.8b04792

Gomes GNW, Krzeminski M, Namini A, Martin EW, Mittag T, Head-Gordon T, et al. Conformational Ensembles of an Intrinsically Disordered Protein Consistent with NMR, SAXS, and Single-Molecule FRET. Journal of the American Chemical Society. 2020;142(37):15697–15710. 10.1021/jacs.0c02088

Khan SN, Charlier C, Augustyniak R, Salvi N, Déjean V, Bodenhausen G, et al. Distribution of Pico- and Nanosecond Motions in Disordered Proteins from Nuclear Spin Relaxation. Biophysical Journal. 2015;109(5):988–999. 10.1016/j.bpj.2015.06.069

Rezaei-Ghaleh N, Parigi G, Zweckstetter M. Reorientational Dynamics of Amyloid-β from NMR Spin Relaxation and Molecular Simulation. The Journal of Physical Chemistry Letters. 2019;10(12):3369–3375. 10.1021/acs.jpclett.9b01050

Camilloni C, Cavalli A, Vendruscolo M. Replica-Averaged Metadynamics. J Chem Theory Comput. 2013;9(12):5610–5617. 10.1021/ct4006272

Bussi G, Donadio D, Parrinello M. Canonical sampling through velocity rescaling. J Chem Phys. 2007. 10.1063/1.2408420

Ryan VH, Dignon GL, Zerze GH, Chabata CV, Silva R, Conicella AE, et al. Mechanistic View of hnRNPA2 Low-Complexity Domain Structure, Interactions, and Phase Separation Altered by Mutation and Arginine Methylation. Molecular Cell. 2018;69(3):465–479.e7. 10.1016/j.molcel.2017.12.022

Battiste JL, Wagner G. Utilization of site-directed spin labeling and high-resolution heteronuclear nuclear magnetic resonance for global fold determination of large proteins with limited nuclear overhauser effect data. Biochemistry. 2000;39(18):5355–5365. 10.1021/bi000060h

Gowers R, Linke M, Barnoud J, Reddy T, Melo M, Seyler S, et al. MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations. In: Proceedings of the 15th Python in Science Conference. SciPy; 2016. p. 98–105. Available from: 10.25080/majora-629e541a-00e.

Kohl NE, Emini EA, Schleif WA, Davis LJ, Heimbach JC, Dixon RA, et al. Active human immunodeficiency virus protease is required for viral infectivity. Proc Natl Acad Sci U S A. 1988;85(13):4686–4690. 10.1073/pnas.85.13.4686

Blackburn ME, Veloro AM, Fanucci GE. Monitoring inhibitor-induced conformational population shifts in HIV-1 protease by pulsed EPR spectroscopy. Biochemistry. 2009;48(37):8765–8767. 10.1021/bi901201q

Roche J, Louis JM, Bax A. Conformation of inhibitor-free HIV-1 protease derived from NMR spectroscopy in a weakly oriented solution. ChemBioChem. 2015;16(2):214–218. 10.1002/cbic.201402585

Roche J, Louis JM, Bax A, Best RB. Pressure-induced structural transition of mature HIV-1 Protease from a combined NMR/MD simulation approach. Proteins. 2015;83(12):2117–2123. 10.1002/prot.24931

Hornak V, Okur A, Rizzo RC, Simmerling C. HIV-1 protease flaps spontaneously close to the correct structure in simulations following manual placement of an inhibitor into the open state. J Am Chem Soc. 2006;128(9):2812–2813. 10.1021/ja058211x

Huang X, Britto MD, Kear-Scott JL, Boone CD, Rocca JR, Simmerling C, et al. The Role of Select Subtype Polymorphisms on HIV-1 Protease Conformational Sampling and Dynamics. J Biol Chem. 2014;289(24):17203–17214. 10.1074/jbc.M114.571836

Liu Z, Casey TM, Blackburn ME, Huang X, Pham L, de Vera IMS, et al. Pulsed EPR characterization of HIV-1 protease conformational sampling and inhibitor-induced population shifts. Physical Chemistry Chemical Physics. 2016;18(8):5819–5831. 10.1039/c5cp04556h

Sayer JM, Liu F, Ishima R, Weber IT, Louis JM. Effect of the Active Site D25N Mutation on the Structure, Stability, and Ligand Binding of the Mature HIV-1 Protease. Journal of Biological Chemistry. 2008;283(19):13459–13470. 10.1074/jbc.M708506200

Munshi S, Chen Z, Li Y, Olsen DB, Fraley ME, Hungate RW, et al. Rapid X-ray diffraction analysis of HIV-1 protease–inhibitor complexes: inhibitor exchange in single crystals of the bound enzyme. Acta Crystallographica Section D Biological Crystallography. 1998;54(5):1053–1060. 10.1107/S0907444998003588

Spinelli S, Liu QZ, Alzari PM, Hirel PH, Poljak RJ. The three-dimensional structure of the aspartyl protease from the HIV-1 isolate BRU. Biochimie. 1991;73(11):1391–1396. 10.1016/0300-9084(91)90169-2

Martin P, Vickrey JF, Proteasa G, Jimenez YL, Wawrzak Z, Winters MA, et al. “Wide-Open” 1.3 Å Structure of a Multidrug-Resistant HIV-1 Protease as a Drug Target. Structure. 2005;13(12):1887–1895. 10.1016/j.str.2005.11.005

Logsdon BC, Vickrey JF, Martin P, Proteasa G, Koepke JI, Terlecky SR, et al. Crystal Structures of a Multidrug-Resistant Human Immunodeficiency Virus Type 1 Protease Reveal an Expanded Active-Site Cavity. Journal of Virology. 2004;78(6):3123–3132. 10.1128/JVI.78.6.3123-3132.2004

Camilloni C, Vendruscolo M. A tensor-free method for the structural and dynamical refinement of proteins using residual dipolar couplings. J Phys Chem B. 2015;119(3):653–661. 10.1021/jp5021824

Heaslet H, Rosenfeld R, Giffin M, Lin YC, Tam K, Torbett BE, et al. Conformational flexibility in the flap domains of ligand-free HIV protease. Acta Crystallographica Section D Biological Crystallography. 2007;63(8):866–875. 10.1107/S0907444907029125

Huang X, Britto MD, Kear-Scott JL, Boone CD, Rocca JR, Simmerling C, et al. The Role of Select Subtype Polymorphisms on HIV-1 Protease Conformational Sampling and Dynamics. Journal of Biological Chemistry. 2014;289(24):17203–17214. 10.1074/jbc.M114.571836

Eriksson AE, Baase WA, Wozniak JA, Matthews BW. A cavity-containing mutant of T4 lysozyme is stabilized by buried benzene. Nature. 1992;355(6358):371–373. 10.1038/355371a0

Kato H, Vu ND, Feng H, Zhou Z, Bai Y. The folding pathway of T4 lysozyme: an on-pathway hidden folding intermediate. J Mol Biol. 2007;365(3):881–891. 10.1016/j.jmb.2006.10.048

Wang Y, Papaleo E, Lindorff-Larsen K. Mapping transiently formed and sparsely populated conformations on a complex energy landscape. eLife. 2016;5. 10.7554/eLife.17505

Mulder FA, Hon B, Muhandiram DR, Dahlquist FW, Kay LE. Flexibility and ligand exchange in a buried cavity mutant of T4 lysozyme studied by multinuclear NMR. Biochemistry. 2000;39(41):12614–12622. 10.1021/bi001351t

Bouvignies G, Vallurupalli P, Hansen DF, Correia BE, Lange O, Bah A, et al. Solution structure of a minor and transiently formed state of a T4 lysozyme mutant. Nature. 2011;477(7362):111–114. 10.1038/nature10349

Lerch MT, López CJ, Yang Z, Kreitman MJ, Horwitz J, Hubbell WL. Structure-relaxation mechanism for the response of T4 lysozyme cavity mutants to hydrostatic pressure. Proc Natl Acad Sci U S A. 2015;112(19):E2437–46. 10.1073/pnas.1506505112

Eriksson AE, Baase WA, Zhang XJ, Heinz DW, Blaber M, Baldwin EP, et al. Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect. Science. 1992;255(5041):178–183. 10.1126/science.1553543

Tribello GA, Bonomi M, Branduardi D, Camilloni C, Bussi G. PLUMED 2: New feathers for an old bird. Computer Physics Communications. 2014;185(2):604–613. 10.1016/j.cpc.2013.09.018