Edited by Alan R. Fersht, University of Cambridge, Cambridge, United Kingdom, and approved November 23, 2020 (received for review July 8, 2020)
Author contributions: Z.L., Q.W., and J.M. designed research; Z.L., A.A.C.-A., and L.L. performed research; Z.L., Q.W., and J.M. analyzed data; and Z.L., Q.W., and J.M. wrote the paper.
Three-dimensional refinement is a critical component of cryo-EM single-particle reconstruction. In this paper, we report the development of a computational method, OPUS-SSRI, and its application to seven real cryo-EM datasets. Our data clearly demonstrated that OPUS-SSRI can improve the final resolutions and structural details in cryo-EM single-particle analysis.
In this paper, we present a refinement method for cryo-electron microscopy (cryo-EM) single-particle reconstruction, termed OPUS-SSRI (Sparseness and Smoothness Regularized Imaging). In OPUS-SSRI, spatially varying sparseness and smoothness priors are incorporated to improve the regularity of the electron density map, and a type of real-space penalty function is designed. Moreover, we define the back-projection step as a local kernel regression and propose a first-order method to solve the resulting optimization problem. On the seven cryo-EM datasets that we tested, the average improvement in resolution by OPUS-SSRI over that from RELION 3.0, the commonly used image-processing software for single-particle cryo-EM, was 0.64 Å, with the largest improvement being 1.25 Å. We expect OPUS-SSRI to be an invaluable tool to the broad field of cryo-EM single-particle analysis. The implementation of OPUS-SSRI can be found at https://github.com/alncat/cryoem.
Cryo-electron microscopy (cryo-EM) single-particle analysis is a powerful method for determining macromolecular structures. The major advantages of cryo-EM over traditional X-ray crystallography are that it does not require crystallization and is not plagued by the phase problem. However, there remain many new challenges in this promising technique. The central problem of cryo-EM single-particle analysis is the incompleteness of experimental observations. More specifically, information on the relative orientations and translations of all particles is missing. Furthermore, in a dataset with multiple conformations (or substates), the class membership of each particle needs to be defined. Moreover, the signal-to-noise ratio (SNR) of a cryo-EM dataset is often very low since the electron exposure of the sample needs to be strictly limited to reduce radiation damage (1). Other problems often present in cryo-EM datasets include nonuniform angular sampling, which frequently results in inadequate sampling or even no sampling in certain orientations (2). Therefore, cryo-EM three-dimensional (3D) reconstruction is an extremely ill-posed problem. To alleviate this ill-posedness, prior assumptions must be incorporated into the reconstruction process to ensure the uniqueness of the solution and the objectivity of the final maps.
Two outstanding features of 3D density maps are sparseness and smoothness. Specifically, since the atoms in macromolecules only occupy part of the 3D maps, the macromolecular maps are often sparse in space. On the other hand, because the atoms in macromolecules are connected through chemical bonds, the electron densities of macromolecules vary smoothly across space (3). Though sparseness is a popular prior in solving inverse problems, it is a relatively novel notion in cryo-EM 3D reconstruction. In contrast, the importance of the smoothness prior is widely recognized in cryo-EM 3D refinement. An early attempt to enforce the smoothness of the density map was to apply the Wiener filter (4). Later approaches improved upon the Wiener filter by using Bayesian statistics (3). Scheres et al. assumed a priori that the Fourier components of the density map follow Gaussian distributions (3) and derived a maximum a posteriori estimation for reconstruction. This approach, as implemented in REgularised LIkelihood OptimisatioN (RELION) (5), is referred to as the traditional approach in the context of this paper. Apart from incorporating priors into the reconstruction process, another line of effort aims to enhance cryo-EM 3D refinement by optimizing the defocus parameter and class membership for each particle, as exemplified by THUNDER (6). THUNDER has been shown to improve cryo-EM refinement by providing a more accurate contrast transfer function and class membership for each particle.
In this paper, we continued the direction used in RELION and proposed an approach to regularize the 3D maps. Our approach, named OPUS-SSRI (Sparseness and Smoothness Regularized Imaging), focuses on imposing sparseness and smoothness priors (i.e., regularization) (7) and total variation (TV) (8). To encourage sparseness and smoothness of the density map while suppressing bias, we proposed a nonconcave, nonsmooth, real-space restraint by combining
We tested OPUS-SSRI by performing 3D refinement on a total of seven real datasets and comparing the refinement results with those obtained using RELION 3.0 or THUNDER. The detailed experimental process and optimal parameters are reported in the SI Appendix.
According to the gold-standard Fourier shell correlation (FSC) at 0.143, the final density maps reconstructed by OPUS-SSRI clearly have higher SNRs compared to those generated by RELION 3.0 in most resolution shells for β-galactosidase (10, 11) (Fig. 1A), 80S ribosome (12) (Fig. 1B), influenza hemagglutinin (HA) (13) (Fig. 1C), transient receptor potential melastatin (TRPM4) (14) (Fig. 1D), protein-conducting ERAD channel Hrd1/Hrd3 complex (15) (Fig. 2A), transient receptor potential vanilloid 5 (TRPV5) (16) (Fig. 3A), and calcium-activated chloride channel (TMEM16A) in nanodisc (17) (Fig. 4A). The final maps refined by OPUS-SSRI have resolutions that are 0.15 to 1.25 Å better than those refined by RELION 3.0, with an average resolution improvement of 0.64 Å for all seven systems (Table 1). The improvement of the density maps reconstructed by OPUS-SSRI is also confirmed by the model versus map FSCs. The postprocessed maps of OPUS-SSRI have much higher correlations with respect to the corresponding rigid-body fitted atomic models in most resolution shells than those of RELION 3.0 (SI Appendix, Fig. S1). Overall, for the seven systems, the improvements in resolution for the postprocessed maps of OPUS-SSRI are in the range of 0.14 to 0.73 Å, with an average of 0.30 Å, over those refined by RELION 3.0 (SI Appendix, Table S1).


Gold-standard unmasked and masked FSC curves for the final 3D reconstructions refined by OPUS-SSRI (in red color) or RELION 3.0 (in blue color) for (A) β-galactosidase, (B) 80S ribosome, (C) influenza hemagglutinin, and (D) TRPM4. In all panels, the dashed black line represents FSC = 0.143.


Refinement of Hrd1/Hrd3 complex. (A) Gold-standard unmasked and masked FSC curves calculated from two independent reconstructions by OPUS-SSRI or RELION 3.0. The dashed black line represents FSC = 0.143. (B) Final reconstructed cryo-EM map using RELION 3.0. (C) Final reconstructed cryo-EM map using OPUS-SSRI. The red rectangle defines a region of the EM map to be enlarged in D for RELION 3.0 and E for OPUS-SSRI for residues 142 to 175, respectively. The EM density is represented in mesh (blue), and the atomic model is shown in a ribbons diagram with side chains in stick presentation. Both density maps are contoured at the same level.


Refinement of TRPV5. (A) Gold-standard unmasked and masked FSC curves calculated from two independent reconstructions by OPUS-SSRI or RELION 3.0. The dashed black line represents FSC = 0.143. (B) Final reconstructed cryo-EM map using RELION 3.0. (C) Final reconstructed cryo-EM map using OPUS-SSRI. The red rectangles in B and C define a region of the EM map to be enlarged in D for RELION 3.0 and E for OPUS-SSRI for residues 374 to 409, respectively. The dashed red circles highlight a region in the model before (D) and after (E) the manual adjustments in COOT and structural refinement using PHENIX. The EM density is represented in mesh (blue), and the structural model is represented by a ribbons diagram with side chains in stick presentation. Both density maps are contoured at the same level.


Refinement of TMEM16A in nanodisc. (A) Gold-standard unmasked and masked FSC curves calculated from two independent reconstructions by OPUS-SSRI or RELION 3.0. The dashed black line represents FSC = 0.143. (B) Final reconstructed cryo-EM map using RELION 3.0. (C) Final reconstructed cryo-EM map using OPUS-SSRI. The dashed red rectangles in B and C define a region of EM map to be enlarged in D for RELION 3.0 and E for OPUS-SSRI for residues 408 to 440, respectively. The solid red rectangles in B and C define a region of the EM map to be enlarged in F for RELION 3.0 and G for OPUS-SSRI for residues 848 to 884, respectively. The EM density is represented in mesh (blue), and the structural model is represented by a ribbons diagram with side chains in stick presentation. Both density maps are contoured at the same level.

Resolutions are reported at gold-standard FSC = 0.143.

| Proteins | RELION resolution (Å) | THUNDER resolution (Å) | THUNDER ΔÅ over RELION* | OPUS-SSRI resolution (Å) | OPUS-SSRI ΔÅ over RELION* | OPUS-SSRI ΔÅ over THUNDER† |
| --- | --- | --- | --- | --- | --- | --- |
| β-galactosidase (EMPIAR-10017) | 4.16 | 4.25 | −0.09 | 3.93 | 0.23 | 0.33 |
| 80S ribosome (EMPIAR-10002) | 4.08 | 3.80 | 0.28 | 3.93 | 0.15 | −0.13 |
| Hemagglutinin (EMPIAR-10097) | 4.19 | 4.11 | 0.08 | 3.77 | 0.42 | 0.34 |
| TRPM4 (EMPIAR-10126) | 3.48 | / | / | 2.74 | 0.74 | / |
| Hrd1/Hrd3 (EMPIAR-10099) | 4.80 | 4.75 | 0.05 | 3.55 | 1.25 | 1.20 |
| TRPV5 (EMPIAR-10254) | 3.12 | 3.09 | 0.03 | 2.47 | 0.65 | 0.62 |
| TMEM16A (EMPIAR-10123) | 3.90 | / | / | 2.84 | 1.06 | / |
| Average improvement | | | 0.07 | | 0.64 | 0.47 |
/ indicates that the comparison was unavailable in two cases in which THUNDER failed to execute due to computer incompatibility.
* The value in negative indicates the resultant resolution is worse than that from RELION, while the value in positive indicates the resultant resolution is better than that from RELION.
† The value in negative indicates the resultant resolution is worse than that from THUNDER, while the value in positive indicates the resultant resolution is better than that from THUNDER.
THUNDER was also run on five of these seven systems (it failed to execute on two datasets due to incompatibility with our computing facility). According to the gold-standard FSC at 0.143, the improvements in resolution by THUNDER over RELION 3.0 are in the range of −0.09 to 0.28 Å with an average of 0.07 Å (Table 1). If judged by the model versus map FSCs at 0.143, the improvements in resolution of THUNDER over RELION 3.0 are in the range of −0.18 to 0.17 Å with an average of 0.07 Å on the five systems (SI Appendix, Table S1). Of these five systems, OPUS-SSRI consistently outperforms THUNDER on four systems and only slightly underperforms THUNDER on one system (80S ribosome) as gauged by the gold-standard FSC = 0.143 and model versus map FSC = 0.143. Overall, OPUS-SSRI produces an average improvement of 0.47 Å in resolution over THUNDER for all five systems if judged by the gold-standard FSC = 0.143, with the largest improvement being 1.20 Å (Table 1 and SI Appendix, Fig. S2), and of 0.20 Å in resolution if judged by the model versus map FSC = 0.143, with the largest improvement being 0.63 Å (SI Appendix, Table S1 and Fig. S3).
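The averages quoted above can be rechecked directly from the resolutions in Table 1. The short Python sketch below does this; the dictionary layout and the helper name `mean_improvement` are illustrative choices, not part of OPUS-SSRI. Because it works from the rounded resolutions printed in the table, an individual difference may deviate by 0.01 Å from the tabulated Δ value (for example, 4.25 − 3.93 for β-galactosidase), although the rounded averages agree.

```python
# Resolutions (in Å) at gold-standard FSC = 0.143, copied from Table 1.
# Tuple order: (RELION, THUNDER, OPUS-SSRI). None marks the two datasets
# on which THUNDER could not be run.
table1 = {
    "beta-galactosidase": (4.16, 4.25, 3.93),
    "80S ribosome": (4.08, 3.80, 3.93),
    "Hemagglutinin": (4.19, 4.11, 3.77),
    "TRPM4": (3.48, None, 2.74),
    "Hrd1/Hrd3": (4.80, 4.75, 3.55),
    "TRPV5": (3.12, 3.09, 2.47),
    "TMEM16A": (3.90, None, 2.84),
}


def mean_improvement(baseline_idx, method_idx):
    """Average of (baseline - method) over datasets where both values exist.

    A positive value means the method reaches a better (smaller) resolution.
    """
    gains = [
        row[baseline_idx] - row[method_idx]
        for row in table1.values()
        if row[baseline_idx] is not None and row[method_idx] is not None
    ]
    return sum(gains) / len(gains)


opus_vs_relion = mean_improvement(0, 2)     # averaged over seven datasets
thunder_vs_relion = mean_improvement(0, 1)  # averaged over five datasets
opus_vs_thunder = mean_improvement(1, 2)    # averaged over five datasets

print(f"OPUS-SSRI over RELION:  {opus_vs_relion:.2f} Å")
print(f"THUNDER over RELION:    {thunder_vs_relion:.2f} Å")
print(f"OPUS-SSRI over THUNDER: {opus_vs_thunder:.2f} Å")
```

Note that the two THUNDER-related averages use only the five datasets on which THUNDER ran, so they are not directly comparable with the seven-dataset average for OPUS-SSRI over RELION.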
Fig. 2 shows some of the structural improvements for Hrd1/Hrd3 complex in more detail. Clearly, compared to the density map reconstructed by RELION 3.0 (Fig. 2B), the density map from OPUS-SSRI is much sharper and cleaner (Fig. 2C). In fact, out of the seven systems studied, OPUS-SSRI refinement on Hrd1/Hrd3 complex results in the largest improvements in resolution (Table 1 and SI Appendix, Table S1). For instance, in the density map from RELION 3.0, there is a gap in the main-chain density between residues 147 and 148 (Fig. 2D). However, in the density map from OPUS-SSRI, the density in this region becomes continuous and strong (Fig. 2E).
Similarly, for TRPV5, compared to the final map obtained by RELION 3.0 (Fig. 3B), the density map from OPUS-SSRI becomes much sharper with improved SNRs (Fig. 3C). Most impressively, the density map from OPUS-SSRI even allows retracing of the structural model in the region of residues 374 to 380 that was out of the density map in the original structure (highlighted by the dashed red circle in Fig. 3D). After the manual adjustment in the crystallographic object-oriented toolkit COOT (18) and structural refinement using Python-based Hierarchical ENvironment for Integrated Xtallography (PHENIX) (19), the match between the model and map is substantially improved (highlighted by the dashed red circle in Fig. 3E).
In addition, for TMEM16A, in contrast to the density map from RELION 3.0 (Fig. 4B), the density map obtained by OPUS-SSRI (Fig. 4C) shows sharper and smoother densities with less noise throughout. The improvement from OPUS-SSRI is highlighted for two helices in the regions of residues 408 to 440 (Fig. 4 D and E) and 848 to 884 (Fig. 4 F and G). Most impressively, in the density map refined by OPUS-SSRI, the densities for side chains of residues F412, M416, W419, and F423 (Fig. 4E) and F863, I865, F867, and N869 (Fig. 4G) become very well separated, in marked contrast to the blobs of densities from RELION 3.0 in Fig. 4 D and F, respectively.
In this paper, we proposed OPUS-SSRI, a 3D refinement method for cryo-EM single-particle analysis. The most noticeable improvement from our method is in the gold-standard FSC of the final reconstructions, which can be largely attributed to the superior denoising effect of the sparseness and smoothness priors that we introduced. By setting relatively small components in the 3D map to zero and filtering components to be more consistent with their neighbors, the sparseness and smoothness restraints can suppress the noisy densities that do not belong to the molecules in the map, thus producing cleaner reconstructions. The cleaner map in turn leads to more accurate pose estimation for each particle. These improvements brought about by our method result in an overall much-improved final reconstruction. Furthermore, the relatively large improvements for structures with heterogeneous flexibility such as Hrd1/Hrd3 and TMEM16A confirm the theoretical difference between the traditional smoothness prior in RELION and our smoothness prior in OPUS-SSRI. For structures with heterogeneous flexibilities in different regions, the traditional approach in RELION enforces translation-invariant isotropic smoothness on the 3D maps, thus smearing the rigid regions and creating large biases in the reconstructions. In contrast, OPUS-SSRI can adapt to different flexibilities in different regions in the maps, thus greatly reducing biases and improving the final reconstructions. Another approach we explored to promote smoothness is casting the back-projection as a local kernel regression problem. This formulation enables us to embed the 3D maps in a reproducing kernel Hilbert space (RKHS) with specific smoothness.
Although our method introduces five more parameters, their optimal values can be easily determined. First of all, we can set
It is worth noting that OPUS-SSRI focuses on improving the accuracy of the pose parameters for each particle in the maximization step, which is complementary to the approach explored by THUNDER that targets other latent variables, such as defocus parameters and class membership. Hence, these two approaches can be readily combined. In fact, accurate determination of pose parameters is a prerequisite for better per-particle defocus parameter refinement. This is exemplified by the limited improvement of THUNDER on the highly noisy dataset Hrd1/Hrd3, in which the pose of each particle had large errors (SI Appendix, Figs. S2 and S3), yielding inaccurate reference two-dimensional (2D) projections and adversely affecting the per-particle contrast transfer function (CTF) refinement. Therefore, OPUS-SSRI might enhance the per-particle CTF refinement on some noisy datasets by improving the pose determination of these datasets.
Finally, our tests of OPUS-SSRI on seven real datasets indicate that OPUS-SSRI can greatly improve the resolution of the final density map, thus allowing more accurate building of atomic models. We expect OPUS-SSRI to be an invaluable tool to the general field of cryo-EM single-particle analysis.
We clarify some notations here. For a vector
Formally, the FT of 3D map
As 3D molecular maps are both sparse and smooth, a mathematical formulation of these priors must be developed in order to incorporate them into refinement. Conventionally, the smoothness of a function is associated with the norm of its gradient, and sparseness refers to the number of zeros among the values of the function (20). In the following subsections, we will formulate different smoothness priors and reveal their differences. The key equations illustrating the effects of the traditional smoothness restraint and our smoothness restraint are Eqs. 2 and 5, respectively.
The traditional method (5) enforces smoothness by applying a quadratic restraint on the magnitudes of the FTs, based on the assumption that they follow a Gaussian distribution. Since the traditional method is an instance of Wiener filtering (21), the restraint strength depends on the SNR. The 3D map reconstructed by the traditional method can be defined as the maximizer of
where
To understand the effect of the smoothness restraint of the traditional method, we consider the role of the restraint in the gradient ascent iteration, which is of the form
where
Sparseness resembles the idea of masking in the calculation of masked FSC, where the voxels that are below a certain threshold are set to 0. Similar effects can be achieved by restraining the sum of the absolute values of densities, namely, the
where
This subsection presents the algorithm to optimize the penalized log likelihood in Eq. 3. First, the log marginal likelihood function can be optimized by the expectation–maximization method (25) (see SI Appendix, Expectation maximization for derivation). The reconstruction process alternates between the expectation step, in which the distribution of pose parameters for each particle is determined, and the maximization step, in which the 3D map is reconstructed. Secondly, to address the nonconcavity of the log norm, we approximate the logarithm function by a concave function and iteratively improve the approximation (24) at each maximization step (see SI Appendix, Weighted approximation for derivation). Lastly, to average 3D maps reconstructed in consecutive maximization steps, we consider leveraging implicit gradient ascent (26), which is a widely used technique to improve the stability of optimization methods. The implicit gradient ascent restrains the Euclidean distance between the new solution and the 3D map of the previous maximization step
where
Though the TV norm is nondifferentiable at zero, we can approximate its gradient by Nesterov smoothing (27). The approximate gradient of the TV norm at a voxel
where
The differentiable function with
where
In summary, Eq. 6 is applied iteratively in the maximization step. The gradient of the TV norm enforces spatially varying smoothness in gradient ascent, while the soft-thresholding operator induced by the weighted
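To make the two ingredients named in this summary concrete, the following Python sketch shows a generic proximal-gradient step that combines the gradient of a Nesterov-smoothed TV norm with a soft-thresholding operator. It is a minimal illustration under stated assumptions, not a reproduction of Eq. 6: the function names, the parameters `step`, `alpha`, `beta`, `mu`, and `weights`, and the use of periodic boundaries are all hypothetical choices made for the example, and the data term is left to the caller as `data_grad`.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of a (weighted) l1 norm.

    Values with magnitude below t are set to zero; the rest shrink toward
    zero by t. t may be a scalar or a per-voxel array of thresholds.
    """
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def smoothed_tv_gradient(x, mu):
    """Gradient of a Nesterov-smoothed isotropic TV norm of map x.

    Assumes periodic boundaries so that the adjoint of the forward
    difference is simple. mu > 0 is the smoothing parameter.
    """
    axes = range(x.ndim)
    # Forward differences along each axis: (Dx)[j] = x[j + 1] - x[j].
    g = np.stack([np.roll(x, -1, axis=a) - x for a in axes])
    norm = np.sqrt(np.sum(g ** 2, axis=0))
    # Maximizing dual variable; its magnitude never exceeds 1.
    u = g / np.maximum(norm, mu)
    # Adjoint of the forward difference: (D^T u)[j] = u[j - 1] - u[j].
    return sum(np.roll(u[a], 1, axis=a) - u[a] for a in axes)


def penalized_step(x, data_grad, step, alpha, beta, mu, weights=1.0):
    """One gradient-ascent step on (data term - beta * smoothed TV),
    followed by soft-thresholding for the weighted l1 penalty."""
    ascent = x + step * (data_grad - beta * smoothed_tv_gradient(x, mu))
    return soft_threshold(ascent, step * alpha * weights)
```

In a toy denoising setting one could take `data_grad = y - x` for a noisy map `y` and call `penalized_step` repeatedly; in an actual refinement the data gradient would come from the expected log likelihood computed in the expectation step.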
To reconstruct the 3D map, cryo-EM researchers introduced a back-projection operator, which puts the 2D FT of the image into the 3D map. As the inverse of slice operator
where
OPUS-SSRI uses a Gaussian kernel, which is of the form
Eq. 7 has a closed form solution,
with
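As a rough illustration of kernel regression with a Gaussian kernel, the sketch below implements the generic Nadaraya-Watson estimator, in which each grid point receives a kernel-weighted average of nearby samples. This is an assumption-laden stand-in rather than the closed-form solution of Eq. 7: it ignores the CTF and pose weighting, and the names `gaussian_kernel`, `kernel_regress`, and `bandwidth` are hypothetical.

```python
import numpy as np


def gaussian_kernel(squared_distance, bandwidth):
    """Gaussian weight as a function of squared distance."""
    return np.exp(-squared_distance / (2.0 * bandwidth ** 2))


def kernel_regress(grid_points, sample_points, sample_values, bandwidth):
    """Nadaraya-Watson estimate at each grid point.

    grid_points:   (n_grid, d) coordinates where the map is estimated.
    sample_points: (n_samples, d) coordinates of the observed samples,
                   e.g. positions of 2D Fourier coefficients placed in 3D.
    sample_values: (n_samples,) observed values; may be complex.
    """
    # Pairwise squared distances, shape (n_grid, n_samples).
    diff = grid_points[:, None, :] - sample_points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    w = gaussian_kernel(d2, bandwidth)
    # Weighted average; the floor on the denominator avoids division by zero
    # for grid points that are far from every sample.
    return (w @ sample_values) / np.maximum(w.sum(axis=1), 1e-12)
```

A smaller `bandwidth` makes the estimate follow the samples more closely, while a larger one averages over a wider neighborhood and yields a smoother map.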
The implementation of OPUS-SSRI is based on RELION. The 3D refinement program in RELION consists of two modules, expectation and maximization. We implemented our method as a new routine in its maximization module. Therefore, when performing refinement, the expectation steps of RELION (5) and OPUS-SSRI use exactly the same settings. The gradient calculation and soft-thresholding operators are implemented with CUDA, thus allowing fast maximization.
The gold-standard FSC is the FSC between two independently refined half maps F and G (22). The gold-standard FSC of Fourier coefficients at shell
If there exists a high-resolution atomic structural model, we can validate the cryo-EM map by comparing it to this atomic model. The first step in calculating the model versus map FSC is fitting the atomic model into the cryo-EM density map. The model map is constructed from the fitted atomic model by sampling on the same grid as the experimental map. The model versus map FSC (31) is the correlation between the FT of the model map and the FT of cryo-EM map. The point where the model versus map FSC approaches 0.143 can be regarded as the resolution of the experimental map.
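Both FSC variants described above reduce to the same computation: a normalized correlation between two Fourier transforms, evaluated shell by shell, followed by a threshold at 0.143. The Python sketch below shows one simple way to compute it for cubic maps with NumPy. It is an illustrative sketch, not the code used by RELION or Phenix: the function names are hypothetical, shells are binned by rounded integer radius, and the resolution is read off at the first shell below the threshold without interpolation.

```python
import numpy as np


def fsc_curve(map1, map2):
    """FSC per integer-radius Fourier shell for two cubic maps of equal size."""
    assert map1.shape == map2.shape
    n = map1.shape[0]
    F = np.fft.fftn(map1)
    G = np.fft.fftn(map2)
    freq = np.fft.fftfreq(n) * n  # integer frequency indices
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    n_shells = n // 2
    keep = shell < n_shells  # discard the corners beyond Nyquist
    idx = shell[keep]
    cross = np.bincount(idx, weights=(F * np.conj(G)).real[keep], minlength=n_shells)
    power1 = np.bincount(idx, weights=(np.abs(F) ** 2)[keep], minlength=n_shells)
    power2 = np.bincount(idx, weights=(np.abs(G) ** 2)[keep], minlength=n_shells)
    # The small constant guards against division by zero in empty shells.
    return cross / np.sqrt(power1 * power2 + 1e-30)


def resolution_at(fsc, box_size, pixel_size, threshold=0.143):
    """Resolution (Å) of the first shell whose FSC falls below threshold."""
    below = np.nonzero(fsc[1:] < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size  # FSC never drops: report the Nyquist limit
    first_shell = below[0] + 1
    return box_size * pixel_size / first_shell
```

For the gold-standard FSC, `map1` and `map2` would be the two independently refined half maps; for the model versus map FSC, they would be the model map and the experimental map sampled on the same grid.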
The single-particle datasets used in this paper were obtained from either the deposited particle stack or the coordinate files. In all experiments, we built the initial maps ab initio in RELION 3.0 and refined those initial maps using the three methods to be compared. The initial map building began with one round of 2D classification in RELION 3.0. The particles belonging to the major classes were then selected to build the initial map ab initio using the 3D classification procedure in RELION 3.0. The same low-pass–filtered initial maps were subsequently supplied into the three methods, RELION 3.0, THUNDER, and OPUS-SSRI, for refinement. For the datasets with specific symmetry, the symmetry was enforced throughout the refinement process. For RELION 3.0 and OPUS-SSRI, we also used the same convergence criteria [i.e., no resolution improvement and pose changes for the last two iterations (5)]. In THUNDER, the particle grading and CTF search options were set as “True” for better results. Finally, the gold-standard FSC calculations and density map postprocessing of the refinement results of all methods were carried out in RELION 3.0. In the postprocessing step, the mask was created from the final reconstruction using all particles in the 3D refinement procedure. Using relion_postprocess (30), we obtained gold-standard FSCs and the postprocessed map from independent maps by correcting the modulation transfer function of the detector and sharpening with automatically estimated B-factors. We then compared the postprocessed map with respect to the corresponding published atomic model(s) by calculating model versus map FSC using Phenix.Mtriage (32). Before comparison, the atomic model was fitted into the postprocessed density maps reconstructed by different methods using the rigid-body fit in Chimera (33).
J.M. acknowledges support from the NIH (R01-GM127628 and R01-GM116280) and the Welch Foundation (Q-1512). Q.W. acknowledges support from the NIH (R01-GM127628 and R01-GM116280) and the Welch Foundation (Q-1826). A.A.C.-A. was partially supported by a training fellowship from the Computational Cancer Biology Training Program of the Gulf Coast Consortia (Cancer Prevention and Research Institute of Texas [CPRIT] Grant No. RP170593).
The implementation of OPUS-SSRI can be found at GitHub (https://github.com/alncat/cryoem). All other study data are included in the article and supporting information.