Abstract
Ras proteins are highly conserved signaling molecules that exhibit regulated, nucleotide-dependent switching between active and inactive states. The high conservation of Ras requires mechanistic explanation, especially given the general mutational tolerance of proteins. Here, we use deep mutational scanning, biochemical analysis and molecular simulations to understand constraints on Ras sequence. Ras exhibits global sensitivity to mutation when regulated by a GTPase activating protein and a nucleotide exchange factor. Removing the regulators shifts the distribution of mutational effects to be largely neutral, and reveals hotspots of activating mutations in residues that restrain Ras dynamics and promote the inactive state. Evolutionary analysis, combined with structural and mutational data, argues that Ras has co-evolved with its regulators in the vertebrate lineage. Overall, our results show that sequence conservation in Ras depends strongly on the biochemical network in which it operates, providing a framework for understanding the origin of global selection pressures on proteins.
Figure 1 - The Ras switching cycle and the bacterial two-hybrid system.
A. Ras cycles between an active, GTP-bound state and an inactive, GDP-bound state. Ras•GTP binds to effector proteins, such as Raf kinase, which binds to Switch I. The intrinsic hydrolysis of GTP is slow, unless catalyzed by a GTPase activating protein (GAP) which binds to Switch I. Intrinsic GDP release is also a slow process, unless facilitated by a guanine nucleotide exchange factor (GEF) which binds to both Switch I and Switch II.
B. Structure of Ras highlighting secondary structure elements.
C. The bacterial two-hybrid system couples the Ras•GTP:Raf-RBD interaction to the production of an antibiotic resistance factor. The Ras variant library, the Raf-RBD, and the antibiotic resistance factor are encoded on three inducible plasmids. The GAP and GEF can also be co-expressed in the bacterial two-hybrid system. After protein expression, a fraction of the cells is removed, and the plasmids encoding the Ras variant library are isolated and deep sequenced to count the frequency of each variant before antibiotic selection. The remaining cells are subjected to antibiotic selection with chloramphenicol, and the plasmids encoding the Ras variant library are isolated and deep sequenced to count the frequency of each variant after antibiotic selection. The counts of each variant before and after selection are used to calculate the enrichment of each Ras variant.
D. In vitro validation of the bacterial two-hybrid system. The enrichment of individual Ras variants is approximately proportional to the change in Ras•GTP:Raf-RBD binding free energy upon mutation. Binding free energy of individual Ras mutants was measured by isothermal titration calorimetry, where error bars represent the standard deviation from three experiments, and relative enrichment values (ΔEx) are derived from wild-type Ras binding to Raf-RBD in the unregulated-Ras experiment.
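The enrichment calculation outlined in panel C can be sketched in a few lines. This is a minimal illustration rather than the authors' analysis pipeline: the base-10 logarithm, the function name `relative_enrichment`, the variant labels, and the read counts are all assumptions made for the example. The only properties taken from the legend are that enrichment is computed from counts before and after selection and is reported relative to wild-type Ras.

```python
import math


def relative_enrichment(pre_counts, post_counts, wild_type="WT"):
    """Return the log10 enrichment of each variant relative to wild type.

    pre_counts and post_counts map a variant label to its read count
    before and after antibiotic selection. The log base and the
    wild-type subtraction are assumptions made for this sketch.
    """
    def log_ratio(variant):
        return math.log10(post_counts[variant] / pre_counts[variant])

    wt = log_ratio(wild_type)
    # Variants with zero reads in either sample are skipped because
    # their ratio is undefined.
    return {
        variant: log_ratio(variant) - wt
        for variant in pre_counts
        if pre_counts[variant] > 0 and post_counts.get(variant, 0) > 0
    }


# Hypothetical read counts, for illustration only.
pre = {"WT": 1000, "G12V": 500, "D57A": 800}
post = {"WT": 2000, "G12V": 4000, "D57A": 160}
delta_e = relative_enrichment(pre, post)
```

With this convention the wild-type value is zero by construction, positive values indicate variants enriched more than wild type, and negative values indicate variants that are depleted relative to wild type.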
Figure 2 - Mutational tolerance of Ras in the regulated-Ras experiment.
A. The results of the regulated-Ras experiment are shown in the form of a 20 × 165 matrix. Each row of the matrix represents one of the 20 amino acids, and each column shows one of the residues of Ras, from 2 to 166. Each entry in the matrix represents, in color-coded form, the value of the relative enrichment (ΔEx) for the corresponding mutation. All data are normalized to the wild-type Ras reference sequence, which has a relative enrichment value of zero. The numerical values of ΔEx are provided in the supplementary data.
B-C. Distribution of individual fitness effects (ΔEx; left) and residue-averaged fitness effects (<ΔEx>; right). Residues with a significant loss-of-function effect on Ras (more than 1σ below the mean) are indicated on the residue-averaged histogram.
D. Mapping the residues that lead to a significant loss-of-function onto the tertiary structure of Ras. These positions include the hydrophobic core, as well as residues involved in GTP/GDP and Raf-RBD binding.
E. Additional sites of mutational sensitivity include surface residues involved in ion-pairing networks that stabilize the GTPase fold.
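The residue-averaged values and the 1σ cutoff used in panels B-C can be illustrated with a short sketch. It assumes that the cutoff means "more than one standard deviation below the mean of the residue averages" and that the population standard deviation is used; the function names and the ΔEx values are hypothetical.

```python
from statistics import mean, pstdev


def residue_averages(delta_e):
    """Average ΔEx over all substitutions measured at each residue.

    delta_e maps (residue_number, amino_acid) to a relative enrichment.
    """
    by_residue = {}
    for (residue, _amino_acid), value in delta_e.items():
        by_residue.setdefault(residue, []).append(value)
    return {residue: mean(values) for residue, values in by_residue.items()}


def sensitive_residues(averages, n_sigma=1.0):
    """Return residues whose average lies more than n_sigma below the mean.

    The choice of population standard deviation is an assumption.
    """
    values = list(averages.values())
    cutoff = mean(values) - n_sigma * pstdev(values)
    return sorted(r for r, v in averages.items() if v < cutoff)


# Hypothetical ΔEx values, for illustration only.
example = {
    (10, "A"): 0.0, (10, "V"): 0.2,
    (11, "A"): -0.1, (11, "V"): 0.1,
    (12, "A"): 0.1, (12, "V"): -0.1,
    (57, "A"): -1.5, (57, "V"): -1.7,
}
averages = residue_averages(example)
flagged = sensitive_residues(averages)
```

In this toy input only residue 57 falls below the cutoff, so it is the only position flagged as mutationally sensitive.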
Figure 3 - Mutational tolerance of Ras in the attenuated-Ras experiment.
A. Relative enrichment values (ΔEx) are shown in matrix form as in Figure 2A, for the attenuated-Ras experiment. In this experiment, Ras is expressed without the GEF, and shows a muted pattern of mutational sensitivity. Mutations at residues known to be mutated in human cancer (e.g. Gly 12, Gly 13, Gln 61, Lys 117) show a strong gain of function in this context.
B. Distribution of residue-averaged relative enrichment values. Residues with a significant gain-of-function effect on Ras (more than 1σ above the mean) are indicated on the histogram.
C. Mapping the residues that lead to a gain of function onto the tertiary structure of Ras. These positions span the P-loop, Switch II, and guanine binding loops of Ras and are directly involved in nucleotide coordination and hydrolysis.
D-E. Comparison of relative enrichment values for mutations at selected residues in the regulated-Ras and attenuated-Ras experiments. Substitutions at residues that are commonly mutated in cancer, such as Gly 12, Gly 13, and Gln 61, are exclusively gain of function in the attenuated-Ras experiment.
Figure 4 - Mutational tolerance of Ras in the unregulated-Ras experiment.
A. Relative enrichment values (ΔEx) are shown in matrix form as in Figure 2A, for the unregulated-Ras experiment. In this experiment, Ras is expressed without the GAP and the GEF, and reveals hotspots of activating mutations. Mutations at residues known to be mutated in human cancer (e.g. Gly 12, Gly 13, Gln 61, Lys 117) show a strong gain of function in this context.
B. (Left) A scatter plot of relative enrichment values (ΔEx) for the regulated-Ras and unregulated-Ras experiments. The distributions of relative enrichment values for each experiment are also shown. Loss-of-function mutations are shown in blue, and neutral mutations are shown in grey. Gain-of-function mutations in both experiments are shown in red, and mutations that are gain of function only in the unregulated-Ras experiment, but neutral in the regulated-Ras experiment, are shown in yellow (conditional gain of function). (Right) The spatial distribution of gain-of-function (red) and conditional gain-of-function (yellow) mutations on the three-dimensional structure of Ras. Residues that contain a majority of gain-of-function mutations from the scatter plot on the left are colored red, and residues that contain a majority of conditional gain-of-function mutations are colored yellow.
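The color scheme in panel B amounts to a classification rule applied to each mutation, followed by a majority vote per residue. The sketch below shows one way such a rule could be written. The numerical cutoff of 0.5, the treatment of mutations that fall outside the categories named in the legend, and the function names are all assumptions; the legend does not state the thresholds that were used.

```python
from collections import Counter


def classify(regulated, unregulated, threshold=0.5):
    """Assign a scatter-plot category to one mutation.

    regulated and unregulated are the ΔEx values of the mutation in the
    two experiments. The symmetric threshold is a hypothetical cutoff.
    """
    if unregulated > threshold:
        if regulated > threshold:
            return "gain of function"
        if abs(regulated) <= threshold:
            return "conditional gain of function"
    # Assumption: a loss in either experiment is reported as loss of function.
    if regulated < -threshold or unregulated < -threshold:
        return "loss of function"
    return "neutral"


def residue_color(categories):
    """Color a residue red or yellow if one category holds a strict majority."""
    counts = Counter(categories)
    for label, color in (
        ("gain of function", "red"),
        ("conditional gain of function", "yellow"),
    ):
        if counts[label] > len(categories) / 2:
            return color
    return None


# Hypothetical (regulated, unregulated) ΔEx pairs for one residue.
pairs = [(0.1, 1.2), (0.0, 0.9), (-0.2, 0.1)]
labels = [classify(r, u) for r, u in pairs]
color = residue_color(labels)
```

Here two of the three hypothetical substitutions are conditional gain of function, so the residue would be colored yellow.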
Figure 5 - Mutational tolerance of Ras in the Ras-G12V experiment.
A. Relative enrichment values (ΔEx) are shown in matrix form as in Figure 2A, for the Ras-G12V experiment, in the absence of the GAP and the GEF. In this experiment, Ras shows a muted pattern of mutational sensitivity when compared to the wild-type unregulated-Ras experiment. Mutations at residues known to be mutated in human cancer (e.g. Gly 13, Gln 61, Lys 117) do not show a strong gain of function in this context. Mutations at Gly 12 are not included in the data.
B-C. Comparison of relative enrichment values for mutations at selected residues in the unregulated-Ras and Ras-G12V experiments. Substitutions at hotspot residues and residues that are commonly mutated in cancer, such as Gln 61, Lys 117, and Lys 147, result in a gain of function in the unregulated-Ras experiment, but have an attenuated effect in the Ras-G12V experiment.
Figure 6 - Superposition of structures from molecular dynamics simulations of GTP-bound forms of Ras.
A. Wild-type Ras•GTP. The diagrams show the superposition of 300 structures sampled evenly from two 300 ns trajectories. The diagram on the left shows a canonical view of Ras•GTP. The two other diagrams show two orthogonal views.
B. As in A., for Ras•GTP L120A, shown as yellow spheres.
C. As in A., for Ras•GTP Q99A, shown as yellow spheres.
Figure 7 - The impact of hotspot mutations on structural flexibility in Ras.
A. Sustained sidechain-sidechain contacts in the simulation of wild-type Ras•GTP. We used the network analysis tool of Vishveshwara and co-workers (Bhattacharyya et al., 2013) to analyze the molecular dynamics trajectories for Ras. Strong interactions between sidechains, as defined by Bhattacharyya et al., 2013, that are present in more than a threshold fraction of 50% of the instantaneous structures in the trajectory are identified by blue lines. These lines are drawn between the Cα atoms of the corresponding residues.
B. For simulations of Ras•GTP H27A, contacts that are present in the wild-type simulation but not in the H27A simulation are drawn in blue, and contacts that are present in the H27A simulation but not in the wild-type simulation are drawn in yellow.
C. As in B., for Ras•GTP Q99A.
D. As in B., for Ras•GTP L120A.
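The contact comparison in panels A-D reduces to two set operations: keep the contacts present in more than 50% of trajectory frames, then take the differences between the wild-type and mutant sets. The sketch below assumes that the strong sidechain interactions have already been identified for each frame (the criterion of Bhattacharyya et al., 2013 is not reproduced here); the function names and residue pairs are hypothetical.

```python
from collections import Counter


def sustained_contacts(frames, min_fraction=0.5):
    """Return contacts present in more than min_fraction of the frames.

    frames is a list with one collection of (residue_i, residue_j) pairs
    per trajectory frame. Pairs are treated as unordered.
    """
    if not frames:
        return set()
    counts = Counter()
    for frame in frames:
        # Normalize pair order and count each contact once per frame.
        counts.update({tuple(sorted(pair)) for pair in frame})
    return {pair for pair, n in counts.items() if n / len(frames) > min_fraction}


def compare_contacts(wild_type, mutant):
    """Split contacts into those lost and those gained in the mutant."""
    return {"lost": wild_type - mutant, "gained": mutant - wild_type}


# Hypothetical per-frame contacts, for illustration only.
wt_frames = [{(27, 32), (71, 99)}, {(27, 32), (99, 71)}, {(27, 32), (71, 99)}, {(27, 32)}]
mut_frames = [{(71, 99), (40, 54)}, {(71, 99), (40, 54)}, {(40, 54)}, {(71, 99)}]
wt = sustained_contacts(wt_frames)
mut = sustained_contacts(mut_frames)
diff = compare_contacts(wt, mut)
```

In the figure's color scheme, the "lost" set corresponds to the blue lines and the "gained" set to the yellow lines.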
Figure 8 - Sequence variation in Ras.
A. An evolutionary tree from an alignment of 72 extant Ras sequences from invertebrates (blue) and vertebrates (orange). The hypothetical ancestral sequence at the base of the tree is highlighted in green.
B. Sequence alignments show that Ras from choanoflagellate (S. rosetta) and sponge (A. queenslandica) are 72% and 80% identical to human H-Ras, respectively. Although the sequences are largely identical, there are four principal regions of sequence divergence. These regions correspond to residues 45-50, helix α3, helix α4, and residues 148-154 in human H-Ras.
C. Comparison of the sequence of human H-Ras to the ancestral sequence from the base of the metazoan lineage reveals similar regions of sequence variation. S. rosetta Ras was not used in the alignment of Ras sequences to generate the tree shown in (A).
D. The substitutions in human H-Ras that are present in the ancestral sequence are compared to the mutational data from the unregulated-Ras experiment. There are 48 differences between the sequences of the hypothetical ancestral protein and human H-Ras. Of these, 30 represent residues that would activate unregulated human H-Ras if the wild-type residue were replaced by the residue in the ancestral sequence. Eight differences correspond to neutral substitutions, and nine substitutions would decrease function if introduced into human H-Ras.
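The comparison in panel D can be sketched as follows: list the positions at which two aligned sequences differ, then look up the effect of introducing each ancestral residue in the mutational data. The toy sequences, the ΔEx values, the cutoff of 0.5, and the function names are hypothetical and are not taken from the alignment or the data in the paper.

```python
def aligned_differences(human, ancestral):
    """Return (position, human_residue, ancestral_residue) for each mismatch.

    Both sequences must already be aligned to the same length.
    Positions are 1-based.
    """
    if len(human) != len(ancestral):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, h, a)
        for pos, (h, a) in enumerate(zip(human, ancestral), start=1)
        if h != a
    ]


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions with identical residues."""
    mismatches = len(aligned_differences(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


def tally_effects(diffs, delta_e, threshold=0.5):
    """Count ancestral substitutions by their effect in the mutational data.

    delta_e maps (position, amino_acid) to ΔEx; threshold is hypothetical.
    """
    tally = {"activating": 0, "neutral": 0, "decreasing": 0}
    for pos, _human, ancestral in diffs:
        value = delta_e[(pos, ancestral)]
        if value > threshold:
            tally["activating"] += 1
        elif value < -threshold:
            tally["decreasing"] += 1
        else:
            tally["neutral"] += 1
    return tally


# Toy aligned sequences and ΔEx values, for illustration only.
human_seq = "ACDEFGHIKL"
ancestral_seq = "ACDQFGHVKL"
diffs = aligned_differences(human_seq, ancestral_seq)
identity = percent_identity(human_seq, ancestral_seq)
tally = tally_effects(diffs, {(4, "Q"): 0.9, (8, "V"): -0.1})
```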
Figure 9 - Interaction of Ras with the two Ras-binding sites of SOS.
Structure of the Ras:SOS complex (PDB code: 1NVV). Two molecules of H-Ras are bound to SOS, one at the allosteric site and one at the active site. The allosteric site of SOS is bound by Ras•GTP, whereas the active site of SOS is bound by nucleotide-free Ras. Switch I and Switch II of Ras (red) are responsible for engaging both sites in SOS. Additionally, residues in variable regions 1 and 4 are involved in the binding of Ras to the allosteric site of SOS, including Asn 26, Gln 43, Thr 50, and Glu 153 (yellow).
Figure 10 - Comparison of the R and T states in Ras.
A. Variable Regions 2 and 3 highlighted on the structure of Ras•GTP. Variable Region 2 comprises helix α3 and the preceding loop, and Variable Region 3 comprises helix α4 and the preceding loop that is partially involved in guanine binding.
B. The room-temperature crystal structure of wild-type Ras adopts the R state as defined by Mattos and colleagues (Buhrman et al., 2010; Holzapfel et al., 2012), in which the sidechain of Tyr 71 is buried in the hydrophobic core of the protein. The crystal structure of the L120A latch mutant shows a rotation of Tyr 71 that exposes the sidechain. Additionally, the interface between helix α3 and Switch II undergoes a conformational change.
C. Schematic of the conformational transition between the R state and T state. Helix α3 and the C-terminal helix of Switch II shift downwards in the transition from the R state to the T state, leading to a rotation of Tyr 71 from a buried to a solvent-exposed conformation.
D. Comparison of the choanoflagellate S. rosetta crystal structure to the L120A structure. S. rosetta Ras adopts the T state, similar to L120A, where Tyr 71 is rotated outwards and exposed to solvent.