The specificity of tyrosine kinases is predominantly attributed to localization effects dictated by non-catalytic domains. We developed a method to profile the specificities of tyrosine kinases by combining bacterial surface-display of peptide libraries with next-generation sequencing. Using this, we showed that the tyrosine kinase ZAP-70, which is critical for T cell signaling, discriminates substrates through an electrostatic selection mechanism encoded within its catalytic domain (Shah et al. 2016). Here, we expand this high-throughput platform to analyze the intrinsic specificity of any tyrosine kinase domain against thousands of peptides derived from human tyrosine phosphorylation sites. Using this approach, we find a difference in the electrostatic recognition of substrates between the closely-related Src-family kinases Lck and c-Src. This divergence likely reflects the specialization of Lck to act in concert with ZAP-70 in T cell signaling. These results point to the importance of direct recognition at the kinase active site in fine-tuning specificity.
Figure 1 - Analysis of Src-family kinase specificity using a high-throughput platform.
A. T cell receptor-proximal signaling mediated by the tyrosine kinases Lck and ZAP-70 (based on Au-Yeung et al. 2009).
B. Schematic representation of the high-throughput specificity screening platform.
C. Phylogenetic tree inferred from the sequences of the eight human Src-family kinase domains, highlighting the segregation of Src-A and Src-B kinases.
D. The frequency of amino acid residues at each position in the full Human-pTyr library, visualized using WebLogo (Crooks et al. 2004).
Figure 2 - Phosphorylation of the Human-pTyr library by the kinase domains of c-Abl, ZAP-70, c-Src, and Lck.
A. A histogram showing the distribution of enrichment scores for 2587 peptides upon phosphorylation by c-Abl. ‘High-efficiency’ c-Abl substrates are defined as those with enrichment scores above the dashed line in the graph.
B. Phospho-pLogo diagram showing the probability at each position of the amino acids in the high-efficiency c-Abl substrates (791 peptides) relative to all substrates in the library (2587 peptides).
C. Histogram of enrichment score distribution, as in A, for ZAP-70.
D. Phospho-pLogo diagram, as in B, for ZAP-70 (604 high-efficiency peptides).
E. Phospho-pLogo diagram, as in B, for c-Src (791 high-efficiency peptides).
F. Phospho-pLogo diagram, as in B, for Lck (763 high-efficiency peptides).
Sequence logos were visualized using pLogo (O’Shea et al. 2013). In each case, the same cutoff of an enrichment score greater than or equal to 1.5 was used. Data for c-Src are the average of three independent screens, and data for c-Abl, Lck, and ZAP-70 are the average of two independent screens.
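The averaging and cutoff described above can be illustrated with a minimal sketch. It assumes each screen is held as a dictionary of peptide name to enrichment score and that replicates are averaged before the cutoff is applied; the function name, peptide names, and scores are hypothetical and are not taken from the study.

```python
from statistics import mean

CUTOFF = 1.5  # enrichment-score cutoff stated in the legend


def high_efficiency(replicates, cutoff=CUTOFF):
    """Average per-peptide scores across replicate screens, then keep
    peptides whose mean score is greater than or equal to the cutoff.

    replicates: list of dicts (one per screen), peptide name -> score.
    Assumption: only peptides present in every replicate are scored.
    """
    shared = set(replicates[0]).intersection(*replicates[1:])
    averaged = {name: mean(rep[name] for rep in replicates) for name in shared}
    return {name: score for name, score in averaged.items() if score >= cutoff}


# Hypothetical scores from two replicate screens
screen_1 = {"pep_A": 2.0, "pep_B": 1.0, "pep_C": 1.5}
screen_2 = {"pep_A": 3.0, "pep_B": 0.5, "pep_C": 1.5}
selected = high_efficiency([screen_1, screen_2])
```

Here `pep_A` and `pep_C` pass the cutoff, while `pep_B` (mean 0.75) does not.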
Figure 3 - Specificity differences between closely-related kinases, c-Src and Lck.
A. Scatter plots showing the correlations between enrichment scores for all peptides in the Human-pTyr library from multiple screens with different kinases. First panel: correlation between two individual Lck replicates. Second panel: correlation between Lck screens (duplicate) and c-Src screens (triplicate); peptides analyzed in Figures 3E and 4C are denoted by red dots. Third panel: correlation between Lck screens (duplicate) and c-Abl screens (duplicate). Fourth panel: correlation between Lck screens (duplicate) and ZAP-70 screens (duplicate).
B. Phospho-pLogo diagrams showing the probability of each amino acid per residue in the c-Src-preferred substrates (top logo), which are not efficiently phosphorylated by Lck (128 peptides), and in the Lck-preferred substrates (bottom logo), which are not efficiently phosphorylated by c-Src (100 peptides), relative to all substrates in the library.
C. Relative enrichment values for ZAP-70 or c-Src versus Lck, comparing the abundance of charged residues at each position in substrates after phosphorylation by either kinase. Enrichment values for each residue are calculated as described in Materials and Methods. Relative enrichment values are expressed as log10 (ZAP-70 or c-Src enrichment / Lck enrichment).
D. Distribution of net charges on peptides that were selectively phosphorylated by c-Src or Lck, relative to one another, showing a greater tolerance for negatively-charged substrates in c-Src relative to Lck. Net charge was estimated as the difference between the number of Lys/Arg residues and Asp/Glu residues.
E. In vitro phosphorylation kinetics of five purified peptides by the c-Src and Lck kinase domains. All peptides were used at a concentration of 250 μM, and both kinases were used at a concentration of 500 nM. The corresponding peptides analyzed in screens with c-Src and Lck are also highlighted as red dots in the second panel of Figure 3A. Error bars represent the standard deviation of at least three measurements.
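The two simple calculations named in panels C and D of this legend can be sketched as follows. The net charge follows the definition in panel D (Lys/Arg count minus Asp/Glu count), and the relative enrichment follows the log10 ratio in panel C. The function names and the example sequence are hypothetical, and the per-residue enrichment values themselves are taken as given inputs because their calculation is described only in Materials and Methods.

```python
import math


def net_charge(peptide):
    """Estimated net charge: (number of Lys + Arg) - (number of Asp + Glu)."""
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


def relative_enrichment(kinase_enrichment, lck_enrichment):
    """log10 of (ZAP-70 or c-Src enrichment / Lck enrichment).

    Positive values mean the residue is more enriched for the
    compared kinase than for Lck; negative values mean the reverse.
    """
    return math.log10(kinase_enrichment / lck_enrichment)


# Hypothetical peptide: 2 Asp + 2 Glu, 1 Lys + 1 Arg
example_charge = net_charge("EDDEYVKRAAA")
# Hypothetical enrichment values for one residue at one position
example_ratio = relative_enrichment(10.0, 1.0)
```

With these inputs the example peptide has an estimated net charge of -2, and a tenfold higher enrichment relative to Lck corresponds to a relative enrichment of 1.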
Figure 4 - Conserved electrostatic differences between Src-A and Src-B kinases.
A. A representative instantaneous structure from a simulation of the Lck kinase domain bound to an ITAM peptide (TCRζ residues 104-118), highlighting the position of the three residues analyzed in Figure 4 – Supplement 1.
B. Histograms showing the prevalence of charged residues at each position in the subset of peptides that are preferred substrates of c-Src Q423E or R472P, but not wild-type c-Src. The abundances of these charged residues are normalized to those found in the whole Human-pTyr library.
C. Comparison of the in vitro rates for phosphorylation of the five peptide sequences in Figure 3E by four c-Src kinase domain variants: wild-type c-Src and the Q423E, R472P, and P488E mutants. All peptides were used at a concentration of 250 μM, and all kinases were used at a concentration of 500 nM. For each kinase, all rates were normalized to the rate of phosphorylation of peptide 3, and error bars represent the standard deviation from at least three measurements. The absolute rate constants are plotted in Figure 4 – Supplement 4.
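The normalization described in panel C amounts to dividing each rate for a given kinase by that kinase's rate for peptide 3. A minimal sketch follows; the function name, dictionary layout, and rate values are hypothetical and serve only as an illustration.

```python
def normalize_to_reference(rates, reference="peptide 3"):
    """Divide every rate measured for one kinase by the rate for the
    reference peptide, so the reference peptide is 1.0 by construction.

    rates: dict mapping peptide label -> rate for a single kinase.
    """
    ref = rates[reference]
    return {name: rate / ref for name, rate in rates.items()}


# Hypothetical rates for one kinase variant (arbitrary units)
rates = {"peptide 1": 2.0, "peptide 3": 4.0, "peptide 5": 1.0}
normalized = normalize_to_reference(rates)
```

Because each kinase is normalized to its own peptide 3 rate, the normalized values compare substrate preference between variants rather than overall catalytic activity.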
Figure 5 - Additional insights from high-throughput specificity screens.
A. Heatmaps depicting the enrichment of -1 and +3 residues in the preferred substrates of Lck, c-Src, c-Abl, and ZAP-70. Enrichment is calculated relative to the amino acid abundance at these positions in the whole Human-pTyr library, analogous to the depictions in Figure 2. Residues in each kinase that may confer specificity at these positions are given in the tables adjacent to the heatmaps.
B. A representative instantaneous structure from a simulation of the Lck kinase domain bound to an ITAM peptide (TCRζ residues 104-118), highlighting the position of the three residues in the kinase domain listed in panel A that make contacts with the -1 and +3 positions on substrates.
C. Comparison of results from specificity screens with the Human-pTyr library to a curated list of human kinase-substrate pairs from the PhosphoSitePlus database (Hornbeck et al. 2015). The scatter plot shows the enrichment scores for tyrosine phosphosites that were assigned to each specific kinase and also present in the Human-pTyr library. The number of sequences greater than or equal to a threshold enrichment score of 1.5 is listed to the right of the scatter plot.
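The positional enrichment shown in the panel A heatmaps can be approximated as the frequency of each amino acid at a given offset from the tyrosine in the preferred substrates, divided by its frequency at that offset in the whole library. The sketch below assumes a plain frequency ratio; the legend does not state the exact formula, and the function names and sequences are hypothetical.

```python
from collections import Counter


def positional_frequency(peptides, offset, tyr_index):
    """Fraction of peptides carrying each amino acid at tyr_index + offset."""
    counts = Counter(p[tyr_index + offset] for p in peptides)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def positional_enrichment(preferred, library, offset, tyr_index):
    """Frequency in the preferred subset divided by frequency in the library.

    Assumption: a simple ratio; residues absent from the preferred
    subset at that position get an enrichment of 0.0.
    """
    fg = positional_frequency(preferred, offset, tyr_index)
    bg = positional_frequency(library, offset, tyr_index)
    return {aa: fg.get(aa, 0.0) / bg[aa] for aa in bg}


# Hypothetical 6-residue peptides with the target tyrosine at index 2
library = ["ADYAAL", "AEYAAV", "AGYAAL", "ADYAAI"]
preferred = ["ADYAAL", "ADYAAI"]
minus_1 = positional_enrichment(preferred, library, offset=-1, tyr_index=2)
plus_3 = positional_enrichment(preferred, library, offset=3, tyr_index=2)
```

In this toy example Asp at -1 is twofold enriched in the preferred set, while Glu and Gly at -1 are absent from it.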
Figure 1 Supplementary Figure 1
pLogo diagram depicting the probabilities of amino acid residues at each position surrounding the target tyrosine residue in sequences in the Human-pTyr library. These probabilities are normalized to the same amino acid probabilities at positions surrounding tyrosine residues across the human proteome.
Figure 1 Supplementary Figure 2
Correlations between enrichment scores for all peptides in the Human-pTyr library between replicate screens performed using the same kinase. Correlations between duplicate screens for c-Abl, ZAP-70, and Lck are shown in the top row. Correlations between the three independent screens with c-Src are shown in the bottom row.
Figure 2 Supplementary Figure 1
Sequences of several negatively-charged tyrosine phosphorylation sites in lymphocyte scaffold proteins represented in the Human-pTyr library. Enrichment scores for phosphorylation of these sites by ZAP-70 and Lck are shown as a histogram.
Figure 2 Supplementary Figure 2
Histograms showing the distributions of enrichment scores for 2587 peptides upon phosphorylation by c-Src and Lck. ‘High-efficiency’ substrates are defined as those with enrichment scores above the dashed line in the graph.
Figure 4 Supplementary Figure 1
Sequence logos for the regions surrounding three residues that may cause divergent substrate specificities between Src-A and Src-B kinases. Logos were generated from sequence alignments of 37 Src-A or Src-B kinases across the vertebrate lineage and visualized using WebLogo (Crooks et al. 2004). Residue numbering corresponds to human c-Src and Lck sequences.
Figure 4 Supplementary Figure 2
Correlations between enrichment scores for all sequences in the Human-pTyr library upon phosphorylation by wild-type c-Src and by each of the c-Src mutants Q423E, R472P, and P488E.
Figure 4 Supplementary Figure 3
Phospho-pLogo diagrams for preferred substrates of c-Src Q423E, c-Src R472P, and c-Src P488E that are not efficiently phosphorylated by wild-type c-Src. These logos are generated from 58, 51, and 53 sequences, respectively, and are normalized to the full Human-pTyr library (2587 sequences).
Figure 4 Supplementary Figure 4
Initial velocities for phosphorylation of the five peptide sequences shown in Figure 3E by wild-type c-Src and three c-Src mutants, Q423E, R472P, and P488E. All measurements were carried out using 500 nM kinase domain and 250 μM peptide.
Figure 4 Supplementary Figure 5
Initial velocities for phosphorylation of the five peptide sequences shown in Figure 3E by two Src-A kinase domains, c-Src and Fyn, and two Src-B kinase domains, Lck and Hck. The top panel shows the absolute rate constants. The bottom panel shows all of the rates normalized to those of the universal high-efficiency substrate, peptide 3.
Figure 5 Supplementary Figure 1
Initial velocities for phosphorylation of LAT-based peptides by c-Src variants with F-G loop mutations. The two peptides correspond to LAT residues 214-233, with an Asp (wild-type) or Leu (mutant) at the -1 residue relative to Tyr 226. All measurements were carried out using 500 nM kinase domain and 250 μM peptide.
Enrichment scores for all peptides analyzed in screens with c-Src (wild-type and mutants), Lck, c-Abl, and ZAP-70. The reported values are the average of enrichment scores from all replicates with the specified kinase. In addition, peptide names and sequences are given. Peptide names correspond to the Uniprot identifier (The UniProt Consortium 2017) followed by the residue number for the central tyrosine.
Independent linear phosphorylation motifs extracted from screens of the Human-pTyr library using Motif-X (Schwartz and Gygi 2005; Chou and Schwartz 2011). For all analyses, the foreground sequence set corresponds to those sequences with an enrichment score greater than 1.5 in the bacterial surface-display screens, and the background sequence set is the full Human-pTyr library. Only motifs with a minimum occurrence of 20 and a significance threshold of 0.000001 were considered. Motifs containing multiple tyrosine residues were omitted for clarity, as they cannot be unambiguously interpreted.
Figure 5 Source Data 2.
Comparison of bacterial surface-display screens with reported kinase-substrate pairs described in a curated list from the PhosphoSitePlus database (Hornbeck et al. 2015). The kinase name, phosphosite designation, and enrichment score are given, alongside the sequence, as annotated in the PhosphoSitePlus database.