Tuesday, 23 August 2011

Surface-core association and the evolution of proteins

It is well known that amino acid replacement is faster at the surface of a protein than in its core. A recent study by Agnes Toth-Petroczy and Dan Tawfik, published in PNAS, gives a more detailed view of protein evolution. Starting from 10 sequenced yeast genomes, they compiled a set of nearly 4,000 families of orthologous proteins. They estimated the evolutionary rate at each site in each family with Rate4Site, a program developed by Tal Pupko. They then assigned each site a structural position (core or surface) in 382 protein domains. Compared with the 3,778 sequences, that number suggests we may still be far from covering the whole structural space.

They found that positions fall into three peaks according to their evolutionary rate: a slow peak, a fast peak and a very fast peak:
Fig. 2: Canonical types of positional rate distributions. To illustrate the canonical peaks, the computed distributions (black dots) were fitted by superposition of Gaussian peaks (in gray). The backbone structures mark residues by positional rates (in cyan, slow evolving, log2μ <0.5; in pink, log2μ= (0.5–2); in red, fast-evolving residues, log2μ >2). PDB codes and average rates per protein (μP, in red) are denoted above each distribution. (A) Proteins evolving at average rates, such as adenylate kinase, exhibit a large slow peak at an average positional rate of about −2.0, and smaller and broader fast peak at log2μ ∼ 0.5. (B) Carboxypeptidase exemplifies fast-evolving proteins that, in addition to the slow and fast peaks, exhibit a very fast peak with an average log2μ of ∼2.0. (C) Ubiquitin-conjugating enzyme RAD6 represents slowly evolving proteins in which the −2.0 peak is absent, and a very slow peak at −4.0 appears. (D) Histograms of positional rates for BNI1: its structured domain (PDB code: 1UX5; in black); the complete gene (dashed line), and its predicted disordered domains (∼43% of the protein; in red). The majority of disordered residues (89%) appear in the very fast peak. (From doi:10.1073/pnas.1015994108)
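To make the colour coding in the caption concrete, here is a minimal Python sketch that bins per-site rates using the two thresholds quoted there (log2μ below 0.5, between 0.5 and 2, above 2). This is not the authors' code. The function names, the label "intermediate" for the pink class, the choice to put the boundary values 0.5 and 2 into that middle class, and the example rates are all my own assumptions for illustration.

```python
from collections import Counter

# Thresholds taken from the figure caption (values are log2 of the positional rate).
SLOW_MAX = 0.5   # below this: cyan, slow evolving
FAST_MIN = 2.0   # above this: red, fast evolving


def classify_rate(log2_mu: float) -> str:
    """Return the colour class of one site given its log2 positional rate.

    Assumption: the boundary values 0.5 and 2.0 are placed in the middle
    class, since the caption does not say which side they belong to.
    """
    if log2_mu < SLOW_MAX:
        return "slow"
    if log2_mu > FAST_MIN:
        return "fast"
    return "intermediate"


def summarize(log2_rates):
    """Count how many sites fall into each class."""
    return Counter(classify_rate(r) for r in log2_rates)


# Hypothetical log2 rates for six sites, made up for this example.
example = [-4.0, -2.0, 0.5, 1.2, 2.0, 2.5]
counts = summarize(example)
print(counts)
```

Note that these fixed cut-offs only reproduce the colouring of the structures. The peaks themselves were obtained in the paper by fitting Gaussians to the rate distributions, which this sketch does not attempt.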


As expected, the slow peak corresponds roughly to the core, the fast peak to the surface, and the very fast peak to the disordered regions. Some surface residues evolve as slowly as those in the core, probably because they are involved in protein-protein interactions or could be involved in aggregation or misfolding. Such residues are said to be under evolutionary constraint.

In my opinion, the main point of this study is the observation that the core of a protein tends to evolve mainly when the surface evolves too, as if the two were connected. Or, as the authors state:

"a change in one residue, most frequently a surface residue, facilitates sequence changes in contacting residues, and most notably in core residues."

Reference:
Tóth-Petróczy A and Tawfik DS. (2011) Slow protein evolutionary rates are dictated by surface-core association. Proc Natl Acad Sci U S A. 108(27):11151-6. Epub 2011 Jun 20. doi:10.1073/pnas.1015994108
