Wednesday, February 07, 2007


Human proteins evolving slowly thanks to multi-tasking genes

Many human proteins are not as good as they might be because the gene sequences that code for them have a double role, which slows the rate at which they evolve, according to new research published in PLoS Biology.

By tweaking these dual role regions, scientists could develop gene therapy techniques that produce proteins that are even better than those found in nature, and could one day be used to help people recover from genetic disorders.

The stretch of DNA that codes for a specific protein is often interrupted by sections of apparently useless DNA, known as introns, which must be edited out of the gene's RNA copy before a new protein can be produced.

It has recently been discovered that some of the instructions on where to cut out the introns and splice the remaining pieces back together are contained in the coding sections, or exons, themselves.

So, as well as spelling out which amino acids are needed to produce a specific protein, the part of the exon immediately next to the intron contains information that is essential for removing the intron correctly.

This means that these parts of genes evolve particularly slowly, so the proteins they encode are not as good as they could have been had evolution been freer to improve them over time.

"Our research suggests that a gene with many exons would evolve at under half the rate of the same one that had no introns, simply owing to the need to specify where to remove introns," said Professor Laurence Hurst from the University of Bath (UK), who worked with colleagues from the University of Lausanne (Switzerland) on the project.

"This is one of the strongest predictors of rates of protein evolution known, indicating that this dual coding role is vastly more influential than previously believed."

The finding could have major implications for medicine and the development of gene therapy techniques in which people with a defective gene are given the correct version.

"Our results suggest that we could make the replacement gene even better than the normal version," said Professor Hurst, from the Department of Biology and Biochemistry at the University of Bath.

"We would just need to remove the introns and tweak the protein at the sites that were dual coding.

"We also found that genes that have lost their introns many millions of years ago evolve especially fast near where the introns once resided.

"This indicates that this tweaking of the dual role sections of genes is also what evolution does when introns are removed."

The research was funded by the Biotechnology and Biological Sciences Research Council, the Swiss National Science Foundation and the Center for Integrative Genomics at the University of Lausanne.

Source: University of Bath, 6 February 2007


Based on the paper "Splicing and the Evolution of Proteins in Mammals"


Parmley JL, Urrutia AO, Potrzebowski L, Kaessmann H, Hurst LD (2007) Splicing and the Evolution of Proteins in Mammals. PLoS Biol 5(2): e14 doi:10.1371/journal.pbio.0050014


Author Summary

Most of the DNA in our genes is actually not involved in the specification of proteins. Rather, the bits with the protein-coding information (exons) are separated from each other by noncoding bits, introns. Before a gene can be translated into protein these introns are removed and the exons are spliced back together to be translated into protein. While information about which DNA to remove is largely in the introns themselves, parts of the exons near the intron-exon boundary can, for example, function as splice enhancer elements. In principle, then, these parts of exons have two functions: to specify the amino acids of the resulting protein and to enable the correct removal of introns. What impact might this have on a gene's evolution? We show that near intron-exon boundaries, amino acid usage is biased towards nucleotides involved in splice control. Moreover, these parts of genes evolve especially slowly. Indeed, we estimate that a gene with many exons would evolve at under half the rate of the same gene with no introns, simply owing to the need to specify where to remove introns. Likewise, genes that have lost their introns evolve especially fast near the former intron's location. Thus, human proteins may not be as optimised as they could be, as their sequence is serving two conflicting roles.


It is often supposed that a protein's rate of evolution and its amino acid content are determined by the function and anatomy of the protein. Here we examine an alternative possibility, namely that the requirement to specify in the unprocessed RNA, in the vicinity of intron-exon boundaries, information necessary for removal of introns (e.g., exonic splice enhancers) affects both amino acid usage and rates of protein evolution. We find that the majority of amino acids show skewed usage near intron-exon boundaries, and that differences in the trends for the 2-fold and 4-fold blocks of both arginine and leucine show this to be owing to effects mediated at the nucleotide level. More specifically, there is a robust relationship between the extent to which an amino acid is preferred/avoided near boundaries and its enrichment/paucity in splice enhancers. As might then be expected, the rate of evolution is lowest near intron-exon boundaries, at least in part owing to splice enhancers, such that domains flanking intron-exon junctions evolve on average at under half the rate of exon centres from the same gene. In contrast, the rate of evolution of intronless retrogenes is highest near the domains where intron-exon junctions previously resided. The proportion of sequence near intron-exon boundaries is one of the stronger predictors of a protein's rate of evolution in mammals yet described. We conclude that after intron insertion selection favours modification of amino acid content near intron-exon junctions, so as to enable efficient intron removal, these changes then being subject to strong purifying selection even if nonoptimal for protein function. Thus there exists a strong force operating on protein evolution in mammals that is not explained directly in terms of the biology of the protein.


Why do some parts of proteins evolve more slowly than others? Why, in turn, do some proteins evolve more slowly than others? Intragenic conserved regions are typically considered to reflect domains of functional importance to the protein [1]. Similarly, proteins with a high density of important functional sites should evolve slowly. There are, however, potentially multiple other correlates to rates of protein evolution [1]. The expression parameters of a gene (rate of expression, protein abundance, and number of tissues in which a gene is expressed) are consistently reported to be important predictors [2-5]. This may in part reflect selection to resist mistranslation [6]. Other possible covariates include essentiality and the number of protein interactions, but the issues here are more contentious, not least because of covariance with expression parameters [7-17]. Here we test the hypothesis that selection acting to ensure that introns are correctly removed skews amino acid content in predictable ways and imposes constraints on rates of protein evolution.

In mammalian genes, which are rich in introns [18], correct removal of introns often requires the presence, in the flanking exons, of splice-enhancer domains, these being short (six nucleotide) blocks required for binding of serine/arginine-rich proteins [19]. The need for splice enhancers can impact the use of synonymous codons in the domains flanking intron-exon junctions, such that when a synonymous codon is used commonly in splice enhancers it is preferred over its less commonly used synonym [20,21]. Moreover, selection to preserve splice enhancers affects both the synonymous single nucleotide polymorphism profile [22,23] and the rate of evolution at synonymous sites of splice-enhancer-associated domains [24].

Might the same forces also act to cause skews in amino acid usage in the vicinity of intron-exon junctions? In a preliminary analysis, we showed that there is a tendency for enrichment near boundaries of an amino acid whose codons are common in splice enhancers: lysine is coded by AAA and AAG, both of which are common in splice enhancers, and at both 5' and 3' ends of exons, lysine's proportional usage increases [24]. Is it more generally the case that an amino acid's usage increases near intron-exon junctions if it commonly features in splice enhancers? Conversely, are some amino acids avoided near such boundaries if they are rare in splice-enhancer domains? To address these issues, we derive patterns of amino acid preference in the vicinity of intron-exon boundaries and compare these patterns with a metric of enrichment of amino acids in splice enhancers relative to rates of usage in the genome. In turn, we ask whether selective constraints are stronger near intron-exon boundaries, and whether such constraints explain much of the variation between proteins in their rate of evolution.
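For readers who want a feel for what "amino acid usage near boundaries" means in practice, here is a rough Python sketch. It is not the authors' pipeline: the function names, the toy exon sequences, the frequencies passed to the enrichment function, and the choice of a log2 ratio as the enrichment metric are all assumptions made for illustration. It tallies amino acids by their distance (in residues) from the nearest end of an exon, then reports how one amino acid's share changes with that distance.

```python
import math
from collections import Counter


def usage_by_distance(exons, max_dist=3):
    """Tally amino acids by distance (in residues) from the nearest exon end.

    `exons` is a list of amino-acid strings, one per exon (hypothetical toy
    input). Distances greater than `max_dist` are pooled into the last bin,
    which stands in for the exon centre.
    """
    counts = {d: Counter() for d in range(max_dist + 1)}
    for exon in exons:
        n = len(exon)
        for i, aa in enumerate(exon):
            dist = min(i, n - 1 - i)          # distance to the nearer end
            counts[min(dist, max_dist)][aa] += 1
    return counts


def proportion(counts, aa):
    """Share of amino acid `aa` in each distance bin (0.0 for empty bins)."""
    return {
        d: (c[aa] / sum(c.values()) if c else 0.0)
        for d, c in counts.items()
    }


def log_enrichment(freq_in_enhancers, freq_in_genome):
    """Assumed metric: log2 ratio of usage in enhancers to genome-wide usage."""
    return math.log2(freq_in_enhancers / freq_in_genome)


# Toy exons (invented, not real data) with lysine (K) placed at both ends.
toy_exons = ["KAGLLGEK", "KELVVAAK"]
counts = usage_by_distance(toy_exons)
lys = proportion(counts, "K")
for d in sorted(lys):
    print(f"distance {d}: lysine share = {lys[d]:.2f}")
```

On real data one would compare each amino acid's trend across the distance bins with its enrichment in splice enhancers, which is the relationship the paper reports; the toy input here only shows the bookkeeping.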


Recent posts include:

"Developmental Biology: Special Issue on the Sea Urchin Genome"

"Genetics of eye colour unlocked"

"'Silent mutations' may not always be silent..."

"Balancing Robustness and Evolvability"

"Evolution: RNA Silencing Sheds Light on the RNA World"
