Multiple Transcript Properties Related to Translation Affect mRNA Degradation Rates in Saccharomyces cerevisiae

Degradation of mRNA contributes to variation in transcript abundance. Studies of individual mRNAs have shown that both cis and trans factors affect mRNA degradation rates. However, the factors underlying transcriptome-wide variation in mRNA degradation rates are poorly understood. We investigated the contribution of different transcript properties to transcriptome-wide degradation rate variation in the budding yeast, Saccharomyces cerevisiae, using multiple regression analysis. We find that multiple transcript properties are significantly associated with variation in mRNA degradation rates, and that a model incorporating these properties explains ∼50% of the genome-wide variance. Predictors of mRNA degradation rates include transcript length, ribosome density, biased codon usage, and GC content of the third position in codons. To experimentally validate these factors, we studied individual transcripts expressed from identical promoters. We find that decreasing ribosome density by mutating the first translational start site of a transcript increases its degradation rate. Using coding sequence variants of green fluorescent protein (GFP) that differ only at synonymous sites, we show that increased GC content of the third position of codons results in decreased rates of mRNA degradation. Thus, in steady-state conditions, a large fraction of genome-wide variation in mRNA degradation rates is determined by inherent properties of transcripts, many of which are related to translation, rather than specific regulatory mechanisms.


Keywords: codon usage; mRNA degradation; 4-thiouracil; translation
Alterations in the abundance of mRNA result from changes in both the rate of transcript synthesis and the rate of transcript degradation. Synthesis and degradation of mRNAs are critical for control of gene expression and cell survival, as ablation of either process results in rapid loss of viability (Nonet et al. 1987;Anderson and Parker 1998). The cis and trans factors that control rates of mRNA synthesis have been studied extensively in many systems (reviewed in Hager et al. 2009). By comparison, far less is known about factors that control rates of mRNA degradation. A complete understanding of gene expression regulation requires identification of the sources of variation in mRNA degradation.
Our understanding of the mechanisms by which mRNAs are degraded (reviewed in Parker 2012) is largely the result of studies of specific transcripts (Decker and Parker 1993;Muhlrad et al. 1995;Beelman et al. 1996). These studies have shown that mRNA degradation is controlled by cis factors, including sequence elements in the coding (Parker and Jacobson 1990;Wisdom and Lee 1991) and untranslated (Shaw and Kamen 1986;Muhlrad and Parker 1992) regions, as well as trans factors, including RNA binding proteins (Olivas and Parker 2000;Chen et al. 2001) and noncoding RNAs (Jing et al. 2005). However, the extent to which these different factors impact global patterns of mRNA degradation remains unclear.
Genome-wide mRNA degradation rates have been determined for a number of organisms including bacteria (Selinger et al. 2003), plants (Narsai et al. 2007), flies (Thomsen et al. 2010), and mouse (Rabani et al. 2011) and human (Duan et al. 2013) cell lines. In the budding yeast, Saccharomyces cerevisiae, genome-wide mRNA degradation rates have been measured using a variety of methods, including transcriptional inhibition (Wang et al. 2002;Grigull et al. 2004;Shalem et al. 2008), genomic run-on (García-Martínez et al. 2004), and metabolic labeling (Munchel et al. 2011;Miller et al. 2011). In general, the concordance between different global studies of mRNA degradation rates is poor, likely due to a combination of technical and biological sources of variation. Recently, we introduced RNA Approach to Equilibrium Sequencing (RATE-seq), which combines 4-thiouracil (4-tU) labeling and RNA-seq for determination of genome-wide in vivo mRNA degradation rates (Neymotin et al. 2014). Using approach-to-equilibrium labeling kinetics and nonlinear regression, RATE-seq overcomes several problems with existing methods, providing improved accuracy of mRNA degradation rate estimates in steady-state conditions.
Despite discrepancies in estimates of mRNA degradation rates among different studies, three consistent features have been demonstrated across multiple transcriptome-wide datasets. First, there is variation in the rates at which different transcripts are degraded: some vary by as much as an order of magnitude. Second, transcripts for genes encoding functionally related products have similar degradation rates (Selinger et al. 2003;Wang et al. 2002;Neymotin et al. 2014;Yang et al. 2003). Third, no single property of transcripts explains the observed variation (Narsai et al. 2007;Wang et al. 2002;Munchel et al. 2011;Miller et al. 2011). This latter point suggests that either causative factors are obscured in genome-wide studies, or a combination of different factors, each with transcript-specific effects, determines rates of degradation. Potential properties of transcripts that might impact their rate of degradation include transcript length (Dressaire et al. 2013;Duan et al. 2013), GC content (Kudla et al. 2009), transcript abundance (Dressaire et al. 2013), codon usage (Carlini 2005;Presnyak et al. 2015), and folding properties. However, testing the effect of any single property of transcripts on global degradation rates is inherently challenging as each parameter can vary independently across transcripts. At the same time, many transcript properties are correlated with each other, making it difficult to identify causative factors. Thus, to identify determinants of variation in mRNA degradation rates, all known properties of transcripts must be considered simultaneously, and experimental designs that modulate a single property are required for validation experiments.
Here, we analyzed factors that affect mRNA degradation rates in S. cerevisiae using multiple regression analysis (Eck and Stephan 2008;Duan et al. 2013). We examined the contribution to genome-wide variation in mRNA degradation rates of multiple transcript properties for which genome-wide measurements exist, or can be calculated, including protein levels (Ghaemmaghami et al. 2003), protein half-life (Belle et al. 2006), RNA abundance (Lipson et al. 2009), transcription rates (Pelechano et al. 2010), UTR lengths (Nagalakshmi et al. 2008), ribosome density (Ingolia et al. 2009), association with RNA binding proteins (Hogan et al. 2008), the codon adaptation index (CAI) (Sharp and Li 1987), and normalized translational efficiency (nTE) (Pechmann and Frydman 2013). A multiple regression model applied to mRNA degradation rates determined using RATE-seq accounts for 50% of the variation in mRNA degradation rates. Although less variation is explained when multiple regression models are applied to other genome-wide mRNA degradation datasets, many predictors are significant in multiple datasets, suggesting that they are reproducible transcript properties that impact degradation rates. These features include ribosome density, CAI, nTE, and GC content of the wobble position in codons (GC3), suggesting that translation and mRNA degradation rates are interdependent.
Using experimental studies of individual transcripts, we show that decreasing ribosome density of a transcript increases its mRNA degradation rate, resulting in decreased steady-state transcript levels. Using GFP coding sequence variants that differ only in their GC3 content, we show that coding sequence affects mRNA degradation. Increasing the GC3 content results in decreased mRNA degradation rates and increased steady-state levels. Our experimental results confirm that mRNA degradation rates are determined by multiple factors, many of which are linked to translation.

MATERIALS AND METHODS
Plasmid construction
pCM188 (Garí et al. 1997) was used as the backbone for all plasmids. This low copy CEN4 plasmid contains the URA3 gene, a constitutively expressed tetracycline transactivator, and a multiple cloning site with a CYC1 TATA region upstream, all under control of two copies of the tetracycline operator. Transcription of the gene is repressed in the presence of tetracycline or its derivatives, including doxycycline. Plasmids DGP148, DGP149, DGP231, and DGP147 are pCM188 with degenerate forms of GFP (Kudla et al. 2009), in which the proportion of GC content in the third position of each codon is 0.38 (GFP1), 0.60 (GFP2), 0.67 (GFP3), and 0.71 (GFP4). The coding sequence of each GFP was cloned into the BamHI and NotI sites. Plasmid DGP217 is pCM188 with the GAP1 coding sequence and 3′ UTR cloned into the BamHI and NotI sites. Plasmid DGP218 contains the same insert as DGP217, except the start codon of GAP1 was mutated to GTG.
For all experiments, a single colony for each strain was inoculated in synthetic complete medium without uracil, to maintain selection of the plasmids. Studies of the GAP1 transcript were performed in nitrogen-limiting medium, with proline as the limiting nitrogen source, as described in Hong and Gresham (2014). Saturated overnight cultures were back-diluted 1:50 into medium of the same composition. Cells were allowed to grow for 5.5 hr (2.5 doublings) before transcription was inhibited with doxycycline at a final concentration of 10 µg/ml. Cells were collected by filtration and snap frozen in liquid nitrogen.

Multiple regression analysis
For multiple regression analysis, we used degradation rate constants (min⁻¹), which are approximately normally distributed. To minimize the effects of extreme outliers on regression analysis, we removed mRNA degradation rates > 1.5 times the interquartile range, which resulted in the exclusion of < 5% of all genes for most datasets. Degradation rates for most datasets were calculated as ln(2)/t_half-life, except in Munchel et al. (2011) and Neymotin et al. (2014), where the effect of dilution as a result of cellular growth was also considered. Transcript counts were from Lipson et al. (2009), and estimated based on an assumption of 60,000 mRNA/cell (Zenklusen et al. 2008). Protein per mRNA was calculated as the values from Ghaemmaghami et al. (2003) divided by the values for counts. CAI (Sharp and Li 1987) and nTE (Pechmann and Frydman 2013) were calculated for each transcript based on codon frequency tables in the seqinr package in R (R Core Team 2013) or in Pechmann and Frydman (2013). As many predictors appear log-normal, we log-transformed predictor variables [log10(Variable) or log10(Variable + 1)], except for GC content of each codon position and nTE, which are approximately normally distributed, and ΔG, which is negative in value. For categorical variables, transcripts were classified as bound by an RNA-binding protein based on data from Hogan et al. (2008), and functionally annotated based on GO SLIM files downloaded from the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/).
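The preprocessing steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code used in the study (which was written in R). The function names are hypothetical, and the outlier rule is assumed to mean the conventional fences at 1.5 times the interquartile range beyond the first and third quartiles, which the text does not state explicitly.

```python
import math
import statistics

def half_life_to_rate(t_half_min):
    """Convert a half-life (min) to a first-order rate constant (min^-1)."""
    return math.log(2) / t_half_min

def drop_iqr_outliers(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the fences are placed relative to the quartiles;
    the quartile definition is the default of statistics.quantiles.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def log10_predictor(values, offset=0.0):
    """log10(Variable) when offset is 0, log10(Variable + 1) when offset is 1."""
    return [math.log10(v + offset) for v in values]
```

For example, a transcript with a 10 min half-life has a rate constant of about 0.069 min⁻¹ under this conversion, which ignores dilution by growth.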
In a linear multiple regression model, a parameter of interest is modeled as being dependent on two or more predictors. We consider the degradation rate constant as the modeled variable, and all genome-wide measurements as predictors. To build the model, we followed two separate approaches. In the first approach, we first determined the P-value of the pairwise correlation of each predictor with degradation rate. This indicates whether the regression coefficient is significantly different from zero, and thus whether the predictor has any association with degradation rate. Next, we included all predictors that have a P-value < 0.05 in the multiple regression models. We then performed stepwise removal of terms by sequentially removing the predictor with the highest P-value. The final model is then the reduced model, in which only significant terms remain. We obtained the same result using the step function in R, which reduces models based on Akaike's Information Criterion (AIC). In a second approach, we first calculated the significance of each predictor when it is the only one in the model, as above. We then added predictors to the model in order of increasing P-value. With each additional term, we checked that all of the terms in the model remained significant. If a new term was added and it was not significant, we removed it from the model. If a new term was added and a different existing predictor was no longer significant, we tested models containing either the new predictor or the one that was no longer significant, and retained the one that explained more variation. We did not add terms that were insignificant in the pairwise correlation with degradation rate. Both approaches gave similar results. Model diagnostics suggest that there is no obvious curvature, and no pattern of increasing or decreasing variance as a function of fitted values (Figure S6). There is also minimal curvature in the normal Q-Q plot, suggesting that the residuals are approximately normally distributed and that a linear model is appropriate (Figure S6).
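As a rough illustration of model reduction, the following Python sketch performs backward elimination guided by AIC, in the spirit of the step function mentioned above. It is not the code used in the study. All function names are hypothetical, the least-squares fit uses plain normal equations (adequate only for small, well-conditioned examples), and the AIC is computed up to an additive constant as n ln(RSS/n) + 2k.

```python
import math

def ols_rss(X, y):
    """Residual sum of squares for least-squares regression of y on X.

    X is a list of rows of predictor values; an intercept is added here.
    The normal equations are solved by Gauss-Jordan elimination.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(A[i][c]))
        A[c], A[piv] = A[piv], A[c]
        for i in range(p):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def aic(X, y):
    """AIC up to an additive constant; k counts predictors plus intercept."""
    n, k = len(y), len(X[0]) + 1
    return n * math.log(ols_rss(X, y) / n) + 2 * k

def backward_eliminate(data, y):
    """data maps predictor name -> list of values. Drop terms while AIC improves."""
    kept = list(data)

    def score(names):
        return aic([[data[m][i] for m in names] for i in range(len(y))], y)

    best = score(kept)
    while len(kept) > 1:
        cand, drop = min((score([m for m in kept if m != d]), d) for d in kept)
        if cand >= best:
            break
        best, kept = cand, [m for m in kept if m != drop]
    return kept
```

A predictor that does not reduce the residual sum of squares is removed because the penalty term 2k outweighs its contribution. Perfectly collinear predictors or an exact fit (RSS of zero) would break this simple sketch.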

Data availability
We performed all analyses using R (R Core Team 2013). In addition to custom written functions in R, we also used functions from the following packages: TeachingDemos, Biostrings, LSD, stringi, GeneRfold, and seqinr. The complete dataset is available as an R workspace, along with the code for performing the analysis and generating all figures at the Open Science Framework: https://osf.io/kf4u5/ (doi:10.17605/OSF.IO/KF4U5).

Multiple transcript properties affect global mRNA degradation rates
Previous studies have found evidence for the effect of specific properties of transcripts on the degradation rate of individual transcripts (Shaw and Kamen 1986; Parker and Jacobson 1990; Muhlrad and Parker 1992;Olivas and Parker 2000). We tested the relationship between globally measured transcript features (Table 1) and genome-wide mRNA degradation rates determined using RATE-seq (Table S1). We found that several transcript features are significantly correlated with mRNA degradation rates (Figure 1A and Table S2). The most significant single feature predictive of mRNA degradation rates measured using RATE-seq was the length of the coding sequence, which explains almost 30% of the variance. The folding energy (ΔG) is also significantly associated with mRNA degradation rates, which may be due to the fact that folding energy and coding sequence length are highly correlated. Several features related to the translation of transcripts were also significantly associated with mRNA degradation rates, including ribosome density, the CAI, nTE, and the GC3 content. We also tested whether the function of the encoded product is predictive of mRNA degradation rate, and found that functional assignment using gene ontology (GO) terms explains a significant fraction of the variation (Figure 1A). This is consistent with the observation that transcripts encoding proteins in similar functional categories degrade with similar kinetics (Wang et al. 2002;Neymotin et al. 2014;Yang et al. 2003). In addition, association with specific mRNA-binding proteins also explains a significant fraction of variation in mRNA degradation rates (Figure 1A). These results suggest a relationship between several features of transcripts and the rate at which they are degraded.
We observed similar relationships between some predictors and mRNA degradation rates that have been measured using other methods (Figure S1), suggesting that some of these relationships are reproducible despite the poor agreement in mRNA degradation rates among studies. Although many transcript properties are correlated with each other (Figure S2), some properties show no correlation and therefore may exert independent and differential effects on the rate of mRNA degradation. Therefore, we used multiple regression analysis to model the contribution of multiple transcript features to variation in mRNA degradation rates simultaneously (Materials and Methods). We initially built a model incorporating all factors, and used sequential reduction to arrive at a minimal model (Materials and Methods, Figure S7). We find that the explained variation when multiple transcript properties are included exceeds the variance explained by any single factor, suggesting that degradation rates are determined by a combination of transcript features ("model" in Figure 1A). The final model includes the following covariates: coding length, ΔG, ribosome density, 5′ UTR length, GC3, nTE, and protein half-life. In addition, it includes GC2 and transcription rate, despite the relatively small contribution of each of these predictors (i.e., individually, they explain < 1% of the variance). When the categorical factors of gene function and association with specific RNA-binding proteins are included, 50% of the variance in mRNA degradation rates can be explained ("model++" in Figure 1A). Thus, rates predicted by a multiple regression model are in good agreement with experimentally determined rates (Figure 1B). Models incorporating all features explain significant fractions of the variation reported in other mRNA degradation datasets, albeit with reduced explanatory power (Figure 1C). Interestingly, we find that, in general, models applied to mRNA degradation rates measured using transcriptional inhibition tend to explain much less variation than models applied to RNA degradation rates measured with metabolic labeling methods.
Our model suggests a negative relationship between mRNA degradation rates and translation elongation rates, as estimated by CAI, nTE, and ribosome density. Translation elongation rates are slowed during peptide bond formation for proline residues (Gardin et al. 2014), and particularly when multiple prolines are encoded sequentially. To further investigate the role of translation elongation rates in mRNA degradation, we classified transcripts based on presence or absence of at least four sequential proline codons. Consistent with our multiple regression model, transcripts rich in proline degrade more rapidly than the rest of the transcriptome (Figure 2A). Interestingly, stretches of proline codons are also associated with overall lower protein expression levels (Figure 2B).
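The classification by sequential proline codons can be illustrated with a short Python sketch. The function name is hypothetical and this is not the code used in the study. It assumes that the input is an in-frame coding sequence and relies on the standard genetic code, in which all four CCN codons encode proline.

```python
def has_proline_run(cds, min_run=4):
    """True if an in-frame coding sequence contains at least
    min_run consecutive proline codons (CCN)."""
    cds = cds.upper().replace("U", "T")
    run = 0
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        if cds[i:i + 2] == "CC":
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0
    return False
```

Reading codon by codon matters here: a stretch of C nucleotides that is out of frame, or interrupted by another codon, does not count as a run.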

Ribosome density affects mRNA degradation rate
Our computational analysis suggests that different aspects of translation affect mRNA degradation rates. Ribosomes are generally thought to protect mRNAs from degradation (Parker 2012). Consistent with previous analyses (Edri and Tuller 2014), we find that increased ribosome density is associated with decreased rates of mRNA degradation (Figure S3). To experimentally reduce the density of ribosomes on specific transcripts, we mutated the start codon of an endogenous transcript, GAP1, which encodes the general amino acid permease, from ATG to GTG, and placed it on a low-copy plasmid under control of doxycycline-repressible promoters (Garí et al. 1997). Addition of doxycycline has little effect on cellular physiology, and no detectable effect on global gene expression (Wishart et al. 2005). ATG start codons are required for the small ribosomal subunit to recruit the large ribosomal subunit and form complete ribosomes. Mutation of the GAP1 start codon to GTG is expected to reduce the number of ribosomes bound to GAP1 mRNAs, as translation initiation at GTG occurs with a frequency of 5% compared to ATG start codons (Kolitz et al. 2009).
We tested the GAP1 transcripts for alteration in degradation kinetics as a function of start codon mutation. In the absence of the wild-type start codon, we find that the GAP1 transcript is decreased significantly in steady-state mRNA abundance (Figure 3A), and that the transcript degrades more rapidly upon addition of doxycycline to repress transcription initiation (Figure 3, B and C). This suggests that a decrease in ribosome density results in an increase in the degradation rate of the GAP1 transcript, consistent with the global trend identified in our computational analysis.
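One way to obtain a degradation rate from such a transcriptional shut-off time course is to regress the logarithm of transcript abundance on time. The Python sketch below is illustrative only and is not the fitting procedure used in the study. It assumes simple first-order decay that begins immediately after doxycycline addition, with no lag and no correction for dilution by growth; the function names are hypothetical.

```python
import math

def decay_rate(times_min, abundances):
    """Least-squares slope of ln(abundance) against time.

    Returns the first-order rate constant k = -slope (min^-1).
    """
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    t_mean = sum(times_min) / n
    l_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times_min, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    return -sxy / sxx

def half_life(k):
    """Half-life (min) corresponding to a rate constant k (min^-1)."""
    return math.log(2) / k
```

Under these assumptions, a steeper decline in log abundance for the mutant transcript than for the wild type corresponds to a larger k and a shorter half-life.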
Decreased GC content of the third codon position increases the rate of mRNA degradation
Our multiple regression model predicts that factors involved in translation, including ribosome density, CAI, nTE, and the GC3 content, contribute to variation in mRNA degradation rates (Figure 1A). GC3 content has been reported to affect mRNA abundance and mRNA degradation in Escherichia coli (Kudla et al. 2009). In mammalian cells, GC3 content was also found to affect mRNA levels, but not degradation rates (Kudla et al. 2006), implying that mRNA synthesis or processing must underlie differences in mRNA levels. However, a more recent genome-wide study found evidence that decreased GC3 content is correlated with increased mRNA degradation rates (Duan et al. 2013).
To study the contribution of GC3 content to variation in mRNA degradation rates, we used GFP constructs that differ in sequence at synonymous sites only (Kudla et al. 2009). We studied four GFP transcripts that span a range of GC3 content from 38 to 71% (Figure 4A). Changes in GC3 content also result in overall variation in total GC content (Figure 4A). Coding sequences were placed under control of the identical doxycycline-regulated promoter on low copy plasmids, and engineered to have the same UTRs. We confirmed that all four constructs result in functional GFP expression (data not shown).
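For reference, GC3 and total GC content can be computed from a coding sequence as in the following Python sketch. The function names are hypothetical, the sequence is assumed to be in frame and to contain at least one codon, and whether the stop codon is included is left to the caller.

```python
def gc3(cds):
    """Fraction of codons whose third position is G or C."""
    cds = cds.upper().replace("U", "T")
    # Third base of every complete codon; a trailing partial codon is ignored.
    thirds = [cds[i + 2] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return sum(base in "GC" for base in thirds) / len(thirds)

def gc_total(seq):
    """Fraction of all positions that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)
```

Because synonymous changes mostly fall at third positions, variants that differ in GC3 will also differ in total GC content, though by a smaller amount.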
As all coding sequences are expressed from an identical promoter, differences in steady-state mRNA abundance must result from differences in rates of degradation, synthesis, or processing. We find that steady-state mRNA levels vary with GC3 content, with the highest GC3 content resulting in the highest steady-state mRNA abundance (Figure 4B), consistent with observations in mammalian cells (Kudla et al. 2006). Following addition of doxycycline to repress transcription initiation, we confirmed that three of the four GFP-encoding transcripts degrade differentially in a GC3-dependent manner (Figure 4, C and D), consistent with our multiple regression prediction. Using the measured steady-state abundances and degradation rates for the three transcripts, we estimated synthesis rates. All three strains have similar estimated rates of synthesis, consistent with differences in degradation rates being the primary determinant of differences in steady-state mRNA levels. We find that a fourth construct, which has much lower GC3 content, and the lowest steady-state abundance, does not significantly differ in its mRNA degradation rate from the second lowest GC3 (Figure S4). This may reflect a limitation of the sensitivity of our assay, or the fact that other factors are likely to interact with the effect of GC3 content.
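The estimate of synthesis rates from steady-state abundance and degradation rate follows from a simple kinetic model. The derivation below is a sketch under stated assumptions: a constant synthesis rate and first-order loss. The symbols m (transcript abundance), s (synthesis rate), k (loss rate constant), and m_ss (steady-state abundance) are introduced here for illustration, and whether k should also include dilution by growth is not specified in this passage.

```latex
\begin{align}
\frac{dm}{dt} &= s - k\,m \\
0 &= s - k\,m_{ss} \quad \text{(steady state)} \\
s &= k\,m_{ss}
\end{align}
```

Under this model, two transcripts with equal synthesis rates but different degradation rates differ in steady-state abundance in inverse proportion to k.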

DISCUSSION
The abundance of mRNAs is determined by both the rate at which they are synthesized and the rate at which they are degraded. In this study, we sought to construct a comprehensive model that predicts genome-wide variation in mRNA degradation rates. Using multiple regression analysis, we find that 43% of the variation in mRNA degradation rates determined using RATE-seq can be explained by considering multiple properties of transcripts in a single model. By including association with specific RNA-binding proteins and the function of the encoded product, 50% of the genome-wide variation in mRNA degradation rates can be explained. Interestingly, we find that, in general, models applied to degradation rates measured using transcriptional inhibition tend to explain less variation than models applied to rates measured using less disruptive methods. This may reflect the fact that metabolic labeling methods, which perturb the cell only minimally, yield more physiologically relevant degradation rates than transcriptional inhibition, which results in cell death.
Figure 2 The presence of multiple proline codons affects degradation rates and protein production. (A) Box plots of the mRNA degradation rates of proline-rich and proline-poor proteins. Transcripts that contain sequential proline codons, which slow translation elongation rates, tend to be less stable than other transcripts (Wilcoxon rank sum test, P = 7.5 × 10⁻³²). (B) The abundance of poly-proline containing proteins is reduced compared to the global distribution of protein abundances.
In our analysis of mRNA degradation rates measured using RATE-seq, coding sequence length is the strongest single predictor of mRNA degradation rates: in general, the longer a transcript, the more rapidly it is degraded. Other genome-wide investigations have shown a positive relationship between the length of the mature mRNA and its rate of degradation (Feng and Niu 2007;Dressaire et al. 2013;Duan et al. 2013;Geisberg et al. 2014). In our original study using RATE-seq, we showed that estimates of mRNA degradation rates are not biased by transcript length (Neymotin et al. 2014). In the related DTA method of Miller et al. (2011), coding length is regressed out of estimates of mRNA degradation, and, therefore, the effect of coding length is no longer apparent in the resulting data. Studies of individual transcripts have shown that increasing transcript length by addition of specific sequences containing "instability elements" enhances a transcript's rate of degradation (Caponigro et al. 1993). Therefore, it is possible that it is not the length of the transcript that affects its degradation, but the presence of additional regulatory elements, to which trans factors such as RNA-binding proteins can bind, promoting their degradation.

Association with ribosomes increases mRNA stability
Our multiple regression analysis provided evidence that protein translation impacts mRNA degradation rates. Previous studies regarding the role of translation initiation rates in mRNA degradation have shown differing results. In a detailed study of the CYC1 transcript, all start codons were removed from its coding region, thereby preventing ribosome binding and translation (Yun and Sherman 1996). Following global transcriptional inhibition with thiolutin, the untranslated transcript degraded with similar kinetics to the translated transcript. Similarly, when translation of the MFA2 transcript was inhibited by introducing a strong secondary structure in the 5′ region of the transcript, it was not found to alter the degradation kinetics following transcriptional inhibition using the GAL system (Beelman and Parker 1994). By contrast, using the same method to reduce translation of the PGK1 transcript results in an increased mRNA degradation rate (Muhlrad et al. 1995). Consistent with a recent report (Edri and Tuller 2014), our analysis shows that genome-wide increased association with translation machinery, as measured by ribosome density, is correlated with decreased mRNA degradation rates (Figure 1 and Figure S3).
To validate experimentally the effect of translation initiation rates and ribosome density on the rate of mRNA degradation, we mutated the start codon of GAP1 from ATG to GTG. Consistent with results from our genome-wide study, loss of the wild-type start codon results in reduced mRNA levels, and an increased rate of mRNA degradation. Using a functional assay for growth inhibition in the presence of D-amino-acids, which are toxic to yeast cells and transported by GAP1, we found that mutation of the start codon to GTG does not result in a complete loss of protein function (Figure S5). This may be attributable to sufficient full-length protein expression from translation initiated at the GTG start codon to confer D-amino-acid sensitivity. In addition, a potential in-frame start codon is located 288 nucleotides (96 amino acids) downstream of the wild-type start codon, and its use may result in a protein product that retains much of the wild-type GAP1 activity, although the first 96 amino acids of GAP1 are known to affect the functionality and localization of the encoded permease (Merhi et al. 2011). An out-of-frame ORF between codons 110 and 187 could potentially promote nonsense-mediated decay (NMD) of mutant transcripts. Therefore, in addition to a decrease in ribosome density, we cannot exclude the possibility that alterations in protein function and/or NMD contribute to the decreased stability of the mutant transcript.
We searched for common transcript features to understand why a subset of the transcripts (4%) fit the model poorly (> 2 SD greater than the mean residual value) (Figure S8). We find that the outlier transcripts, which are degraded more rapidly than expected, are shorter, have lower ribosome density, and are less translationally efficient (based on normalized translational efficiency score). Thus, although shorter transcripts tend to be more stable, the properties of these outliers suggest that reduced translation of short transcripts results in faster degradation, consistent with our overall model.
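The selection of poorly fit transcripts can be expressed as a simple threshold on model residuals. The Python sketch below is illustrative and not the code used in the study. It assumes that the residual is defined as the observed rate minus the predicted rate, so that transcripts degraded more rapidly than expected have positive residuals; the function name and gene labels are hypothetical.

```python
import statistics

def poorly_fit(residuals, n_sd=2.0):
    """Names of transcripts whose residual exceeds mean + n_sd * SD.

    residuals maps transcript name -> (observed - predicted) rate.
    """
    values = list(residuals.values())
    cutoff = statistics.mean(values) + n_sd * statistics.stdev(values)
    return sorted(name for name, r in residuals.items() if r > cutoff)
```

The features of the transcripts returned by such a filter can then be compared with those of the remaining transcripts.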

Synonymous coding mutations affect mRNA degradation rates
Regression analysis suggested genome-wide relationships between codon usage and mRNA degradation rates. We find a negative correlation between two different measures of biased codon usage, CAI and nTE, and mRNA degradation rate. Bias toward more frequent codons in a transcript may increase the elongation rate during translation (Plotkin and Kudla 2011). Therefore, this negative correlation suggests that faster elongation by ribosomes may result in decreased rates of mRNA degradation. This observation is consistent with the recent report by Presnyak et al. (2015), who showed that increased occurrence of optimized codons, as defined by nTE, in transcripts, results in decreased degradation rates. We note that, although the effect of nTE on mRNA degradation rates is significant, it explains only 7% of the genome-wide variance in both RATE-seq data and the data generated by Presnyak et al. (2015), who used an RNA polymerase II temperature-sensitive mutant for transcriptional shut-off (Table S2). We also find evidence that the presence of multiple sequential proline codons, which slows translation elongation, results in faster mRNA degradation, further supporting an effect of translation elongation rates on mRNA degradation rates.
Figure 3 The effect of ribosome density on GAP1 mRNA degradation rate. Mutation of the wild-type start codon in GAP1 from ATG to GTG results in (A) a reduced steady-state transcript level, and (B) an increased mRNA degradation rate. In (B) we show the average value for each time point ± SEM. In (C) we show bootstrapped CI for the regression of all data points. Solid lines indicate the line of best fit, and dotted lines indicate CI. In blue is the transcript with a wild-type start codon. In red is the transcript with a mutated start codon.
By studying genome-wide degradation rates, we find evidence that the GC3 content is correlated negatively with rates of mRNA degradation. Despite the comparatively small amount of variance explained by GC3, the effect is significant and consistent with an earlier study that found a positive correlation between synonymous A|T dinucleotides spanning codon boundaries and mRNA degradation rates (Carlini 2005). To validate this effect experimentally, we tested the effect of GC3 content on mRNA degradation using GFP-encoding transcripts that differ in GC3 content. Consistent with our genome-wide analysis, we found that decreasing the GC3 content resulted in increased mRNA degradation rates, and lowered steady-state abundances. We surmise that the effect is not related to overall GC content of transcripts, as the GC content of the first (GC1) and second (GC2) position in codons explains much less variance. Moreover, the effect of GC3 on mRNA degradation rate appears to be independent of the effect of codon bias, as the four GFP transcripts investigated do not show a systematic difference in CAI or nTE.
Figure 4 GC3 content affects degradation kinetics and steady-state levels. (A) Four GFP transcripts containing synonymous mutations span a range of GC3 content (left) and overall GC (right). Nucleotide-level alignment of each GFP-encoding transcript relative to GFP2. Positions of similarity in sequence are depicted by a gray line, and sites of synonymous mutations are indicated with colored triangles. (B) Differences in GC3 content affect steady-state levels of transcripts. (C, D) GFP2 (red), GFP3 (yellow), and GFP4 (blue) degrade in a GC3-dependent manner. In (C) we show the average value for each time point ± SEM. In (D) we show bootstrapped 95% confidence intervals for the regression of all data points. Solid lines indicate the line of best fit, and dotted lines indicate confidence intervals.
Recently, increased GC3 content has been suggested to decrease mRNA degradation rates in human lymphoblastoid cells, and possibly explain variation in mRNA degradation rates between individuals (Duan et al. 2013). Thus, the relationship between GC3 content and mRNA degradation rates may be widely conserved in eukaryotes.

Conclusion
Our study shows that genome-wide variation in mRNA degradation rates is best explained by a combination of different transcript features, as suggested more than two decades ago (Caponigro et al. 1993). Many of the factors that affect genome-wide patterns of mRNA degradation rates are related to protein production, highlighting the close relationship between mRNA degradation and translation. Our statistical analysis and experimental validation confirm the effect of ribosome density and GC3 content on mRNA degradation rates. Whereas mutation of a start codon likely changes translation initiation rates, modifying GC3 within a transcript is likely to alter translation elongation rates. Thus, both translation initiation and elongation rates may impact mRNA decay rates, but perhaps through different degradation pathways.