Response of Polygenic Traits Under Stabilizing Selection and Mutation When Loci Have Unequal Effects

We consider an infinitely large population under stabilizing selection and mutation in which the allelic effects determining a polygenic trait vary between loci. We obtain analytical expressions for the stationary genetic variance as a function of the distribution of effects, mutation rate, and selection coefficient. We also study the dynamics of the allele frequencies, focusing on short-term evolution of the phenotypic mean as it approaches the optimum after an environmental change. We find that when most effects are small, the genetic variance does not change appreciably during adaptation, and the time until the phenotypic mean reaches the optimum is short if the number of loci is large. However, when most effects are large, the change of the variance during the adaptive process cannot be neglected. In this case, the short-term dynamics may be described by those of a few loci of large effect. Our results may be used to understand polygenic selection driving rapid adaptation.

ABSTRACT We consider an infinitely large population under stabilizing selection and mutation in which the allelic effects determining a polygenic trait vary between loci. We obtain analytical expressions for the stationary genetic variance as a function of the distribution of effects, mutation rate, and selection coefficient. We also study the dynamics of the allele frequencies, focusing on short-term evolution of the phenotypic mean as it approaches the optimum after an environmental change. We find that when most effects are small, the genetic variance does not change appreciably during adaptation, and the time until the phenotypic mean reaches the optimum is short if the number of loci is large. However, when most effects are large, the change of the variance during the adaptive process cannot be neglected. In this case, the short-term dynamics may be described by those of a few loci of large effect. Our results may be used to understand polygenic selection driving rapid adaptation. KEYWORDS polygenic selection mutation unequal effects dynamics genetic variance rapid adaptation In the study of fast adaptation, polygenic selection may be more important than selection on single genes. At single genes, strong selection driving fast adaptation generally leads to the rapid fixation of beneficial alleles or at least large allele frequency shifts between populations. Classical examples of this type of adaptation are the case of industrial melanism in moths (van't Hof et al. 2011) and insecticide resistance in Drosophila (Daborn et al. 2002). However, in many (if not most) cases of fast adaptation, such as in island lizards that are able to adapt very quickly to a changing vegetation [e.g., revealed in an experimental evolution study by Kolbe et al. (2012)], small shifts of allele frequencies at many loci may be sufficient to move a phenotype toward a new optimum under changed environmental conditions. There is a large and growing body of literature on the detection of adaptive signatures in molecular population genetics. Following pioneering work of Maynard Smith and Haigh (1974), the impact of positive selection on neutral DNA variability (selective sweeps) has attracted much interest. This theory has been applied to huge datasets that emerge from modern high-throughput sequencing. A large number of statistical tests have been developed to detect sweep signals and estimate the frequency and strength of selection (Kim and Stephan 2002;Nielsen et al. 2005;Pavlidis et al. 2010). However, most theory so far excludes the phenotypic side of the adaptive process (except for fitness). Usually, selection is simply modeled as a constant force that acts on a new allele at a single locus. This is in striking contrast to the classical phenotype-based models of adaptation that are successfully used in quantitative genetics (Barton and Keightley 2002). These models typically assume that adaptations are based on allele frequency shifts of small or moderate size at a large number of loci. Also, adaptation does not require new mutations, at least in the short term. Instead, selection uses alleles that are found in the standing genetic variation. Genome-wide data of the past few years show that this quantitative genetic view is relevant. In particular, association studies confirm that quantitative traits are typically highly polygenic. High heritabilities most probably result from standing genetic variation at a large number of loci with small individual effect. Also, local adaptation to environmental clines involves moderate frequency shifts at multiple loci (Hancock et al. 2010). As a consequence, there is growing evidence that the molecular scenario of sweeps only covers part of the adaptive process and needs to be revised to include polygenic selection.
Because genome-wide association studies (GWAS) yield information about the distribution of single-nucleotide polymorphisms relevant to quantitative traits (Visscher et al. 2012), it is important to understand the models of polygenic selection in terms of the frequency changes of molecular variants, i.e., in terms of population genetics. So far, however, the dynamics of only very simple polygenic models have been studied and applied to data [e.g., Turchin et al. (2012)]. In this article, we analyze the dynamics of a quantitative trait given by a much more general model that was originally proposed by Wright (1935) and recently re-visited by de Vladar and Barton (2014). These authors consider an infinitely large population evolving under stabilizing selection and mutation. Following the observations from many empirical studies [particularly from biomedicine, see Visscher et al. (2012)], they assume that the effects are locus-dependent. Furthermore, they consider additivity of the effects and linkage equilibrium between loci.
Here we study this model and obtain some analytical results on the stationary genetic variance and also the dynamics of the phenotypic mean. We show that the stationary genetic variance may exhibit nonmonotonic dependence on the shape of the distribution of effects. We also study how the trajectories of the allele frequencies and the mean trait respond to a sudden environmental shift. When most effects are small, as is the case in experiments on Drosophila (Mackay 2004), in livestock (Hayes and Goddard 2001;Goddard and Hayes 2009), and for human height (Visscher 2008), a simple analysis shows that the magnitude of the deviation of the phenotypic mean from the optimum decays roughly exponentially with time and approaches zero over a time scale that is inversely proportional to the initial genetic variance. When most effects are large, the short-term dynamics of the mean and variance can be understood by considering a few loci with large effects.

MODEL WITH LOCUS-DEPENDENT EFFECTS
We consider the ℓ-locus model recently analyzed by de Vladar and Barton (2014) where each locus is biallelic. The þ allele at site i has frequency p i whereas the 2 allele occurs with frequency q i = 1 2 p i . The effects are assumed to be additive so that the trait value is z ¼ P ℓ i¼1 sgnðiÞ g i , where sgnðiÞ ¼ 61 denotes the sign of the genotypic value of locus i and g i . 0 is the effect of the allele at the ith locus. The loci are assumed to be in linkage equilibrium so that the joint distribution of effects at the loci factorises. As a result, the nth cumulant c n of the phenotypic effect, obtained on averaging over the population distribution, can be written as the sum over the corresponding quantities at individual loci. The first three cumulants viz. mean c 1 , variance c 2 , and skewness c 3 , are given by the following (Bürger 1991): The allele frequency evolves in time under selective pressure and is given by the following (Barton 1986): where w is the average fitness of the population. For large ℓ, as the trait value z of an individual can be treated as a continuous variable, from (1a) and (1b), we obtain w ¼ where the approximate equality sign holds because s is assumed to be small. In the above expression, wðzÞ ¼ 1 2 ðs=2Þðz2z o Þ 2 is the fitness distribution of the phenotypic trait under stabilizing selection, z 0 the phenotypic optimum and Dc 1 ¼ c 1 2 z o the mean deviation from z 0 . Thus the maximum fitness (namely, one) is obtained when the population is at the phenotypic optimum and has no genetic variance. Inserting equations (1) and (3) in (2) and accounting for mutations, we obtain the following basic equation for the evolution of allele frequencies, where m is the probability of (symmetric) mutation between the þ and 2 allele at locus i. Note that the equation (1) of de Vladar and Barton (2014) is obtained by replacing s by 2s in the above equation.
On the right-hand side (RHS) of (4), the first term (in the first parenthesis) expressing the mean deviation from the optimum corresponds to directional selection toward the phenotypic optimum: if the mean is above (below) the optimum, the allele frequencies decrease (increase). However, once the phenotypic mean is sufficiently close to the optimum, stabilizing selection (described by the second term) takes over.
One of the difficulties in solving (4) is that it involves the mean c 1 , which depends on all the allele frequencies. Moreover, it has been shown that the differential equations for the cumulants do not close: each one not only involves two higher cumulants but also contains terms that cannot be written in terms of other cumulants (Barton and Turelli 1987;Bürger 1991).

GENETIC VARIANCE IN THE STATIONARY STATE
In the stationary state in which the left-hand side of (4) vanishes, if the mean c Ã 1 ¼ z o , the allele frequency p Ã i has three solutions, namely 1=2 and 16 The latter two solutions are stable for g i .ĝ, and therefore the allele frequency is close to fixation when the effects are large. For g i ,ĝ, the effects are small and the stationary state solution p Ã i ¼ 1=2 is the only stable solution for the allele frequency (de Vladar and Barton 2014). From these results, the stationary genetic variance (1b) is easily seen to be (de Vladar and Barton 2014) where, for large ℓ, the number of effects larger thanĝ is given by n l ¼ ℓ R N g dg pðgÞ with pðgÞ being the distribution of effects. Thus the genetic variance in the stationary state can be neatly written as where the first (second) term is the contribution from loci with large (small) effects. The gamma distribution pðgÞ $ g k21 e 2kg= g with shape parameter k . 0 and mean g has been used to fit the distribution of quantitative trait loci effects (Hayes and Goddard 2001). This distribution is L-shaped for k , 1 and bell-shaped for k . 1, whereas for k ¼ 1, it is an exponential function. For the gamma distribution, the stationary genetic variance (6) is given by where g r ¼ĝ= g and Gða; bÞ ¼ R N b dt t a21 e 2t is the incomplete gamma function. For the special case of exponentially distributed effects ðk ¼ 1Þ, (7) simplifies to give c Ã 2 ¼ ℓ g 2 ð1 2 e 2g r ð1 þ g r ÞÞ. For g )ĝ, the House of Cards (HoC) variance, namely c Ã 2 ¼ ℓĝ 2 =2 ¼ 4mℓ=s (Turelli 1984), is obtained whenĝ is finite but g/N for all k. In the opposite case ð g (ĝÞ, we have c Ã 2 ¼ ℓ g 2 ðk þ 1Þ=ð2kÞ, which depends on the shape of the distribution of effects. For fixed mean g, the genetic variance increases monotonically with the scaleĝ because a larger mutation probability increases the variance. If instead the distribution mean g is increased keepingĝ fixed, the variance increases with g toward the HoC value. This is because for fixed k, the width of the distribution increases with g and therefore larger effects can be accessed. The stationary genetic variance has been computed numerically for various shape parameters whenĝ ¼ 0:063; g ¼ 0:1 and ℓ ¼ 1000 in de Vladar and Barton (2014). From (7), the variance for k = 1, 2, 10, and 100 is found to be 1.32, 1.57, 1.94, and 2, respectively, which agrees well with the numerical data in their Figure 5. To understand how the variance depends on the shape parameter, we first note that with increasing k (and fixed g), the width of the gamma distribution decreases. For large k, if g .ĝ, the variance saturates to the HoC variance since almost all loci have large effects with narrow distributions whereas in the opposite case, most effects are small and the variance tends to ℓ g 2 =2 (see Figure 1). For small k, irrespective of whether g is above or belowĝ, we find that most effects are small. To see this, consider the fraction f s ¼ 1 2 ðn l =ℓÞ of loci with small effects which is given by If k , ðg r Þ 21 , the aforementioned equation yields f s % ðkg r Þ k =k!. Then for finite g r , when k/0, we find that n s /ℓ for any g r , as claimed previously. To summarize, as shown in Figure 1, forĝ , g, the variance increases with k toward the HoC variance, whereas for g .
g, both c Ã 2 and n s are nonmonotonic functions of k. When the effects are chosen from an exponential distribution, the fraction f s ¼ 1 2 e 2g r . On eliminating g r in favor of f s in (7) for k ¼ 1, we find the relative contribution of loci with small effects to the total variance to be which increases as f s =3 for small f s and approaches unity as f s increases toward one. The aforementioned expression shows that if 10% of the effects are small, their contribution to the variance is merely 3%, which increases to 21% when f s is one half. To obtain an equal contribution from small and large effects, a disproportionately large fraction (~83%) of small effects is required. We are unable to obtain an analytical expression analogous to (9) for arbitrary k since f s is not a simple function of g r (see (8), shown previously). However, a numerical analysis using (7) and (8) shows that for the same value of f s , small effects contribute more to the total genetic variance as the distribution of effects gets narrower. For f s = 0.1, the relative contribution is found to be 2%, 3%, and 5% for k = 1/2, 1, and 2, respectively. To obtain an equal contribution from loci with small and large effects, f s = 0.89, 0.83, and 0.77 is needed for the shape parameters k = 1/2, 1, and 2, respectively.

DYNAMICS OF THE ALLELE FREQUENCY
We now turn to a description of the allele frequency dynamics and will consider the situation when the phenotypic optimum is suddenly shifted. As mentioned previously, due to the term Dc 1 on the RHS of (4), all the frequencies are coupled, which makes it hard to obtain an exact analytical solution of the allele frequency dynamics. However, under certain conditions, it is a good approximation to consider only the c 1 term in (4) for the initial dynamics and the rest of the terms for long-term evolution.
To see this, we first note that because 0 , p i , 1, the mean jc 1 ðtÞj , P i g i % gℓ. For independent and uniformly distributed initial frequencies, as the average initial frequency is one half, the leading order contribution (in ℓ) to the initial mean is zero. The initial variance is, however, nonzero which gives the typical initial mean When the phenotypic optimum z o & g ffi ffi ℓ p and the number of loci is large, the initial value jDc 1 ð0Þj=g i $ ffi ffi ℓ p ) 1. Thus at short times, we can neglect 2p i 2 1 (which is bounded above by one) and the mutation term in comparison to the term 2Dc 1 =g i in (4). At large enough crossover time t · , as explained in the following section, the mean deviation is close to zero and the reverse condition holds, i.e., 2jDc 1 ðtÞj=g i ( 2p i 2 1 in (4), and we may set Dc 1 % 0 for later evolution. Biologically, these considerations mean that initially the effects are weaker than the mean trait deviation, but as the population adapts due to directional selection, the deviation of the mean from the phenotypic optimum becomes smaller than the effects. The aforementioned argument applies not only to uniformly distributed initial frequencies but in more general settings as well where jc 1 ð0Þj $ gℓ by replacing ffi ffi ℓ p by ℓ. Here we will focus on the dynamics of the allele frequency when the optimum is suddenly shifted to a new value z f ð , ℓ gÞ, starting from the population which is equilibrated to a phenotypic optimum value z o . In this situation, as the initial frequency is close to one half when g i ,ĝ, the frequency 2p i ð0Þ 2 1 is obviously negligible compared with Dc 1 ð0Þ=g i , whereas for g i .ĝ, 2p i ð0Þ 2 1 is close to one because the initial frequency is close to either zero or one (de Vladar and Barton 2014).

When most effects are small
The effects at most of the loci can be smaller than the scaleĝ either if k is large and g ,ĝ, or if k is small. Then for most loci, at short times, the full model defined by (4) can be approximated by where the last equation for cumulants is obtained from the results of Bürger (1991). Equation (10b) for n ¼ 1 shows that the magnitude of the mean deviation decreases with time, and for the phenotypic optimum smaller than the maximum attainable value of the mean ðz o ( gℓÞ, the trait mean becomes close to the phenotypic optimum at large times (see Figure 2A). We now assume that the variance c 2 is independent of time and stays at its initial value c 2 ð0Þ (Chevin and Hospital 2008). As explained in Appendix A, this approximation is good when a combination of the initial cumulants is small (see also Figure 2B). This allows us to solve (10a) and (10b), and we immediately find that c 2 ð0Þ ð12e 2 c 2 ð0Þst Þ : Equation (11) shows that the mean deviation approaches zero over a time scale t · $ ðsc 2 ð0ÞÞ 21 .
Next we analyze the long-term evolution of the allele frequencies. As Figure 2A shows, there is a small but nonzero mean deviation Dc Ã 1 in the stationary state. Taking this into consideration and accounting for the other terms in (4), for t . t · , we can write For Dc Ã 1 ¼ 0, the aforementioned equation can be easily solved to give where m i ¼ ðĝ=g i Þ 2 and p i ðt · Þ is obtained from (12). We check that the stationary state solutions ð16 ffiffiffiffiffiffiffiffiffiffiffiffiffi 1 2 m i p Þ=2 and 1=2 are obtained from the above result for m i , 1 and . 1, respectively. Furthermore, the solution p ðþÞ i ðtÞ is obtained for p i ðt · Þ . 1=2 and p ð 2 Þ i ðtÞ for p i ðt · Þ , 1=2. Figure 2 and Figure 3 show a comparison between the numerical solution of (4) and the approximation described previously, when the initial condition is the stationary state of the population equilibrated to a phenotypic optimum z 0 . The initial mean deviation Dc 1 ð0Þ is seen to be close to 2 z f , and the initial variance c 2 ð0Þ ¼ c Ã 2 for the zero mean deviation is 0.0967, which is close to the value 0.131 obtained from the set of effects used in Figure 2 and Figure 3. As Figure 2 shows, the dynamics of the mean deviation are captured well by (11) and approach a stationary value close to zero ðDc Ã 1 % 2 0:016Þ in about 1500 generations. The variance also evolves with time, but the change is not substantial and the approximation c 2 ðtÞ % c 2 ð0Þ is good over the time scale directional selection toward the phenotypic Figure 1 Genetic variance in the stationary state as a function of the shape parameter k when the effects are distributed according to the gamma function. The plot shows the total genetic variance (solid), variance caused by small effects (small dashes), large effects (large dashes), and the fraction of small effects (dotted) for (a) g ¼ 0:04;ĝ ¼ 0:08 and (b) g ¼ 0:1;ĝ ¼ 0:05 for ℓ ¼ 1000. The asymptotic values ℓ g 2 ðk þ 1Þ=ð2kÞ whenĝ .
g and ℓĝ 2 =2 whenĝ , g are also shown (top solid curves). optimum operates. Equation (11) also indicates that directional selection toward the optimum will occur faster when the initial variance is large since t · $ 1=c Ã 2 . Because the stationary genetic variance displays a nonmonotonic dependence on the shape parameter k of the gamma distribution (see Figure 1), the relaxation time for the mean deviation is expected to decrease and then increase with increasing k. Indeed, as the inset of Figure 2A shows, the difference c 1 ðtÞ 2 c Ã 1 (which, by definition, is zero in the equilibrium state for all k) equals a reference value 20.05 at time 360, 340, and 370 for k = 1, 5, and 20, respectively. The allele frequency dynamics are shown in Figure 3. We see that although the short-term dynamics can be accurately described by (12) for loci with effects smaller than or close to the distribution mean, there is a substantial difference when the effects are larger than the mean. This is because for such loci, the initial frequency is not close to half and the term involving q i 2 p i on the RHS of (4) cannot be neglected. For t . t · , the long-term behavior described by (13) is shown with Dc Ã 1 ¼ 0 and the actual mean deviation.
When most effects are large When g .ĝ and k is large, the number of loci with large effects is also large, and the initial allele frequencies are close to either zero or one.
In this parameter regime, both the variance and the skewness may change appreciably during directional selection toward the optimum, and the constant-variance approximation discussed above is not suitable. However, at very short times when Dc 1 (t) is close to its initial value, the solution (12) for the allele frequency gives From the aforementioned equation, we first note that the allele frequency at large-effect loci changes fast as expected intuitively. Equation (16) also shows that for Dc 1 (0) , 0, the allele frequency quickly increases toward unity, if the initial frequency is close to unity and therefore does not contribute to the dynamics of the variance or skewness. Thus, to understand the short-term dynamics, we need to focus our attention on large-effect loci with low initial allele frequency for negative initial mean deviation. Similar remarks apply to the situation when Dc 1 (0) is positive where the large-effect loci with high initial frequency determine the dynamics. In the following, we assume that Dc 1 (0) , 0 and consider the time evolution of the allele frequency P of the largest effect locus with lowest initial frequency. Figure 4 shows that the allele frequency P sweeps to fixation, but the frequency of the next relevant locus (i.e., the next largest effect locus with low initial frequency) does not. In such a case, we can approximate the mean c 1 and variance c 2 by the contribution from the frequency P with effect G, and obtain where P 0 [Pð0Þ. Then using the aforementioned expression for the phenotypic mean in (4) and neglecting mutations (since most effects are large), we get We thus find that the allele frequency P is a solution of the following equation: An explicit solution of (20) seems hard to obtain since a is in general not an integer. However, for large and negative a, the aforementioned equation yields Thus for z f ) G, the frequency P sweeps to fixation in a time of order ðsGz f Þ 21 . Figure 4 shows the allele frequency of the largest effect locus with lowest initial frequency obtained using (4). It agrees reasonably well with the solution of (20) and the expression (21) where a = 21.43. In Figure 5, the dynamics of the first two cumulants given by (1a) and (1b) are compared with the approximate expressions (17a) and (17b), respectively, and we see a good agreement.
A detailed numerical analysis of the set of parameter values of Figure 4 suggests that the dynamics of this example can be understood by considering one, two, or three of the largest-effect loci. When only the largest-effect locus was required, the second largest effect was much lower than the largest one.  (14) for t . t · where t · ¼ 1500. The dashed curve for t . t · is the solution of (13) with Dc Ã 1 ¼ 2 0:016. The other parameter values are the same as in Figure 2.

DISCUSSION
One of the fundamental questions in adaptation is whether the adaptive process is governed by many loci of small effect or few loci of large effect (Orr, 2005). However, which effects are small, and which large? de Vladar and Barton (2014) have provided a scaleĝ $ ffiffiffiffiffiffiffi ffi m=s p for the size of effects, which is a function of basic population genetic parameters, namely mutation probability m and selection coefficient s, relative to which an effect is defined as large or small. For a given distribution of effects, assumed here to be a gamma distribution with mean g and shape parameter k, an effect is small (large) if it is below (above)ĝ. But for fixed g andĝ, whether most or a few effects are small depends on the shape parameter k: for small k, most effects are small, but for large k, the number of small effects depends on the ratio g r ¼ĝ= g.

Genetic variance in stationary state
Here we have provided analytical expressions for the stationary genetic variance c Ã 2 when the effects are locus-dependent. We find that when most effects are small, c Ã 2 is a nonmonotonic function of the shape parameter of the gamma distribution (going through a maximum for intermediate values of k; see Figure 1). In contrast, it increases monotonically when most effects are large. As Figure 1 shows, when the shape distribution is narrow, large (small) effects contribute most to the variance when g r . 1ð , 1Þ. However, for broad distributions, although the number of small effects is large, small effects do not contribute much when g r , 1. The HoC variance is obtained irrespective of k when g r /0 because all loci have large effect in this limit. As noted previously (de Vladar and Barton, 2014), HoC provides an upper bound on the genetic variance.
Dynamics when most effects are small As the distribution of quantitative trait loci measured in experimental and natural populations (Hayes and Goddard, 2001;Mackay, 2004;Visscher, 2008;Goddard and Hayes, 2009) find most effects to be small, it is important to study this situation in detail. Here we have obtained analytical expressions for the dynamics by assuming the genetic variance to be constant. Although the fact that the variance does not change much in time when most effects are small was observed numerically in de Vladar and Barton (2014), an explanation of this behavior was not provided. Here, as explained in Appendix A, it is a good approximation to assume the variance to be time-independent provided the product of the initial values of the mean deviation and skewness is small.
In the absence of mutations, Chevin and Hospital (2008) have considered the effect of background with a time-independent genetic variance on the frequency at a single focal locus. Their results match the ones obtained here using the short-term dynamics model with directional selection only; in particular, (11) and (12) match the results (21) and (25) of Chevin and Hospital (2008), respectively, on identifying their parameters v 2 and a with 1=s and g from this study.
Our basic result concerning the dynamics of the phenotypic mean is that it relaxes over a time scale that is inversely proportional to the initial variance. Because the variance is of order ℓ, we thus have the important result that the mean approaches the optimum faster if a larger number of loci is involved. Moreover, this time depends nonmonotonically on the shape parameter of the gamma distribution. Note that the phenotypic mean deviation relaxes to zero when the phenotypic optimum is far below the upper bound gℓ on the phenotypic mean. However, when the phenotypic optimum exceeds the maximum typical value of the mean, such that the mean deviation remains substantially different from zero at late times, (10b) shows that all higher cumulants vanish at the end of the phase of directional selection.
Dynamics when most effects are large When the initial mean deviation is moderately large (and negative), the genetic variance changes by a large amount over the time scale directional selection occurs and the dynamics can be understood by considering a few loci whose effect is large but initial frequency is low.
However, for larger mean deviations (but smaller than ℓ g), a few large-effect loci do not completely capture the dynamics of the mean and the variance. As Supporting Information, Figure S1 shows, the initial increase of the absolute mean deviation and the transient rise of the variance can be explained by considering the large-effect locus. At later times, however, as the change in variance is small, we can use the constant-variance approximation to understand the dynamics of the phenotypic mean deviation until it nearly vanishes. The constantvariance approximation also can be used when the initial mean deviation is sufficiently small (see Figure S2).  (21) (small dashes) for the dynamics of the allele frequency P with the largest effect and lowest initial frequency ðG ¼ 0:776; P 0 ¼ 3:3 · 10 24 Þ. The solid curve at the bottom shows the numerical solution of the full model for the frequency of the next relevant locus with effect size 0.319 and initial frequency 1:9 · 10 23 . The parameters are ℓ ¼ 20; s ¼ 0:1; m ¼ 10 25 ;ĝ % 0:028 ( g ¼ 0:2; z o ¼ 7:8 · 10 25 ; z f ¼ 1:5; n l ¼ 19.

Applications
The approximations presented here hold for the short-term evolution of phenotypic traits and allele frequencies. This means that our results may be used to understand polygenic selection driving rapid adaptation. In this respect, our most important result is that the mean of a phenotypic trait may respond faster to a sudden environmental change when the number of loci is large and most effects are small.
Evidence for rapid phenotypic evolution has been reported in recent years from several groups of organisms. For instance in Drosophila subobscura, latitudinal clines of wing size have been formed within 20 years since this species colonized America (Huey et al., 2000). Similarly, in field experiments in which lizard populations were newly established on small islands in the Bahamas, the hindlimbs adapted very quickly to the different vegetations on the islands (Kolbe et al., 2012). To our knowledge, however, data from GWAS are not yet available in these cases.
The theory presented here can also be applied to the large amounts of GWAS data that have been gathered in model species such as humans and Drosophila. To analyze the observed allele frequency shifts in single-nucleotide polymorphisms associated with quantitative traits, such as human height (Turchin et al. 2012) and cold tolerance in Drosophila (Huang et al. 2012), the results derived in this study provide a more general theoretical basis than the dynamical equations used in previous analyses (e.g., Turchin et al. 2012).

Open questions
The analytical calculations in this article work when the phenotypic mean at the equilibrium coincides exactly with the optimum. However, in the stationary state, there is a small but nonzero mean deviation due to which the long-term dynamics are not accurately captured, especially when effects are small, as shown in Figure 3B. An improved calculation of the dynamics that takes a nonzero mean deviation into account is certainly of interest, for instance to estimate the frequency of selective fixations (leading to selective sweeps) in this model (Chevin and Hospital 2008;Pavlidis et al. 2012;Wollstein and Stephan 2014).
Another open question concerns the generality of our results presented here. The current model perhaps oversimplifies biological reality in that it neglects genetic drift and assumes additive effects, symmetric mutations and free recombination between loci. It can be shown that the current model (neglecting mutation) can be derived from the classical symmetric viability model with arbitrary position of Figure 5 Response to change in optimum when most effects are large. Solid lines show the mean deviation (A) and variance (B), whereas the large dashed curves show the contribution to these cumulants from the locus with the largest effect and lowest initial frequency ðG ¼ 0:776; P 0 ¼ 3:3 · 10 24 Þ. In both cases, the exact numerical solution of the full model is used. The numerical solution of (18) (small dashes) is also shown. The other parameter values are the same as in Figure 4.
the optimum under the quasi-linkage equilibrium assumption. In the latter model, the probability of selective fixation has been studied numerically for up to eight loci (Pavlidis et al. 2012;Wollstein and Stephan 2014). However, at present we are lacking an analytical understanding of the role of recombination in this model and how it relates to the high-recombination limit represented by our current model.

ACKNOWLEDGMENTS
We thank two anonymous referees for insightful comments on a previous version of this manuscript and A. Wollstein for technical support. K.J. was supported by a fellowship from the Center of Advanced Studies of the LMU Munich. W.S. was funded by grants 325/14-1 (Priority Program 1590) and 325/17-1 (Priority Program 1819) of the Deutsche Forschungsgemeinschaft.

LITERATURE CITED APPENDIX
A Validity of Constant-Variance Approximation As explained in the main text, we obtain (11) when the variance is assumed to be constant in time. To see under what conditions this approximation holds, we find a correction to the variance by assuming that the skewness c 3 is nonzero and time-independent (see the inset of Figure 2B). Using (11) on the RHS of (10b) for n ¼ 2, we immediately get c 2 ðtÞ ¼ c 2 ð0Þ h 1 2 C À 1 2 e 2c 2 ð0Þst Á i ; (A.1) where C ¼ Dc 1 ð0Þc 3 ð0Þ=c 2 2 ð0Þ. Plugging the above solution into (10b) for n ¼ 1 gives the mean deviation as The solution (11) is recovered if the constant C, which depends on the initial value of the first three cumulants, is negligible. If we start with the initial condition in which the population is equilibrated to an optimum and most effects are small, since most initial allele frequencies are close to one half, the initial variance is substantial and the skewness is close to zero. In this case, the constant-variance approximation is expected to work well. However, if most effects are large, since most initial allele frequencies are close to fixation, although the skewness remains small, the variance also becomes small, thus leading to an increase in C. Then for sufficiently small mean deviations, we also may employ the constant-variance approximation when most effects are large.