## Abstract

During his well-known debate with Fisher regarding the phenotypic dataset of *Panaxia dominula*, Wright suggested fluctuating selection as a potential explanation for the observed change in allele frequencies. This model has since been invoked in a number of analyses, with the focus of discussion centering mainly on random or oscillatory fluctuations of selection intensities. Here, we present a novel method to consider nonrandom changes in selection intensities using Wright-Fisher approximate Bayesian (ABC)-based approaches, in order to detect and evaluate a change in selection strength from time-sampled data. This novel method jointly estimates the position of a change point as well as the strength of both corresponding selection coefficients (and dominance for diploid cases) from the allele trajectory. The simulation studies of this method reveal the combinations of parameter ranges and input values that optimize performance, thus indicating optimal experimental design strategies. We apply this approach to both the historical dataset of *P. dominula* in order to shed light on this historical debate, as well as to whole-genome time-serial data from influenza virus in order to identify sites with changing selection intensities in response to drug treatment.

- fluctuating selection
- change point analysis
- time-sampled data
- approximate Bayesian computation
- Wright-Fisher model

The common assumption of constant selection intensity through time utilized in many tests of selection is often criticized as unrealistic in natural and experimental populations, both owing to environmental changes (*e.g.*, fluctuations in climate, predator density, or nutrient availability) as well as to genetic changes (*e.g.*, epistasis, clonal interference). Despite this, such considerations are not accounted for in most population genetic models, since inferring changing selection coefficients (*s*) from single-time point polymorphism data are difficult. However, owing to recent technological advances, time-sampled polymorphism data are increasingly available, and time-serial analytical methods are expanding (Malaspinas *et al.* 2012; Mathieson and McVean 2013; Foll *et al.* 2014; Lacerda and Seoighe 2014; and see review of Bank *et al.* 2014), allowing for an empirical evaluation of the importance of changing *s* models.

Fluctuating selection in natural populations was suggested by Wright (1948) with regards to the phenotypic time-serial data of *Panaxia dominula* (scarlet tiger moth) to account for its observed annual fluctuations in coloring polymorphism (Fisher and Ford 1947). Since, there have been several theoretical considerations of fluctuating selection (Kimura 1954; Karlin and Levikson 1974; Karlin and Lieberman 1974; Chevin 2013; Gossmann *et al.* 2014; Gompert 2015), as well as many observations of fluctuating selection in natural populations (for a review, see Bell 2010). Nonetheless, until recently, analyses of fluctuating selection centered on *random* or *seasonal* oscillations of selection strength through time, as the mathematical complexity of analytical methods only allowed the simplest cases to be considered.

Approximate Bayesian Computation (ABC) has the advantage of being flexible in integrating complex models due to computational efficiency and the lack of likelihood computation (Beaumont 2010). Recently, a hierarchical ABC-based method based on the Wright-Fisher model was developed in order to infer genome-wide effective population size and per-site selection coefficients from whole-genome multiple-time point datasets (Foll *et al.* 2014, 2015). While the initial approach performs well overall, the authors noted the possibility for observations inconsistent with a single-*s* Wright-Fisher model; this was indeed observed at certain sites in their analysis of the influenza virus genome, which were simply excluded from consideration. Thus, as a natural extension, we here investigate the presence of changing selection intensities in these outlier single nucleotide polymorphisms (SNPs); in doing so, we also develop an extended Wright-Fisher ABC-based method capable of detecting and quantifying changing selection intensities through time (termed CP-WFABC).

## Methods

### Wright-Fisher ABC-based method

This approach first relies on a previously developed Wright-Fisher ABC method (WFABC - Foll *et al.* 2014, 2015) in order to estimate a constant effective population size (*N _{e}*). WFABC infers an effective population size from time-serial datasets through the average fluctuation of neutral trajectories over time (following Wright 1948). If whole-genome multi-time point datasets are available, the estimated value is referred to as the genome-wide effective population size. The posterior of the

*N*estimated from WFABC is used as a prior for the following extended method. The trajectory

_{e}*X*of a given allele with a known

*N*consists of time-serial allele frequencies f

_{e}*(*

_{t}*t*= 1,…,

*T*) where

*T*is the total number of generations (with

*T*> 4 to allow for a change point to be realizable with the Wright-Fisher model), from which a sample

*n*is taken at sampling time points

_{m}*m*= 1,…,

*M*(with

*M*> 4 to allow for a change point to be detectable). Parameters to be inferred include the selection coefficient prior to the change in selection intensity (

*s*), the selection coefficient subsequent to the change in selection intensity (

_{1}*s*), the change point in selection (

_{2}*CP*), and the dominance coefficient (

*h*) for diploid models. The joint posterior distribution of these parameters can be estimated by(1)(2)for the diploid model and the haploid model, respectively. The ABC approach allows these parameters to be inferred using Wright-Fisher model simulations without calculating the likelihood or

_{.}

The Wright-Fisher model simulator with a change point in selection strength is used to simulate the data *X*, with relative fitnesses *w _{AA} =* 1

*+ s*,

*w*1

_{Aa}=*+ sh*and

*w*1 for the diploid model, and

_{aa}=*w*and

_{A}= 1 + s*w*for the haploid model (Ewens 2004). Initially, the random sampling of an allele from generation

_{a}= 1*1*to generation

*CP-1*is simulated using

*s*, and onwards from the change point (

_{1}*CP*) using

*s*. In order to simulate realistic allele trajectories with changing selection coefficients, the allele needs to be segregating at the time of the change point. This condition is necessary since the change in selection coefficient cannot occur if the allele is either lost or fixed beforehand, assuming the infinite-site model with no back mutations. Thus, only alleles segregating at the change point are accepted as simulated datasets.

_{2}The associated summary statistic for these time-serial data are *Fs*’, an unbiased estimator of *N _{e}* based on the allele frequency change between two sampling time points that remains unbiased even in the presence of highly skewed allele frequencies and small sample size (Jorde and Ryman 2007), an important property for inferring change points along the allele trajectory. It is given as:(3)(4)where

*f*and

_{m}*f*are the allele frequencies at two consecutive sampling time points,

_{m+1}*m*and

*m*+1 (separated by

*t*generations),

_{i}*f*, and is the harmonic mean of the chromosome sample sizes

_{i}= (f_{m}+ f_{m+1}) / 2*n*and

_{m}*n*at two consecutive time points. Unlike the WFABC approach that summarizes time-serial trajectories into only two summary statistics (increasing and decreasing

_{m+1}*Fs*’; Foll

*et al.*2014), here

*Fs*’ is summarized at every pair of consecutive time points as

*Fs’*, where

_{i}with i = 1,…,M−1*M*is the number of sampling time points. This modification allows additional information, such as the timing of increase or decrease in allele frequency, to be captured. In order to retain information about directionality that is lost by using

*Fs*’ as a summary statistic in the original WFABC approach, increasing allele frequencies are made positive and decreasing allele frequencies are made negative with regards to the absolute value.

The joint posterior distribution of the parameters of interest is obtained using the algorithm described in Beaumont *et al.* (2002). The approximate posterior density(5)with U(X) as data-specific summary statistics, θ = (*s _{1}*,

*s*,

_{2}*CP*,

*h*) for the diploid model and θ = (

*s*,

_{1}*s*,

_{2}*CP*) for the haploid model, is obtained using an ABC algorithm as follows:

i. Simulate

*K*trajectories from the Wright-Fisher model with a change in selection intensity, with θ randomly sampled from its prior*P(*θ*)*, conditional on the allele segregating at the change point.ii. Compute U

*(x*for each trajectory using the_{k})*Fs’*summary statistic between all consecutive sampling time points_{i}*i*: U*(x*where_{k,i})*i*= 1,…,*M−1*where*M*is the last sampling time point.iii. Retain the 1000 best simulations with the smallest Euclidian distance between U

*(x*(from the simulated) and U_{k,i})*(X*(from the observed) to obtain an approximate posterior density of P_{i})*(*θ*|X)*.

For the first step, simulations are performed with the same initial conditions as the observed data, including effective population size, initial allele frequency, and the sampling points and sizes. In addition, a minimum allele frequency in one of the sampling time points is imposed on simulated trajectories as is done in observed data. This ascertainment scheme takes into account the nonrandom criterion of considering only the trajectories reaching values above the sequencing error threshold in the observed data (Foll *et al.* 2015).

For the second step, it is important to note that the *Fs*’ summary statistic is calculated between every pair of consecutive sampling time points (Figure 1). This construction of the summary statistic enables information on both the timing and strength of the allele frequency change to be captured, as the timing of the change is essential in detecting the change point and the strength of the change is essential in estimating the corresponding selection coefficients. For the diploid model, an additional parameter *h* is inferred jointly with the other three parameters, as its value is one of the determining factors in the timing of allele frequency change (Haldane 1932).

For the third step, the simulated *Fs*’ summary statistics U*(x _{k,i})* between every pair of consecutive sampling time points are compared with the corresponding observed

*Fs*’ summary statistics U

*(X*– allowing a small fraction of the simulated trajectories with allele frequency changes that best match the observed trajectory (in terms of both timing and strength) to be retained.

_{i})### Wright-Fisher ABC-based method with change point analysis

In order to increase computational efficiency and sensitivity in change point detection, an additional summary statistic is integrated into the Wright-Fisher ABC-based method. This novel summary statistic is derived from change point analysis – statistical techniques developed and used in many disciplines ranging from finance to quality control in order to detect and estimate change (*e.g.*, Chen and Gupta 2001). Among the techniques available, the cumulative sum control chart (CUSUM) developed by Page (1954) is able to detect small and sustained shifts in the statistics β obtained from a sample (Ryan 2011). Instead of using the entire CUSUM procedure as a separate method for detecting change, the CUSUM value is integrated into the Wright-Fisher ABC-based method as an additional summary statistic that characterizes the time-sampled trajectory of an allele:(6)where and . The CUSUM value *S* is accumulated only when the statistic β is different from its average value in the dataset.

The change point *S _{CP}* is the sampling time point with the maximal absolute value of

*S*, which is the furthest point from the initial value zero attaining the maximal accumulation of difference from the average value:(7)Here, we take

_{i}*Fs*’ at each pair of consecutive sampling time points as the statistic β, since it is a time-serial measure of the allele frequency change, which is indicative of the selection strength change. Thus, when

*Fs*’ is used as the statistic β in the CUSUM, the maximal CUSUM value

*S*is the potential change point of the allele trajectory, as illustrated with an example in Figure 1.

_{CP}In the change point Wright-Fisher ABC (CP-WFABC), an additional summary statistic *S _{CP}* is used to characterize observed and simulated allele frequency trajectories for detecting a change point. In the third step of the ABC algorithm, the Euclidean distance between U

*(x*and U

_{k,i})*(X*is calculated only if the maximal CUSUM value

_{i})*S*of the simulated data matches the maximal CUSUM value

_{CP,k}*S*of the observed data:(8)This additional step allows the computation to be more efficient – especially when there is a large number of time points sampled – as the Euclidean distance is calculated for a fraction of simulated trajectories whose maximal CUSUM value is equal to that of the observed (

_{CP}*i.e.*, with the same time-sampled characteristic). Furthermore, as the CUSUM is sensitive to small and sustained changes, integrating the CUSUM into the Wright-Fisher ABC method increases its sensitivity for detecting small and sustained changes in selection strength. The potential bias in the calculation of the maximal CUSUM value is counteracted by the fact that the bias would be present in both the observed and the simulated trajectories.

### Simulated data with constant selection and with changing selection

We generated simulated datasets of different effective population sizes using the Wright-Fisher model for two scenarios: (1) trajectories of constant selection with only *s* and *h* (for diploid models) as parameters, and (2) trajectories of changing selection with *s1*, *s2*, *CP*, and *h* (for diploid models) as parameters. For selection coefficients, uniform priors of (−1, 1) were used. The uniform prior of *CP* was set to occur between the second generation and the second-to-last generation (2, *T−*1), where *T* is the number of generations of the population in the time-serial data. The dominance coefficient *h* for the diploid model was randomly drawn from one of three values: complete recessiveness, codominance, or complete dominance [0, 0.5, 1]. Although these prior ranges are uninformative, the constraint on the trajectories to be segregating at the change point shapes the distribution of the prior ranges according to the input parameters such as ploidy, effective population size, initial allele frequency, and number of generations; the updated priors for the haploid population of *N _{e}* = 100 and the diploid population of

*N*= 50 are shown as examples (Supplemental Material, Figure S1 and Figure S2).

_{e}The other input values – such as the number of generations (*T* = 100), the sampling time points (*M* = 10), the sample size (*n* = 100), the initial allele arising as a new mutation, and the ascertainment of observing a minimum frequency at 2% – were kept constant for the two scenarios. We retained the best 0.1% of 1,000,000 simulations for each pseudoobservable trajectory using the rejection algorithm based on the Euclidean distance as described above. The mode of the posterior distribution from the best simulations (Sunnåker *et al.* 2013) was used to evaluate the estimated parameter value against the true parameter value.

### Data availability

The R package of CP-WFABC is available on jensenlab.epfl.ch.

## Results

### Model choice in the change point Wright-Fisher ABC method

The first step of CP-WFABC is to be able to distinguish changing selection trajectories from constant selection trajectories. ABC model choice was constructed to choose between two models: M_{0} with a single selection coefficient, and M_{1} with two selection coefficients and a change point. The relative probability of M_{1} over M_{0} can be computed through the model posterior ratio as the Bayes factor *B _{1,0}* (Sunnåker

*et al.*2013):(9)when the model prior

*p*(M

_{0}) is equal to

*p*(M

_{1}). In practice, the model priors are made equal by producing the same number of simulations for each model and retaining the best simulations from the lot. The posterior ratio is computed as the number of accepted simulations from M

_{1}over those of M

_{0}, giving the Bayes factor

*B*,which is an indicator of the support for a specific model. The performance study was conducted with a haploid population of

_{1,0}*N*= (100, 1000, or 10,000) and a diploid population with

_{e}*N*= (50, 500, or 5000) using the simulated datasets of the two scenarios described in the previous section as M

_{e}_{0}and M

_{1}, respectively.

We considered two cases for the pseudoobservables to test the sensitivity and specificity of the ABC model choice: the first case when the pseudoobserved trajectories have a single selection coefficient, and the second case when they have changing selection coefficients. One thousand pseudoobservable trajectories were generated for each case with the data ascertainment minimum frequency set to 2% for at least one of the sampling time points. Additionally, for the second case, pseudoobservable trajectories were accepted only when the allele was segregating at the time of the change point – a constraint for realistic combinations of selection coefficients, change points, and dominance (for diploids) – in order to reproduce changing selection trajectories in real datasets. All other input values were kept constant as in the simulated datasets described in the previous section.

The results of the ABC model choice from a haploid population with *N _{e}* = 100, and a diploid population with

*N*= 500, are represented as ROC curves (Xavier

_{e}*et al.*2011) in Figure 2. Specificity is given on the

*x*-axis showing the true negative rate, while sensitivity is given on the

*y*-axis showing the true positive rate of the Bayes factor

*B*calculated from 1000 pseudoobservables of changing selection (where

_{1,0}*B*should be large) and 1000 pseudoobservables of constant selection (where

_{1,0}*B*should be small). The overall ROC curves in black (all trajectories) show that when the specificity threshold is most conservative in detecting no false positives (

_{1,0}*i.e.*,

*B*= infinite), the Bayes factor

_{1,0}*B*has a sensitivity of ∼30% for all populations. Considering that the pseudoobservable trajectories were simulated randomly from a wide range of prior values, the Bayes factor

_{1,0}*B*from CP-WFABC is, in general, sensitive and specific. The ROC curves in black for the other haploid and diploid populations (Figure S3) also indicate that the Bayes factor

_{1,0}*B*is sensitive and specific as they are above the diagonal line of no-discrimination. The area under the ROC curve (AUC) is used to assess how reflective the Bayes factor

_{1,0}*B*is of the true model, as summarized for all pseudoobservable populations in Table 1. The AUC values show that the Bayes factor

_{1,0}*B*is ∼80% more probable to rank a randomly chosen changing selection case above a randomly chosen constant selection case. Additionally, the distribution of Bayes factors

_{1,0}*B*under the null model M

_{1,0}_{0}(

*i.e.*, case 1) was used to compute the significance level α at 1% (Good 1992). For both diploids and haploids, the significance threshold is higher for smaller population sizes (Table 2), and the calculation of these thresholds will be important in any given data application.

Following the detection of changing selection trajectories using ABC model choice, the quality of parameter estimation by the model chosen was evaluated. The cross-validation results from the haploid population of *N _{e}* = 100 are shown in Figure 3 and those from the diploid population of

*N*= 500 in Figure 4 (see Figure S4, Figure S5, Figure S6, and Figure S7 for additional results). For the case where the pseudoobservables were of constant selection, the estimation for a single

_{e}*s*(and the dominance

*h*for diploid) using CP-WFABC is very accurate, as the mode of the best simulations from the M

_{0}model for each pseudoobservable lies along the red diagonal line. Exceptions include uninformative trajectories where the allele surpasses the minimum frequency of 2% in the first sampling and is lost immediately due to genetic drift or negative selection and therefore not observed in subsequent samplings. Such uninformative trajectories will always keep the same set of best simulations from the M

_{0}model since their selection strength is indistinguishable, and they result in horizontal lines along the estimated negative value. This phenomenon is particularly pronounced when population size is small as shown in Figure 3 and Figure S6, since the role of genetic drift is more significant.

For the second case, when the pseudoobservables are of changing selection intensity, the joint estimation of the parameters is also effective for a restricted range of values. In Figure 3, Figure 4, Figure S4, Figure S5, Figure S6, and Figure S7, the mode estimation of each pseudoobservable is color-coded according to the three categories of trajectory shape. The green dots are pseudoobservable trajectories that change from positive *s _{1}* to positive

*s*. The blue dots are those that change from positive

_{2}*s*to negative

_{1}*s*, while the magenta colors include all other cases (

_{2}*e.g.*, neutral or negative

*s*to any value of

_{1}*s*). There is a clear clustering by category – with the best estimation being of positive values of

_{2}*s*below 0.5, moderate values of

_{1}*s*between −0.5 and 0.5, and CP values for the blue category of positive

_{2}*s*to negative

_{1}*s*. In trajectories other than those with positive

_{2}*s*to negative

_{1}*s*, the change point is difficult to detect, particularly for diploid populations where the additional dominance parameter

_{2}*h*was estimated (Figure 4, Figure S6, and Figure S7). This trend is also observed when the ROC curves are generated according to these three categories (Figure 2 and Figure S3). For all populations, the Bayes factor

*B*is more sensitive and specific for trajectories changing from positive

_{1,0}*s*to negative

_{1}*s*(ROC curves in blue), reaching above 60% of the true positive rate when there are no false positives. Despite the restricted range of good parameter estimation in

_{2}*s*,

_{1}*s*, and

_{2}*CP*, the estimation of dominance is robust for both cases of constant and changing selection (except for the small population size of

*N*= 50; Figure S6).

_{e}In order to evaluate the performance of the joint parameter estimation, the coefficient of determination *R ^{2}* is used to assess the cross-validation between the estimated values (

*y*) and the true values (

_{m}*f*), compared with the simple average of the estimated values (). The closer the

_{m}*R*value is to 1, the better the parameter estimation as shown:(10)Table 3 and Table 4 summarize the performance of the joint parameter estimation for all cases as the

^{2}*R*values for the haploid and diploid populations, respectively. The first case is when the pseudoobservables are of constant selection intensity, in which case the true model (M

^{2}_{0}) performs only slightly better than the false model (M

_{1}) for estimating

*s*. This discrepancy in parameter estimation of M

_{0}is mainly owing to uninformative pseudoobservable trajectories with constant selection (which have been lost or fixed) being associated with the true model (M

_{0}) of constant selection; this is due to the constraint for the allele to be segregating at the change point in the (false) model M

_{1}of changing selection. In the cross-validation of the constant selection case, the parameters estimated form horizontal lines at negative estimated values for those trajectories that are lost, and cluster at the top right corner for those trajectories that are fixed (Figure 3, Figure 4, Figure S4, Figure S5, Figure S6, and Figure S7).

For the second case in which the pseudoobservables have changing selection coefficients, parameter estimation from the true model (M_{1}) performs better than that from the false model (M_{0}) for all parameters – particularly when population sizes are large. As expected, there is a trend of better parameter estimation as population size increases (though see Figure S4 and Figure S5). Additionally, it has been shown that the value of the Bayes factor *B _{1,0}* is a good indicator of the parameter estimation performance (results not shown).

## Data application

### Historical dataset of P. dominula

A long-running dataset based on the *medionigra* morph responsible for darker wing color in wild populations of *P. dominula* (Figure S8) began in 1939 with collections by Fisher and Ford (1947) and continued through to 1999 (Cook and Jones 1996; Jones 2000). Despite this phenotypic time-serial data having been analyzed previously from various angles (O’Hara 2005; Mathieson and McVean 2013; Foll *et al.* 2015), it is still relevant to consider a model of changing selection in time, as Wright (1948) originally suggested.

The recent reconsiderations of the dataset tend to favor a lethal-recessive model with an effective population of *N _{e}* = 500 (Mathieson and McVean 2013; Foll

*et al.*2015) – however, the biological question of how the

*medionigra*morph could have reached the initial frequency of 11% in the dataset remains unanswered with this conclusion of constant strong negative selection. Wright asserted that the trajectory of the

*medionigra*morph during this period could be explained by fluctuating selection with “no net selective advantage or disadvantage.” Although this alternative hypothesis has been considered as a random fluctuation of selection by estimating selection coefficients between every sampling time point (see O’Hara 2005), the quantitative plausibility of a directional change-in-

*s*model over a single-

*s*model lacks thorough investigation. Thus, we reanalyzed this dataset using the CP-WFABC method in order to investigate the possibility of changing selection in the

*medionigra*morph during the 60-yr data collection.

Using the ABC model choice introduced here as a test for a change in selection strength, and to estimate the parameters of interest for the chosen model, we assume the *medionigra* allele is a single codominant locus responsible for the homozygous and heterozygous expressions of the phenotypic forms *bimacula* and *medionigra*, respectively (Cook and Jones 1996). The model M_{0} assumes a single selection coefficient, thus the only parameter to estimate is *s*. The M_{1} model assumes a change in selection strength, thus the parameters of interest are *s1*, *s2*, and *CP*. Both M_{0} and M_{1} take the prior range of (−1, 1) for the selection coefficients and the prior range of (2, 59) for the change point in the M_{1} model. For the M_{1} model, these uninformative priors are updated with the constraint that the allele must be segregating at the time of change point. Here, we create 10^{7} simulated datasets for each M_{0} and M_{1}, and apply the rejection algorithm of the CP-WFABC method to retain the best 1000 simulations compared with the observed trajectory. The effective population size is assumed to be *N _{e}* = 500 as in previous studies (Wright 1948; Cook and Jones 1996; O’Hara 2005), with an initial allele frequency of 11% and a minimum frequency ascertainment of 2%.

The Bayes factor for M_{1} over M_{0} is calculated as 0.952, indicating that the single coefficient M_{0} model cannot be rejected in favor of the changing selection M_{1} model (Table 4). From the parameter estimation of the model M_{0} (Figure S9), the mode of the posterior distribution for *s* is given as −0.15 as asserted by Fisher and Ford (1947). When the ABC model choice was repeated with a smaller population size of *N _{e}* = 50 as suggested by Wright (1948) and O’Hara (2005), the Bayes factor increases to 1.87 (

*i.e.*, the changing selection model is twice as likely as the constant selection) – however, this value is not large enough to be significant for a diploid population of

*N*= 50 (Table 4).

_{e}### Experimental evolution of influenza virus with drug treatment

The evolution of pathogens within a host is one of the most important cases in which the possibility of fluctuating selection must be considered – as they may experience drastically changing selective pressures due to host immune response, specific drug treatments, and/or pathogenic cooperation or competition (Tanaka and Valckenborgh 2011; Hall *et al.* 2011). Thus, how these pathogens adapt to these rapid external and internal changes is of major concern to the biomedical community.

The time-serial experimental dataset of influenza A conducted by Renzette *et al.* (2014) and Foll *et al.* (2014) is an interesting case study on the impact of drug treatment on influenza virus evolution. The dataset consists of 13 sampling points from which population-level whole-genome data were collected. Drug treatment with a commonly used neuraminidase inhibitor (oseltamivir) began after the collection of the third sample and continued, at increasing concentrations, until the final passage. Using WFABC, the genome-wide effective population size across the sampling time points was estimated (*N _{e}* = 176) and the SNPs under selection were identified.

Here, we apply the CP-WFABC method on two cases of interest from this study to consider a possible change in selection strength under drug treatment: the first case includes trajectories identified as being driven by positive selection, while the second includes outlier trajectories (*i.e.*, trajectories not fitting a single *s* Wright-Fisher model). For all cases, we test the model M_{0} (*i.e.*, a single selection coefficient) and M_{1} (*i.e.*, a changing selection coefficient), with parameters of interest (*s*) and (*s1*, *s2*, *CP*), respectively. The number of generations per passage is assumed to be 13 (following Foll *et al.* 2014), and the minimum frequency of 2% is set as an ascertainment for observing the minor allele in the data. *De novo* mutations are assumed to occur at the sampling time point where the SNP was first sampled (except for trajectories whose initial sample frequency was first sampled above 1/*N _{e}*, which are assumed to be standing variation). This assumption is based on the high mutation rate, large population bottlenecks associated with passaging, and large census population size between passages. A total of 10

^{7}datasets were simulated for each M

_{0}and M

_{1}; the best 1000 trajectories from the lot were retained using the rejection algorithm described in the

*Methods*section. The uniform prior ranges for the selection coefficients were set as (−1, 1), for the change point as (

*t*, 157) where

_{0}*t*is the generation when the SNP was first sampled, and with the constraint of segregating alleles at the change point for M

_{0}_{1}.

The results for the Bayes factors and the parameter estimates are summarized in Table 5 for all trajectories of interest. The Bayes factors of most trajectories show strong support for the changing selection model: the stronger the selection strength change, the larger the Bayes factor. Using the Bayes factors from the simulation studies as guidance (Table 4), the significance threshold to reject M_{0} is computed as 3.7 for a small haploid population. As expected, the trajectories identified as outliers of the single *s* Wright-Fisher model (NP 159, PB1 33) all reject the constant selection model M_{0} with a large Bayes factor. We also note that the Bayes factor for the drug-resistant mutation H275Y (NA 823) does not support the changing selection model strongly, confirming that the experimental evolution procedure kept the selective pressure of the drug constant by adjusting the drug concentration to reduce viral plaque numbers to 50% at each passage. As shown in Figure 5, the change points are estimated to be mostly between the seventh and eighth passages, a notable result since three of these trajectories (HA 48, HA 1395, NA 582) are increasing rapidly after the drug-resistant mutation H275Y appears, whereas one trajectory (NP 159) from a different segment decreases rapidly. This result may indicate that positive selective for the three SNPs (including HA 1395; a known compensatory mutation encoded also as D112N) increased along with the drug-resistant mutation H275Y, potentially due to epistatic interactions, whereas another SNP (NP 159) decreased at that time, potentially owing to clonal interference. The single selection estimates from WFABC (Foll *et al.* 2015) are similar to the estimates of CP-WFABC only when the Bayes factor does not reject M_{0} – strong evidence that an alternative model of changing selection must be considered for some trajectories in order to correctly estimate selection coefficients.

We also applied CP-WFABC to the control case of SNPs increasing in frequency without drug as a comparison to the case with drug. The effective population sizes of the viral populations were averaged to be 226 in the absence of drug from the previous study (Foll *et al.* 2014). All of the other inputs and the ABC model choice conditions were kept the same as in the drug case. The Bayes factor results summarized in Table 5 demonstrate that none of the SNP trajectories under selection in the control experiment can reject M_{0} (*i.e.*, constant selection).

## Discussion

These simulations demonstrate that the novel CP-WFABC approach presented here is able to detect changing selection trajectories via ABC model choice, and also to estimate a wide range of parameters of interest. Performance was analyzed separately for three categories of allele trajectories according to the nature of the change in selection strength: (1) a change from positive *s _{1}* to positive

*s*, (2) a change from positive

_{2}*s*to negative

_{1}*s*, and (3) all other changes. The datasets for each possible combination were generated using the Wright-Fisher model with a change in selection strength, using the most general prior ranges for all parameters

_{2}*s*,

_{1}*s*,

_{2}*CP*, and

*h*for diploids, with the only constraint being segregation of the allele at the change point. For both the detection and parameter estimation, CP-WFABC performs best when the change is large, particularly for the second category of change (positive

*s*to negative

_{1}*s*), as shown in the ROC curves (Figure 2 and Figure S3) and the cross-validation graphs (Figure 3, Figure 4, Figure S4, Figure S5, Figure S6, and Figure S7). For the first category (positive

_{2}*s*to positive

_{1}*s*) and the third category (any other changes), the change point is difficult to estimate, particularly for diploids where the additional parameter

_{2}*h*is also estimated. The ABC model choice of CP-WFABC has the best sensitivity for full specificity, for larger population sizes (

*N*> 500 for diploids), and for haploid populations.

_{e}The parameter estimates of *s _{1}* and

*s*perform best when the values are moderate – given as the intervals (0, 0.5) and (−0.5, 0.5) respectively. For

_{2}*s*, the optimal parameter range for estimation is (0, 0.5), where a

_{1}*de novo*mutation that survives negative selection and segregates until the change point is indistinguishable from other drifting mutations with similar trajectories with uninformative low allele frequency. These trajectories naturally arise more frequently when population size is small and in diploids where the dominance effect plays a role, as shown in the third category of change (Figure 3, Figure 4, Figure S4, Figure S5, Figure S6, and Figure S7; magenta points). When an initial frequency of 10% is used instead of a

*de novo*mutation, the advantage of having a more informative trajectory at the beginning is counteracted by the effect of more cases dominated by negative selection or genetic drift segregating until the change point. Thus, the performance of CP-WFABC for standing variation is similar to that of

*de novo*mutation. For

*s*, the optimal parameter range for estimation is (−0.5, 0.5), as trajectories with extreme values are less informative since they are lost or fixed directly after the change point (explaining the clustering of the change points at earlier times). For diploid populations, estimates of

_{2}*h*are accurate to the level of determining dominance from codominance or recessiveness, particularly for large population sizes (Figure S7), given the difficulty of joint estimation with the three other parameters. Indeed, estimation of this additional parameter comes at the cost of worse performance for the other parameters, as can be seen in the ROC curves and cross-validation graphs: the detection and parameter estimation of changing selection cases is always better for haploids. Thus, in diploid cases, we recommend fixing the dominance parameter if known, in order to improve the performance of CP-WFABC.

Although CP-WFABC is intended to detect and evaluate changing selection intensities, the simulation studies demonstrate that the method also performs well in estimating parameters for cases of constant selection (as has been demonstrated by Foll *et al.* 2014). For haploids with large population sizes, in particular, the estimated values of the single parameter *s* correlate almost perfectly with the true values (Figure S5). However, when the population size is small for both diploids and haploids, some trajectories that are lost by genetic drift are difficult to estimate, as shown in Figure 3 and Figure S6 as horizontal lines along negative estimated values. This limitation of constant selection coefficients, however, is due to the simulation conditions of the *de novo* mutation at the first generation and the minimal ascertainment scheme (minimum frequency of 2% at one of the sampling time points). For real datasets, the conditions are likely to be less stringent, since such uninformative trajectories will not be considered for parameter estimation.

Finally, we utilized this approach to make inference in two very different time-sampled datasets: *P. dominula* (diploid) and Influenza A virus (haploid). The time-serial *medionigra* trajectory of *P. dominula* was reanalyzed to test for a change in selection strength and/or direction during 60 yr of data collection. By assuming *h* as codominant, the results for *N _{e}* = 500 indicate that the model M

_{0}of constant selection cannot be rejected according to the Bayes factor from the ABC model choice algorithm. The selection coefficient from this model is estimated as −0.15, corresponding with that calculated by Fisher and Ford (1947). However, when the population size is assumed to be smaller (

*N*= 50), the Bayes factor result supports M

_{e}_{1}(changing selection) twice as strongly as M

_{0}(constant selection), but not strong enough to reject M

_{0}according to the significance level test computed with the distribution of Bayes factors under the null model M

_{0}. This dataset of the

*medionigra*morph thus demonstrates the difficulty of detecting and evaluating a change in selection when the assumption of a known and constant effective population size is not fulfilled. Depending on the assumed population sizes, the conclusions may indeed change, as this study demonstrates. Furthermore, if the population size fluctuates, this method has the caveat of detecting a change in

*N**

_{e}*s*rather than only in selection strength.

Next, CP-WFABC was applied to SNP trajectories of interest from an experimental dataset of influenza A virus. The ABC model choice test was conducted on the trajectories identified as 1) being positively selected and 2) as outliers from the single-*s* WFABC method. For the SNPs in the presence of drug, the Bayes factor for six out of nine trajectories favored the changing selection model M_{1}. As shown in Figure 5, the change points for four out of these six trajectories occurred between passages seven and eight – the interval during which three trajectories from the Hemagglutinin and Neuraminidase segments (HA and NA, respectively) increased rapidly while one trajectory from the segment NP decreased rapidly along with the known drug-resistant mutation NA 823 (H275Y). These results appear to support the presence of epistasis and/or clonal interference, where the selection strength of the other SNPs is influenced by the appearance of a drug-resistant mutation under drug pressure. In fact, a known compensatory mutation HA 1395 (D112N) for infectivity (Thoennes *et al.* 2008) was among the three trajectories increasing rapidly, further confirming that increased infectivity might contribute to tissue culture adaptation, thus reinforcing the use of the method to evaluate biological hypotheses. Moreover, the estimated values of selection coefficients differed greatly between the constant-selection method (WFABC) and CP-WFABC. In particular, the estimates of *s* for PA 2194 and NP 159 from the Wright-Fisher model were inferred by WFABC as being near zero (*i.e.*, neutral), though the more robust CP-WFABC estimates here indicate fluctuation of *s* from negative to positive values. Thus, these fluctuations cannot be explained by genetic drift alone, as previously speculated. We have therefore identified some cases where an alternative model of changing selection is essential for correctly estimating selection parameters and identifying change points. In the absence of drug treatment, the Bayes factor test could not reject the constant selection model M_{0} for any of the identified SNPs suggesting that in, the absence of the drug, the selective pressures on the population are largely constant as expected.

The simulation studies of changing selection reveal some important points to consider from the standpoint of experimental design. For CP-WFABC, the parameter estimation of a single selection coefficient between two sampling time points performs reasonably well for haploid population sizes above *N _{e}* = 1000. However, it is advisable to have three sampling time points to enhance the parameter estimation by maximizing the information contained in the allele trajectory for both selection periods, particularly for diploids and in smaller population sizes (

*N*< 1000). Thus, in order for a change to be detectable, it is required to have at least four sampling time points where the change must occur between the second and third sampling time points. The simulation studies of CP-WFABC confirm that the estimation of

_{e}*CP*performs best at the intermediate range of time-sampled data, as any change happening before the second and after the second-to-last sampling time point is impossible to detect (Figure 3, Figure 4, Figure S4, Figure S5, Figure S6, and Figure S7). Finally, it remains a future challenge to expand this method to consider more than one change point in selection strength, as some of the trajectories in the influenza A application (such as PA 2194 and PB1 33) suggest the presence of several change points along the trajectory, and to resolve the problem arising from the assumption of constant population size.

## Acknowledgments

We thank Kristen Irwin for helpful comments. The computations were performed at Vital-IT (http://www.vital-it.ch) in the Swiss Institute of Bioinformatics. This work was funded by grants from the Swiss National Science Foundation and a European Research Council starting grant to J.D.J.

## Footnotes

Supplemental Material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.115.023200/-/DC1

*Communicating editor: Y. Kim*

- Received September 29, 2015.
- Accepted January 28, 2016.

- Copyright © 2016 Shim
*et al.*

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.