## Abstract

Researchers in evolutionary genetics recently have recognized an exciting opportunity in decomposing beneficial mutations into their proximal, mechanistic determinants. The application of methods and concepts from molecular biology and life history theory to studies of lytic bacteriophages (phages) has allowed them to understand how natural selection sees mutations influencing life history. This work motivated the research presented here, in which we explored whether, under consistent experimental conditions, small differences in the genome of bacteriophage φX174 could lead to altered life history phenotypes among a panel of eight genetically distinct clones. We assessed the clones’ phenotypes by applying a novel statistical framework to the results of a serially sampled parallel infection assay, in which we simultaneously inoculated each of a large number of replicate host volumes with ∼1 phage particle. We sequentially plated the volumes over the course of infection and counted the plaques that formed after incubation. These counts served as a proxy for the number of phage particles in a single volume as a function of time. From repeated assays, we inferred significant, genetically determined heterogeneity in lysis time and burst size, including lysis time variance. These findings are interesting in light of the genetic and phenotypic constraints on the single-protein lysis mechanism of φX174. We speculate briefly on the mechanisms underlying our results, and we discuss the potential importance of lysis time variance in viral evolution.

A key opportunity in evolutionary genetics is the decomposition of beneficial mutations into their proximal, mechanistic determinants. To date, much of this work has applied methods from molecular biology to dissect the basis of protein evolution (Harms and Thornton 2013; Dean and Thornton 2007), but life history theory (Stearns 1992; Roff 2002) similarly decomposes population growth rate into constituent life history traits. It can thus predict which mutations influencing life history will be favored by natural selection (*i.e.*, increase population growth rate) (Stearns *et al.* 2000). The simple life history of the lytic bacteriophage (phage) makes it particularly well suited to this sort of work. Each generation begins with a free phage adsorbing to a host cell and injecting its genome. Transcription and translation facilitate the assembly of progeny phages within the host, and, finally, host cell lysis releases the next generation of free phages. Assuming the density of host cells is not limiting, Wang *et al.* 1996 provided an expression for phage growth rate in terms of mean burst size and mean time to lysis. Later, Bull *et al.* 2006 extended this treatment by incorporating adsorption and phage death rates. Other studies have further developed this formalism (Rabinovitch *et al.* 1999, 2002), with many applying it to explore the mechanistic basis of beneficial mutations in lytic phages (Pepin *et al.* 2006; Shao and Wang 2008; Bull *et al.* 2011). Several have demonstrated that a mutation’s fixation probability depends critically on which component of life history it affects (Wahl and DeHaan 2004; Hubbarde *et al.* 2007; Patwa and Wahl 2008, 2009).

At steady state (*i.e.*, when host cell infections are at their stable age-of-infection distribution), population growth rate is entirely determined by mean life history traits (Bull *et al.* 2011). But the moment a novel mutation appears, it establishes a lineage that, by definition, is far from this equilibrium. In populations deviating from a stable age distribution, variance in both lysis time and adsorption rate may play a significant role in determining growth rate (Storms and Sauvageau 2014), since it may affect how long the deviation from stable age distribution persists (Bull *et al.* 2011).

These findings motivate interest in the possibility of genetic control of higher moments in phage life history traits. Previously, Dennehy and Wang 2011 used a microscopy-based approach to demonstrate heritable variation in lysis time mean and variance among a panel of phage λ holin protein mutants.

To explore heritable variation in phage life history traits, including lysis time variance, we studied a panel of eight genetically distinct clones of the lytic bacteriophage φX174. φX174 is an ubiquitous (Afshinnekoo *et al.* 2015) and industrially important phage (Labrie *et al.* 2014). It was chosen for this research in large part due to its small, well-characterized genome (Hayashi *et al.* 1988), the first of its kind ever sequenced (Sanger *et al.* 1977). In addition, it infects a well-studied host (*Escherichia coli* C), has a short generation time, and is robust to a wide range of temperatures, making it amenable to evolution experiments and other lab manipulations (Wichman and Brown 2010).

The gold standard for measuring phage lysis time and burst size is the one-step phage growth assay (Hyman and Abedon 2009). However, this method affords only limited access to information about lysis time variance and the relationship between burst size and time for a given phage. Therefore, the panel of eight clones was assessed using an updated version of a venerable experimental method first proposed by Burnet 1929, and later extended by Delbrück 1945. We simultaneously inoculated each of a large number of replicate host cell volumes with, on average, less than one phage particle of identical haplotype. These volumes were then plated at 15-sec intervals on lawns of host cells, and the resulting plaques were counted after incubation.

From this assay, we obtained data that could help determine mean and variance in both lysis time and burst size (Figure 1). However, the information we wanted was obscured by a suite of hidden variables. In wells showing lysis events, we did not know when lysis *actually* occurred. We also did not know the number of phage particles originally in these wells and, if that number was greater than one, how many had lysed their hosts. To infer lysis time and burst size from the two observed metrics (time of sampling and plaque count), we somehow had to circumvent these hidden variables. We overcame this challenge by developing a novel inferential apparatus that complemented the experimental framework with a maximum likelihood-based approach.

In this article, we describe our finding that, under the tested conditions, genetic heterogeneity could give rise to significant heterogeneity in φX174 lysis time and burst size, including lysis time variance. These findings are interesting in light of the genetic and phenotypic constraints on the single-protein lysis mechanism of φX174, and they point to promising avenues for future theoretical and experimental work.

## Materials and Methods

### Media, host cells, and bacteriophages

Difco LB Lenox Broth (Detroit, MI), pH 7.50, and supplemented with 2 mM CaCl_{2} was used throughout. We studied the performance of φX174 on *E. coli* C [DSMZ 13127, kindly provided by Olivier Tenaillion, Institut National de la Santé et de la Recherche Médicale (INSERM), University Denis Diderot Paris 7], the phage’s natural host. A stock of *E. coli* C was stored in 40% glycerol at –80°. We used this primary stock to seed bacterial colonies on agar plates stored at 4°. Colony plates were created anew every 2–4 wk. To generate experimental bacterial cultures, 10 mL LB was inoculated with a single colony and shaken in a 50-mL flask at 200 rpm and 37° for about 16 hr. The resulting stationary-phase cells were then diluted somewhere between 33- and 80-fold into 40 mL LB in a 250-mL flask. The diluted culture was shaken at 37° for about 3 hr until it reached an optical density of 0.6. Exponential-phase cells prepared in this way were used for each assay.

To synchronize phage infections, we treated cells using a starvation protocol adapted from Denhardt and Sinsheimer 1965 and kindly provided by Darin Rokyta, Florida State University. The treatment starved cells so that they “shut down,” allowing for phage adsorption but not entry. (It is likely that the treatment also had the fortunate side effect of reducing variance in time to cell division compared to an exponential-phase culture, since it was bound to cause a lag in the cell cycle. See *Discussion* for the cell cycle’s relevance.) First, we transferred 1 mL cells to a 1.5-mL centrifuge tube and pelleted them by spinning in an Eppendorf centrifuge (5417R) for 2 min at 20,800 rcf. After aspirating the supernatant, we resuspended the cells in 1 mL HFB-1 solution [0.06 M NH_{4}Cl/0.09 M NaCl/0.1 M KCl/0.1 M Tris-HCl (pH 7.4) /1.0 mM MgSO_{4}/1.0 mM CaCl_{2}] by vortexing for 8 sec. Then, we centrifuged the solution, removed the supernatant, and resuspended in HFB-1 twice more. After removing the supernatant from the third and final HFB-1 rinse, we resuspended cells in 1 ml HFB-2 solution (HFB-1/10 mM MgCl_{2}/5 mM CaCl_{2}). This served as our culture of starved bacterial cells, ready for phage adsorption.

We studied a panel of eight genetically distinct clones, chosen based on reports of lysis profiles different from the wild type. These included three *Epos* mutants carrying point mutations in the *E* gene, whose product is the chief player in φX174 lysis (*pos4b*, *pos5*, and *pos6*, first described in Bernhardt *et al.* 2002; kindly provided by Ry Young and Rohit Kongari, Texas A&M University); four carrying point mutations in the *D*-promoter region, which exerts regulatory control over transcription of the *E* gene (*mut319*, *mut321*, *mut323*, and *mut324*, previously described in Brown *et al.* 2010 and Brown *et al.* 2013; kindly provided by Celeste Brown and Amber Stancik, University of Idaho); and a wild type (also provided by Ry Young and Rohit Kongari) (Table 1 and Figure 2). Clones are available from D.M.W. on request. Presence of the mutations of interest was verified by Sanger sequencing of the 65–730 base region of each clone’s genome (nucleotides numbered as in Sanger *et al.* 1977). The clones’ sequences in this region (GenBank accession KU646482–KU646588) were identical to the wild-type sequence (GenBank accession J02482.1) apart from the noted mutations. The wild type employed was the ancestor of the *Epos* mutants, which differed from the wild-type ancestor of the *D*-promoter mutants (GenBank accession AF176034) at five sites: G833A (nonsynonymous), G1650A (nonsynonymous), T2811C (synonymous), A4518G (nonsynonymous), and T4784C (synonymous) (Pearson *et al.* 1997). The G833A difference entailed an amino acid change in the *E* (but not the *D*) gene, though it fell well outside the region coding for the E protein’s essential transmembrane region (Figure 2) (Tanaka and Clemons Jr 2012).

Phage titers were determined by combining a known volume of phage stock with 200 μL exponentially growing host cells (described above), and plating in top agar held at 42°, consisting of 7.0 g/L Difco BactoAgar (Detroit, MI) added to LB medium (as described above). After several minutes on the bench at room temperature, we incubated plates agar-side-up at 37°. Preliminary results (not shown) demonstrated that accurate plaque counts required at least 4 hr of incubation. Occasionally, for logistical reasons, we interrupted incubation with overnight refrigeration at 4°.

### Serially sampled parallel infection assay

We began by adding 90 μL starved cells to each well of a TempPlate nonskirted 96-well PCR plate (USA Scientific item no. 1402-9598) at 15°. The PCR plate was mounted on a thermocycler to control its temperature precisely. Using a 12-channel pipettor, we then added 10 μl 30-PFUs/mL phage dilution to each well in each row of 12 wells, allowing for a 30-sec interval between rows. PCR plates were held at 15° for at least 20 min, during which time phages could adsorb to the starved cells. We estimate that 10–20% of phage particles adsorbed to hosts during this incubation period (Celeste Brown, personal communication; Newbold and Sinsheimer 1970).

Then, infections were initiated by raising the PCR plate temperature to 37° and adding 100 μL warm LB (incubated at 37° for at least 45 min) to each well in a row with a 12-channel pipettor, again allowing 30-sec intervals between rows. Consequently, the infective process in each row was staggered by half a minute, allowing ample time for the sampling procedure that followed. Note that the temperature increase alone did not trigger infection. The addition of rich medium, such as LB, was required for infection to advance (Newbold and Sinsheimer 1970). Based on past reports of φX174 attachment kinetics under conditions similar to our assay (Brown *et al.* 2013), unattached phage particles likely adsorbed to hosts very quickly at this time, with an average adsorption time of less than 30 sec [assuming adsorption followed an exponential decay process (Abedon *et al.* 2001)].

It is important to note that, based on our knowledge of the mutated genes’ functions (Fane *et al.* 2006) and past results (Brown *et al.* 2013), the studied mutations likely had little or no effect on phage adsorption rate. This includes the nonsynonymous difference (G1650A) between the *D*-promoter mutants and other clones in coat protein F. While a change in the F protein could theoretically have affected adsorption (Bernal *et al.* 2004), it is unlikely to have done so here, because the difference was merely between one basic polar residue (His) and another (Arg).

Five minutes after infection was initiated, we began sampling wells at 15-sec intervals. Sampling was performed by transferring the entire 200 μL contents of a well to a sterile 96-well PCR plate, held at 4° to halt the infective process. We assume that this sampling procedure did not artificially induce lysis in sampled infections. We staggered sampling between rows such that, factoring in the 30-sec gaps in infection between each row, we sampled one well every 15 sec between *t* = 5 min and *t* = 28.75 min. The endpoint of 28.75 min was chosen on the basis of preliminary experiments that showed most infections ending in lysis by that time. Even though not all phages would lyse their hosts by 28.75 min, we found no evidence that φX174 undergoes T4-like lysis inhibition (Abedon 1992), so we believed that samples taken between 5 min and 28.75 min would give us a reasonably complete picture of the clones’ lysis phenotypes.

At the conclusion of this process, the contents of each sampled well—now at 4°—were plated and incubated, as described, and then counted. Typical results are shown in Figure 1. Four to six replicate assays were performed for each clone.

### Statistical methods

#### The model:

We began with an overview of the model that we assumed generated the data. We then developed methods of estimation from the model and applied them to real experimental data.

Each time an assay was performed, we assumed a small, Poisson-distributed number of phage particles had been added to each well. The mean of the Poisson was allowed to vary between assays, which is to say, we assumed that phage concentrations in the stocks differed by clone and could change over time.

Within the wells, we assumed phage particles adsorbed instantaneously to starved cells upon the initiation of infection. (We discuss the rationale underlying this assumption in the *Results* section). We assumed that each phage particle infected and then lysed a cell *T* minutes after the addition of warm media, where *T* could not be less than a biologically imposed minimum (the latent period), and lysis was equally probable every moment thereafter (though we also considered a model in which lysis probability changed with time, as described below). We assumed that burst size grew linearly with lysis time, a relationship generally assumed and supported by past work (Hutchison and Sinsheimer 1966; Abedon 1989; Wang *et al.* 1996; Abedon *et al.* 2001; Chao *et al.* 2002; Bull *et al.* 2004; Wang 2006; but see Hutchison and Sinsheimer 1963 and Storms *et al.* 2014). The fate of each infecting phage particle within a well was assumed to be independent of other particles in that well. Wells were sampled and plated at 15-sec intervals from *t* = 5 min through *t* = 28.75 min, and we assumed 100% plating efficiency, implying that the plaque count on a plate revealed the true number of phage particles in a well at the time of sampling. A plaque count of zero meant no particles were in the well, a count of one to a few implied that as many particles as visible plaques were added but that no host cells had been lysed yet, and a count of more than a few represented the sum of the burst sizes of lysis events prior to the sample time, plus any infected yet unlysed cells. Note that, in the third case, we did not know how many phage particles were in the well to begin with, we did not know how many lysis events had occurred by the time of sampling, and, of the lysis event(s) that had occurred, we did not know when they occurred, except that they must have occurred prior to the moment of sampling. From a statistical perspective, this suite of hidden variables made the problem both interesting and challenging.

#### Notation:

Developing estimators under this model required substantial notation. Table 2 summarizes this notation for the reader’s reference.

#### Number of phage particles per well:

Our first task was to estimate , the Poisson parameter that governs the probability of having phage particles in each well for assay *a*. This was somewhat complicated by the fact that, before the latent period ended and lysis began, we observed the actual Poisson counts, but after this period, we only observed the true Poisson counts for “zero” wells and wells that do not show lysis events. In wells showing lysis events, we knew only that the Poisson count must have been greater than zero. To derive an estimator tailored to the situation, we ignored the time element of the data and defined *p* as the probability that a randomly selected phage particle from a randomly selected well would not have lysed its host. We let *n* be the Poisson number of phage particles initially in a well (*i.e.*, before lysis occurs). Then, we noted that each sampled well fell into one of three categories: samples with zero phage particles, samples with phage particles but no lysis events, and samples showing lysis events. The probability of a well belonging to each category was, respectively, , , and .

We defined wells not showing lysis events as those having plaque counts greater than zero, and less than or equal to seven. When the Poisson parameter *β* is less than or equal to one, counts greater than seven are highly improbable in wells where lysis has not occurred (). Of course, it is possible that wells showing fewer than eight plaques could in fact be the results of infections. But, ultimately, we had to set the threshold for distinguishing lysis events from nonlysis events *somewhere*. This threshold-setting exercise was not straightforward. Complicating factors included the following:

Poisson average numbers of particles per well varied for each assay, so setting the threshold based purely on the Poisson probabilities was not trivial.

There was a nonzero chance of slight inaccuracies in some plaque counts. Rare errors could have occurred in the counting of plaques that would have increased counts on average (

*e.g.*, plaque-like air bubbles). In addition, certain experimental manipulations (*e.g.*, mechanical stress from pipetting) could feasibly have induced very premature lysis of some cells, resulting in plates with extremely low plaque counts that otherwise would have had just one plaque.The few plaque counts of six and seven, a couple of which were sampled at unusually early time points, could have exerted disproportionate influence on estimates of clones’ lysis phenotypes if included as lysis events. This is because, as discussed below, an artificially early lysis event would force the value of the parameter to shift to some time before that event.

Therefore, given these intermediate plaque counts’ rarity, their potential origins in experimental error, and their potential to distort our estimates of clones’ lysis phenotypes, we chose to set the threshold for lysis events at greater than or equal to eight. It is worth noting, however, that when our data instead were analyzed using a threshold of greater than or equal to five, our results did not change qualitatively (analysis not shown).

We let the number of wells in assay *a* be and the number of wells that showed lysis events be . The likelihood of the data, then, was(1)We then took the log of the likelihood, the partial derivatives with respect to and *p*, set each equal to zero, and did a considerable amount of algebra (see Appendix 1 for details). This resulted in the following estimator:(2)We validated this estimator by simulating data under and 2, and we compared it to two reasonable alternatives: the “zero-class” estimator [, where is the proportion of sampled wells that had plaque counts of zero], and the “mean-count” estimator (, where *X* is the mean among observed plaque counts, applied only to wells sampled before we witnessed any lysis events). The results (which we will not cover in detail) revealed that Equation 2 had a very slightly positive bias ( when , falling to at ) but a lower mean squared error than the other estimators across conditions. Valuing the nearness of the estimate to the truth above all else, we adopted the new estimator of throughout.

#### Time to lysis:

We considered two nested models describing time to lysis. Both assumed that phages do not lyse cells until the latent period of ends. The exponential model assumed that lysis occurred with equal probability each moment after , while the Weibull model allowed the probability of lysis to change with time. Under the Weibull, the time to lysis, *T*, had the following distribution:(3)where α was a hazard parameter that dictated how lysis probability changed with time. When , the Weibull model simplified to the exponential model. Because they are nested, we employed a likelihood ratio test to determine if the improved fit under the Weibull justified the additional parameter. To do so, we analyzed each genetically distinct clone separately. We fitted the data to each model using maximum likelihood (as described below), estimated parameters, obtained the log-likelihood, and took the difference: . We then simulated 100 datasets under the null (exponential) parameter estimates. For each simulated dataset, we fitted the data to both models and calculated the difference in their log-likelihoods: . We then approximated the *P*-value by the proportion of greater than or equal to . Large *P*-values (*i.e.*, ) would indicate the simpler exponential model described the data sufficiently well. This test returned *P*-values greater than 0.3 for all clones, and we therefore assumed the exponential model for the duration of the study. One important implication of this assumption was that, under the exponential model, the lysis time variance was equivalent to .

Defining as the number of phages in well *r*, assay *a* that had lysed their hosts by time *t*, followed the Poisson distribution with rate . If we let be an indicator function that is one when is greater than or equal to one, and zero when is zero, and we let be the number of phages that had not lysed their hosts in well *r*, assay *a* at time *t*, we show in Appendix 2 that the log-likelihood of the data were proportional to(4)Note that directed us to sum over all assays of a particular clone. We analyzed each clone separately. We have omitted a clone subscript throughout to avoid further complicating notation. In practice, we obtained joint maximum likelihood estimates of λ and by establishing a grid of joint values at narrow intervals, calculating the log-likelihood of each using Equation 3, and selecting those values that together give the best likelihood. Confidence intervals on λ and were obtained by parametric bootstrap. To do this, we used , , and to simulate one suite of datasets to match (in number and well count) the real datasets for the clone under analysis. We analyzed the simulated suite of datasets using Equation 4 and a grid of potential values to obtain and . The one condition we imposed in this estimation step was that must be less than or equal to the minimum observed lysis time from the real data. If we did not do this, the bootstrap would have returned values and hence confidence bounds for greater than this minimum lysis time, and we knew such values would not be valid confidence bounds when the probability of the real data under them is zero. We repeated 500 bootstrap replicates. We then sorted the values, and defined the 95% confidence interval as the central 95% of the values. We independently sorted the values and used the central 95% of these. Thus, the confidence intervals on and were not joint. We did not validate that these intervals capture the truth greater than or equal to 95% of the time, but we did confirm that the means of and were close to and , respectively, suggesting that the parametric bootstrap was a valid technique.

We also were interested in inferring mean lysis times from our data. Given the unique properties of exponential distributions, we could derive each clone’s mean lysis time simply by summing its estimated and λ. A confidence interval for this estimate could be obtained by repeating this procedure using the upper and lower bounds on the individual parameter estimates.

We now turn our attention to how to determine whether lysis time parameters λ and are different for different clones. We conducted several formal tests for differences between clones using a likelihood ratio framework. To test whether both λ and clones differ, we note that, under the null hypothesis, a single shifted exponential function would describe the data from both clones being compared; the alternative hypothesis asserted the data for each clone came from a distinct function. We calculated the log-likelihood of the data under the null () by pooling both clones’ data, fitting as just described to obtain and , and then using Equation 4. We calculated the log-likelihood under the alternative () by doing the same thing, except we fitted each clone’s datasets separately, calculated their log-likelihoods separately, and then summed. We define . To determine whether is significantly large, we needed the distribution of under the null. We obtained this by simulation. We simulated data analogous in size and sample times to the pooled data using and , repeated the fitting process, and calculated log-likelihoods and their difference to yield . We repeated this process 500 times, and took the approximate *P*-value of as the proportion of bootstrap simulations where was greater than .

To test whether just λ differed between pairs of clones, we used a similar likelihood ratio test. Under the null, both clones shared a common value of λ but could have different values of ; under the alternative, both λ and could differ between clones. Note that the null did not fix λ at a single specific value; rather, it merely stipulated that λ, whatever value that may be, be shared by the clones being compared. We fitted the data to the null by finding the maximum likelihood estimates for each clone under the constraint of a shared value of λ and unconstrained values of and then summing their log likelihoods (). We fitted the alternative by imposing no constraints, and summing the two log-likelihoods (). We defined . To obtain the distribution of under the null, we simulated 500 datasets using the null parameter estimates, repeated the model-fitting exercise for each, and took the difference in the log-likelihood: . The proportion of the bootstrap replicates where was greater than was the approximate *P*-value for this comparison. To test for significant differences in , we repeated this procedure, except was constrained and λ was allowed to vary.

To test whether the distribution of *P*-values calculated for each set of pairwise parameter comparisons diverged from the null expectation (*i.e.*, a uniform distribution), we binned the *P*-values into four categories [0.00,0.25), [0.25,0.50), [0.50,0.75), and [0.75,1.00] and conducted an exact multinomial test against expected frequencies in each bin under the null. The null bin frequencies were equal across the four categories (*i.e.*, 25% in each bin).

#### Burst size as a function of time:

We modeled the potential for burst size to change with time as a linear function, , where was the expected burst size when *T* minutes had elapsed since , μ was the burst size at , α was the rate of increase in burst size (*i.e.*, the slope parameter), and ε captured the normally distributed variation from expectation (mean 0, variance ). Note that none of the terms in this model was observable. If a well sampled at time *t* showed a lysis event, we knew that lysis occurred before *t*, but we did not know exactly when. In addition, we did not know how many phage particles initially were in the well. Thus, both lysis time and burst size were hidden variables. In Appendix 3, we show that the observed count (rather than the burst size) could, like burst size, be expressed as a linear function:(5)where was the count observed in well *r* sampled at time *t* from assay *a*, was the number of phage particles in well *r* (assay *a*) that had lysed their hosts by time *t*, was the sum of their lysis times, and δ was normally distributed with mean zero and variance . Our goal was to estimate the parameters μ, α, and σ, but because Equation 5 depended on the unobservable and , we could not use standard regression. Instead, we employed an expectation maximization (EM) algorithm, through which the problem was circumvented by replacing and with their expected values before doing regression, and then iterating (Meng and Rubin 1993). More specifically, our EM algorithm involved the following steps: (1) guess values for and ; (2) estimate the parameters μ, α, and σ by least-squares regression (equivalent to maximum likelihood) assuming the guesses of and are correct; (3) assuming the parameter estimates are correct, calculate the expected values of and , and update the guesses to these values; and (4) repeat steps 2 and 3 until convergence occurs. It turns out that both parameter estimation (step 2) and calculating expectations (step 3) involved a fair amount of math; we present this in Appendix 3.

To obtain confidence intervals and test for differences between clones in the regression parameters, we used a parametric bootstrap approach. For each genetically distinct clone, we simulated 1000 datasets using the estimated parameters and, each time, estimated α, μ and σ. We then sorted the bootstrap set of parameter values individually, using the 25th and the 975th values (*i.e.*, the central 95%) to define the confidence intervals. To keep our analysis simpler, we did not conduct formal likelihood ratio tests for significant differences between clones for these parameters. Instead, we compared confidence intervals and applied the conservative criteria that when confidence intervals were nonoverlapping, the two clones were significantly different.

Data and code required to repeat this analysis can be found in the Supplemental Material File S1.

### Data availability

Strains are available from D.M.W. upon request. File S1 contains a description of all supplemental material, including raw experimental data and the R code that was used to analyze them. Sequences of the WT (*Epos* mutant ancestor) and *D*-promoter ancestor are given by GenBank accession numbers J02482.1 and AF176034, respectively. Regions of the mutants' genomes were sequenced, and these sequences also were archived in GenBank (accession numbers KU646482 to KU646588).

## Results

Thirty-seven assays were conducted for the eight clones (four to six each). (Raw data from the assays are available in Supplemental Material, File S1.) The estimated Poisson parameter ranged from 0.20 to 0.91, with a mean of 0.46. It follows that approximately 63% of all wells had no phage particles, 29% had one particle, 6.5% had two particles, 1% had three, and 0.1% had four. Observed plaque counts ranged from zero to 1270, with eight being the smallest plaque count classified as representing a lysis event (see section *Statistical methods*).

Across the clones, the earliest times at which lysis was observed ranged from 11.0 min (*pos6*) to 14.5 min (*mut321*, *mut323*). Estimates of are shown in Table 3. This range of latent periods was consistent with past results, as reported in Bernhardt *et al.* 2002 (one-step growth assays in figure 2A in Bernhardt *et al.* 2002, about 12 min for wild type, *pos6*, and *pos4B* infections, though about 22min for *pos5*) and Brown *et al.* 2010 (about 12–14 min for the *D*-promoter mutants and wild type).

The smallest observed value of λ among the clones was 4.8 min (*mut321*), meaning that the cumulative probability of lysis after rose most rapidly for this clone under the experimental conditions. The wild type was found to have the largest λ (11.2 min). Values of λ for other clones, along with confidence intervals, can be found in Table 3.

Figure 3 illustrates how the observed increase over time in the proportion of wells showing lysis events was used to estimate the cumulative lysis probability function for each clone. The figure also includes pairwise comparisons of each clone with the wild type, demonstrating clear differences between them. Note that Figure 3 does not visualize the β parameter, which, in addition to the lysis events shown, was important in the estimation of lysis time phenotypes. Taken together, the curves form three rough clusters: (1) *pos6*, (2) wild type and *pos4B*, and (3) the remaining clones. This clustering can be observed clearly when the parameters composing those functions, λ and , are plotted against each other (Figure 4).

These visual comparisons were formalized into pairwise statistical tests. Along with parameter estimates, Table 3 shows the pairwise *P*-values associated with the null hypothesis that the clones being compared came from the same function (*i.e.*, that jointly, both λ and are the same for the clones). If either parameter differed, this test would reject the null. By contrast, Table 4 shows the results of two different sets of tests: the set of pairwise tests of the null in which λ was held to be the same between clones, but was allowed to differ, and a similar test in which was held to be the same, but λ was allowed to differ. In other words, the first of these two sets of tests gauged differences between clones’ values of λ, whereas the second set of tests gauged differences between clones’ values of .

Notice that we have not corrected for multiple tests. Our *P*-values are based on 500–1000 bootstrap replicates, whereas the amount of replication needed to achieve significance under a Bonferroni correction would be in the 5000–10,000 range and computationally infeasible. We note, however, that if the joint null hypothesis were universally true (*i.e.*, all clones followed the same function), we would expect the *P*-values of the 28 pairwise tests to follow a uniform distribution, and we would expect just one or two false positives (28 0.05 = 1.4). Indeed, we would be unlikely to have more than three false positives, since the binomial probability of greater than three false rejections is 0.0491. Instead, the distribution of *P*-values for the joint λ- comparisons skewed strongly from a uniform distribution [exact multinomial test, *P* < 0.0001 (McDonald 2014)], and we observed 13 pairwise results less than or equal to 0.05 (Table 3). [The binomial probability of 13 or more false rejections is .] We suggest, therefore, that most, though perhaps not all, of the differences that appeared as significant were truly significant. Similarly, the tests for pairwise differences in only λ and only yielded 10 and 13 values less than or equal to 0.05, respectively, which is far more than expected under the null of a uniform distribution (exact multinomial test, *P* = 0.0068 and *P* < 0.0001, respectively). (The probability of 10 or more false rejections is , and the probability of 13 or more is ; Table 4.)

We also note that our omission of adsorption time from our statistical model (see *Materials and Methods*) had little effect on these findings. By ignoring adsorption time, we treated the time to lysis as a single exponential process, when in reality it involves the sum of two exponential waiting times (Abedon *et al.* 2001). However, additional analysis (not shown) demonstrated that, because adsorption happens at least an order of magnitude faster than lysis (Brown *et al.* 2013), the distribution of lysis time under the one-exponential model is very similar to that generated by a more complex two-exponential model. Furthermore, the variance of our one-exponential model is slightly larger than that of the two-exponential model, which implies that tests underlying the pairwise phenotypic comparisons were conservative. To see why, recall that the variance of an exponential with mean λ is . The variance of the sum of two independent exponentials with means and is the sum of their individual variances: . But the variance of a single exponential with mean is . Therefore, although we might have gained marginally more power in our phenotypic estimates by adopting a more complex two-exponential statistical model (supplemented with data on adsorption) or following the preattachment protocols described in Brown *et al.* 2013 and Cherwa Jr *et al.* 2009, the methods we employed were adequate.

The regression analysis of burst size by time revealed that burst size increased with time for all clones (Figure 5). The parameter estimates for each clone are provided in Table 5 along with confidence intervals. Figure 5 illustrates these regression parameter estimates graphically, showing that clones fell into four clusters of decreasing slope: (1) *pos6*, (2) *pos4B* and *pos5*, (3) wild type and *mut319*, and (4) *mut321*, *mut323*, and *mut324*. Formal testing of parameter differences was not conducted. However, where confidence intervals were nonoverlapping, there was good evidence for a significant difference. For slope, the confidence intervals within the four clusters were generally overlapping while intervals between clusters were not.

Table 6 summarizes the calculated mean and variance in lysis time for each clone.

## Discussion

In recent years, it has been found that the probability that a beneficial mutation will fix in a lytic phage population depends sensitively on the life history parameter it affects (Wahl and DeHaan 2004; Hubbarde *et al.* 2007; Patwa and Wahl 2008, 2009). Furthermore, analytical work has demonstrated that variance in life history traits, not just the trait means, can influence growth rate predictions, particularly in iteroparous populations far from a stable age distribution (Bull *et al.* 2011). Inspired by this body of work, we wished to learn more about whether bacteriophage life history traits, including higher moments (*e.g.*, lysis time variance), could be subject to the phage’s genetic control.

To this end, we used a serially sampled parallel infection assay to interrogate the lysis time and burst size phenotypes of a panel of eight genetically distinct φX174 clones. Among these clones were four with mutations in the *D* promoter that have been reported to downregulate the transcription of several genes, including the lysis gene (*E*); three *Epos* clones with mutations in *E* that have been found to upregulate its expression; and the wild-type ancestor of the *Epos* mutants. (As noted in the *Materials and Methods*, these mutations were introduced on slightly different genetic backgrounds.)

Using a novel statistical approach that allowed us to overcome the challenge of hidden variables, we found significant variation in both lysis time and burst size among these clones, including evidence that φX174 exerts genetic control over lysis time variance under our experimental conditions.

The inferred differences among the clones’ burst size phenotypes are generally consistent with our limited understanding of the mutations’ biological effects (Table 5 and Figure 5). In accordance with their depressed rates of transcription at the *D* promoter (Brown *et al.* 2010), burst size increased more slowly than the wild type in three of the four *D*-promoter mutants. (Burst size increased with lysis time at about the same rate as the wild type in the fourth *D*-promoter mutant, *mut319*.) By contrast, burst size increased more quickly than the wild type in the three *Epos* mutants. This may be related to the findings of Bernhardt *et al.* 2002 that the R3H and L19F mutations lead to markedly faster E protein synthesis than the wild type, perhaps by increasing mRNA translatability. Given the compact genome of φX174 and the location of *E* within the *D* gene (Figure 2), it is possible that these mutations somehow upregulate gene expression more globally, thereby precipitating more rapid accumulation of progeny virions and larger burst sizes.

As with burst size, we inferred significant variation in lysis time phenotypes among the eight clones under the tested experimental conditions. The clones seem to differ in their overall cumulative lysis probability functions (Table 3) and in both and λ individually (Table 4).

Within these differences, we observe an inverse relationship between and λ (Figure 4): under the tested conditions, the mutations appear to delay the onset of lysis while causing the cumulative lysis probability to rise more quickly compared to the wild type (Figure 3). (The most noteworthy exception is *pos6*, the only clone with a value of less than that of the wild type.) A consequence of this trend is that, despite substantial differences in and λ, the clones’ mean lysis times (computed as the sum of and λ) are rather similar (Table 6). A mechanistic investigation of the apparent maintenance of mean lysis time is beyond the scope of this study, but we speculate that it may reflect constraints on lysis timing imposed by the host’s cell cycle (for φX174: Witte *et al.* 1998; Young *et al.* 2000; and other phages as well: Hadas *et al.* 1997; Storms *et al.* 2014). While mutations in *E* and related genes could potentially bias lysis timing earlier or later, or make the process more or less noisy, the host’s physiology and cell cycle may prevent extreme divergence from a common mean.

In fact, given the genetic and phenotypic constraints on φX174 life history, it is notable that burst size and lysis time phenotypes appear to diverge as much as they do. A primary genetic constraint is the phage’s compact genomic architecture. The *E* gene is entirely embedded within the structurally essential *D* gene (Barrell *et al.* 1976; Figure 2), and the *D* promoter simultaneously regulates the transcription of *E*, *D*, and four other structural genes downstream (*i.e.*, *J*, *F*, *G*, and *H*) (Fane *et al.* 2006). The work presented here stands with and expands upon previous studies (*e.g.*, Brown *et al.* 2010, 2013; Bernhardt *et al.* 2002) in demonstrating that, despite the relatively limited genotype space accessible to the organism, the genomic architecture of φX174 permits a significant degree of flexibility in lysis time and burst size phenotypes. Still, the genetic constraints on these phenotypes are clearly visible in the form of pleiotropic effects on burst size and lysis time in our data.

The inferred disparity among our clones’ lysis time parameters is also interesting given the aforementioned phenotypic constraints imposed by the host’s cell cycle and physiology. Whereas λ and many other larger DNA phages effect lysis via a tightly regulated, two-protein lysis strategy, φX174 and other *Microviridae* lyse their hosts by way of a single protein (Young *et al.* 2000; Young and Wang 2006). The lysis timing of φX174 is thought to be dictated largely by the timing of host cell division and, therefore, compared with phages like λ, even less subject to the phage’s genetic control (Witte *et al.* 1998; Young *et al.* 2000; Zheng *et al.* 2008; Dennehy and Wang 2011; Singh and Dennehy 2014; Storms *et al.* 2014). But our demonstration of genetic control of both lysis time mean and variance hints at the possibility of as-yet-unknown mechanisms exerting some control over phage lysis timing. Indeed, researchers have long speculated that protein E may mediate lysis timing not just through its interactions with MraY, but also through interactions with the host protein SlyD (Young and Wang 2006). SlyD plays an important role in cell division and cell cycle regulation (Roof *et al.* 1997) and is required for stability of wild-type E protein (Bernhardt *et al.* 2002).

### Strengths and limitations of our experimental and statistical approach

The strength of the experimental and statistical approach taken in this study lies in its power and applicability. It is powerful in that, more effectively than most other methods, it simultaneously captures information related to lysis time variance and burst size as a function of time. The approach also has potential application to a range of microbiological questions. The principles and tools underlying the assay and inferential apparatus could be applied usefully to other cases in which one seeks to make inferences about dynamic, Poisson-distributed processes that are sampled destructively over a time course (*e.g.*, tracking the fates of individual tagged mutants in a dynamic population of cells or viruses).

However, we concede our approach has limitations. There is increasing evidence that host cells’ physiology plays a vital role in determining phage infection dynamics (Hadas *et al.* 1997; Rabinovitch *et al.* 2002; Storms *et al.* 2014; Bull *et al.* 2014). Yet, in synchronizing and rapidly sampling infections, our assay exposed the cells to temperature changes and nutritional regimes on a short timescale, which may have altered cells’ physiology in ways important for the measured phenotypes (van Elsas *et al.* 2011). While these sorts of manipulations are not uncommon in the literature (*e.g.*, Wang 2006; Cherwa Jr *et al.* 2009), they limit the generality of some conclusions drawn from this work. The unique conditions of the assay complicate comparisons to previous estimates of the clones’ burst size and lysis time phenotypes, and they make it impractical to validate our methods against traditional single-step growth experiments with much confidence.

By the same token, the lysis times and burst sizes inferred in this work may differ from those φX174 exhibits in nature. Environmental differences in factors like host density and diversity, as well as temperature changes and nutritional regimes, could possibly lead to differences between phenotypes measured in the assay and in the wild. That said, it is unclear how well any particular assay approximates the phage’s natural environment, which remains largely uncharacterized (Michel *et al.* 2010). Further study of the ecological niche of φX174 would be necessary to clarify which environmental differences between our assay and the wild are most relevant to the phage’s life history phenotypes.

Finally, our assumption of a linear increase in burst size with time, though substantiated by significant past work with φX174 and other phages (Hutchison and Sinsheimer 1966; Wang *et al.* 1996; Wang 2006), may be subject to debate (Storms *et al.* 2014). We hope further research will help better evaluate the appropriateness of this assumption.

When it comes to achieving the core aims of this work, we believe the strengths of our experimental and statistical approach outweigh its limitations. It allowed us to demonstrate that, under at least one set of conditions held constant for eight similar but distinct φX174 clones, burst size and lysis time phenotypes (including lysis time variance) differed significantly. Furthermore, it is reasonable to assume that the degree of genetic variability among the clones is comparable to that which is the substrate for natural selection acting in the wild.

### Evolutionary implications of heritable variation in lysis time variance

Our finding that variance in the lysis time of φX174 can be subject to at least some genetic control inspired us to ask, “What, then, could be the evolutionary implications of mutations affecting variance in phage lysis time?” Trivially, if burst size grows more slowly than exponentially once a cell has been infected, then a mutation increasing lysis time variance will be selectively favored, because early lysis events contribute more to population growth than late lysis events detract (Bull *et al.* 2011; but see Singh and Dennehy 2014). Empirically, phage burst size is thought to increase linearly (Hutchison and Sinsheimer 1966; Wang *et al.* 1996; Wang 2006; but see Storms *et al.* 2014), immediately suggesting one mode of selection. Mutations increasing lysis time variance in a given environment likely would fix quickly under this mode of selection.

In order to control for the influence of this mode of selection, we recently have explored a model in which burst size increases exponentially in time since infection (D. M. Weinreich, C. Brown, and L. M. Wahl, unpublished data). This has allowed us to infer the intrinsic selective consequences of mutations that influence lysis time variance. As expected, in phage populations growing at their stable age-of-infection distribution in unlimited host cells (Bull *et al.* 2011; Bull 2006), we found such mutations to be selectively neutral, since, by construction, the contribution of early bursts to population growth is now exactly balanced by (exponentially larger) late bursts.

However, the moment a mutation appears, the mutant subpopulation is by definition not at stable age distribution, and indeed we have found in simulations that mutations increasing variance in lysis time are transiently enriched in the population (D. M. Weinreich and C. Brown, unpublished data). This effect follows from the fact that the frequency of phages with high lysis time variance exceeds (sometimes considerably) the frequency of their low-variance brethren early in each lysis cycle. While the reverse is true late in the cycle, the effect is of smaller magnitude and much shorter lived. Finally, overlaying periodic demographic bottlenecks (*e.g.*, as employed in lab conditions, but also surely in nature), we observed that mutations increasing lysis time variance can enjoy a dramatic increase in population frequency, despite their neutrality when they finally reach their stable age-of-infection distribution (D. M. Weinreich and C. Brown, unpublished). We therefore suggest a second mode of selection acting on mutations affecting variance in lysis time.

### Future directions

The findings reported here suggest two complementary avenues for future investigation. One is microbiological, and it involves exploring φX174 lysis in relation to its genetic determinants and its host’s biology. It would be interesting to explore the mutants’ performance in a range of assays, such as those involving thermally regulated microscopy (Dennehy and Wang 2011), single-cell microfluidics-based techniques (Yin and Marshall 2012), or periodic filtration and sampling (Hutchison and Sinsheimer 1963). (The last and oldest of these techniques could prove especially fruitful, since it would allow direct access to both lysis time and burst size.) In addition, it would be highly informative to use methods like those designed by Storms *et al.* 2014 to study how φX174 mutants’ burst sizes and lysis times vary as a direct function of the host’s cell cycle. These types of experiments could yield valuable insight into the regulation and host-dependence of φX174 life history phenotypes.

Theoretical evolutionary problems present another opportunity for future research. Does natural selection act on lysis time variance in nature? What is the fixation probability of a mutation that affects lysis time variance, and how do populations carrying such mutations evolve in the face of periodic bottlenecks? Existing computational models could be strengthened by more thoroughly exploring the models’ parameter sensitivity and investigating how mutations that pleiotropically influence adsorption, lysis time, and burst size affect predicted population dynamics. In addition, the models must be integrated with other theoretical work exploring the fixation probabilities of mechanistically distinct beneficial mutations in lytic phages (Wahl and DeHaan 2004; Hubbarde *et al.* 2007; Patwa and Wahl 2008, 2009). Once elaborated and refined, models could be tested empirically through both “single-sampling” assays and longer-term evolution experiments. The former approach would assess the probability of loss of lysis time variants introduced at very low frequency in replicate phage populations, while the latter would explore whether real evolutionary outcomes match our theoretical predictions.

## Acknowledgments

We thank Olivier Tenaillion, Amber Stancik, Celeste Brown, Rohit Kongari, and Ry Young for providing the bacterial and phage strains used in this study. Additionally, we thank Darin Rokyta for providing us with the *E. coli* starvation protocol and Sohini Ramachandran for offering feedback on an early draft of this manuscript. We also are grateful for support from the Brown Division of Biology and Medicine to D.M.W., from the Brown Dean of the College to J.Y., and from the National Science Foundation (NSF) (G-K12 Award 0638688 to T. Herbert) to support M.H.B. C.R.M. and P.J. were supported by grant number R01-GM076040 from the National Institutes of Health. This research is based in part upon work conducted using the Rhode Island Genomics and Sequencing Center, which is supported in part by the NSF under EPSCoR grants nos. 0554548 and EPS-1004057.

## Appendix 1: Number of Phages per Well Poisson Estimator

The counts from an assay place each well into one of three categories: empty, not showing lysis events but nonzero, and showing lysis events. The probability of a well being empty is . The probability that a well has *n* phages is . The probability that all *n* phages fail to lyse their host is , leading the probability of *n* in a well showing no lysis events to be . Finally, the probability of a well showing one or more lysis events is one minus the probability of a well being empty, plus the sum of all the probabilities associated with wells that do not show lysis events, which turns out to be . The categorization of wells into three types leads to the following likelihood.(6)Therefore,(7)The maximum likelihood estimator is obtained by first calculating two partial derivatives,(8)(9)and then setting both and . Denote the solution by and , respectively. The solution associated with Equation 8 leads to(10)and the solution associated with Equation 9 gives(11)Now multiply both sides of the above equation by (12)Note that second term on the right side of Equation 10 is equivalent to the left side of Equation 12. This leads to the following simple relationship between and :(13)Thus and . We can eliminate from Equation 10 by substituting for to get(14)Solving Equation 14 for gives the estimator,

## Appendix 2: Time to Lysis

Recall that we model the probability of a phage lysing its host by time *t* with a shifted exponential, . Define as the number of phages in well *r*, assay *a*, that have lysed their hosts by time *t* (when sampling occurs). is Poisson with mean . Let be an indicator with value 1 if and 0 otherwise. Let be the number of phages that have not lysed their hosts in well *r*, assay *a*, at time *t* (). Based on standard theory is independent of and is also Poisson distributed but with mean . However, is only directly observable when no phages in the well have lysed their hosts (*i.e.*, when ). We now show how to use the binary together with Poisson distributed to estimate λ and . First, the likelihood of the lysis data are given by,(16)The likelihood for the number of phages in wells not showing lysis events is given by(17)However, to account for the fact that is only observable when no lysis occurs (*i.e.*, ), we must adjust the likelihood for wells showing plaques but not showing lysis events as follows.(18)The combined likelihood is,(19)Dropping factorial terms that do not involve parameters and taking the log, the log-likelihood is proportional to(20)As discussed in the main text, we estimate λ and by brute force, where we calculate the likelihood using Equations 3 and 20 for a large array of joint values, taking the largest as the (approximate) maximum likelihood estimates.

## Appendix 3: Burst Size as a Function of Size

The count observable in well *r* from assay *a* taken at time *t*, , can be expressed as,(21)where is the lysis time of phage *i* in well *r* of assay *a*, and *I* represents an indicator function that takes value 1 when the condition that follows is true, and 0 otherwise. This simplifies to, where is the number of phages initially in well *r* of assay *a* sampled at time *t*, is the sum of their lysis times, and δ is normally distributed with mean zero and variance . Our goal is to then estimate μ, α, and σ. Because and are unobservable but are needed to estimate the parameters, we employ the following EM algorithm which replaces them with their expected value. Note that these steps summarize the process; we provide derivations and details in subsections below.

*EM algorithm*

Begin with initial guesses for and for each well. Let .

Given these values, estimate α and μ using least-squares.(22)

Use the least-squares estimates and to estimate (23)

Impute the values and using their posterior expected values. Details below.(24)(25)

Return to step 2, replacing

*j*with , until convergence occurs.

*Least squares estimates for α and μ (step 2 in EM algorithm)*

Because the error is normal, values of μ and α that minimize the sum of squared differences between observed and expected counts provide the maximum likelihood estimates. The function that expresses this sum of squares is,(26)Then(27)Set and solve. The solutions are the least squares estimates denoted by and satisfy the following set of linear equations.(28)For notational convenience let(29)The estimates are then(30)and

(31)*Conditional expectations (step 4 in EM algorithm)*

Now we will calculate the expected number phages conditional on the size of the burst as well as the expected sum of lysis times given the burst size . Throughout this section, assay *a* and well *r* will be fixed; for notational convenience we will drop those subscripts and just denote by respectively. Because we will be simulating the joint distribution of conditional on we will sometimes use a second subscript. In this case will represent the *j*th simulated value. To do this, we will first need the joint distribution of and , which we obtain through the following simulation algorithm.

Simulate from a Poisson distribution with mean . If repeat, otherwise proceed.

Simulate from a shifted exponential with mean , and shift estimated using the likelihood equation (4).

Calculate and for each value of

*t*in the assay. Round values of to the nearest tenth of a minute (for values < 1 min), the nearest quarter minute for values between 1 min and 5 min, and the nearest minute for values between 5 min and 100 min. Store.Repeat 1 through 3 for where we set .

For each value of

*t*, convert the observed joint values of to proportions.

The above algorithm generates the empirical distribution . We note that the distribution for conditional on and is normally distributed with mean and variance . Denote this distribution by(32)then(33)and(34)where(35)Using Equations 33 and 34 we can calculate the expected values of and for use in the EM algorithm.

## Footnotes

Supplemental Material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.115.024075/-/DC1

*Communicating Editor: P. C. Phillips*

- Received October 24, 2015.
- Accepted February 2, 2016.

- Copyright © 2016 Baker
*et al.*

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.