Abstract
Predicting phenotypes based on genotypes and understanding the effects of complex multi-locus traits on plant performance requires a description of the underlying developmental processes, growth trajectories, and their genomic architecture. Using data from Brassica rapa genotypes grown in multiple density settings and seasons, we applied a hierarchical Bayesian Function-Valued Trait (FVT) approach to fit logistic growth curves to leaf phenotypic data (length and width) and characterize leaf development. We found evidence of genetic variation in phenotypic plasticity of rate and duration of leaf growth to growing season. In contrast, the magnitude of the plastic response for maximum leaf size was relatively small, suggesting that growth dynamics vs. final leaf sizes have distinct patterns of environmental sensitivity. Consistent with patterns of phenotypic plasticity, several QTL-by-year interactions were significant for parameters describing leaf growth rates and durations but not leaf size. In comparison to frequentist approaches for estimating leaf FVT, Bayesian trait estimation resulted in more mapped QTL that tended to have greater average LOD scores and to explain a greater proportion of trait variance. We then constructed QTL-based predictive models for leaf growth rate and final size using data from one treatment (uncrowded plants in one growing season). Models successfully predicted non-linear developmental phenotypes for genotypes not used in model construction and, due to a lack of QTL-by-treatment interactions, predicted phenotypes across sites differing in plant density.
- Function-Valued Traits
- quantitative genetics
- Brassica rapa
- leaf development
- high-throughput phenotyping, phenotypic plasticity, Bayesian vs. frequentist, genotype to phenotype
Expressed phenotypes reflect the independent and combined effects of genetic and environmental inputs over time. However, understanding the relationship between genotype, the environment, and phenotype can be complicated. For example, phenotypic plasticity generated via genotype-by-environment interactions can alter the course of development, allowing a single genotype to exhibit multiple distinct phenotypes (Bradshaw 1965; Schlichting and Pigliucci 1998; Diggle 2002). Alternatively, phenotypic traits may be buffered against environmental effects, a phenomenon referred to as canalization (Waddington 1954; Rice 1998; Debat and David 2001; Hallet al. 2007). Predicting phenotypes based on genotypes across multiple environments is therefore complicated by differential environmental sensitivity, yet is critical for understanding and predicting crop yields and evolutionary outcomes (Yang and Zhu 2005; van Eeuwijket al. 2010; Oberet al. 2012; Crossaet al. 2013; Zhanget al. 2015).
Understanding the agronomic and evolutionary performance of complex traits requires a description of the developmental processes and growth trajectories that generate these phenotypes (Kell 2002; Onogiet al. 2016). An ontogenetic description is necessary, in part because selection does not act only on final phenotypes. Instead, selection acts continuously on phenotypes throughout organismal ontogeny as well as directly on growth rates themselves (Prusinkiewiczet al. 2007; Bakeret al. 2015). Furthermore, in comparison to single time point measurements when growth has ceased, characterizing the processes that produce a final phenotype leads to a more complete description of the genomic architecture underlying a given trait (Bakeret al. 2015). Analysis of developmental traits is, however, complicated by the fact that these processes are continuous while the data collected are typically discrete. Characterizing a developmental process necessitates a large number of measurements, which, among other statistical effects, reduces the power of significance testing under multivariate analyses (Griswoldet al. 2008). Function-Valued Trait (FVT) estimation fits mathematical functions to measurements collected over time, and thereby reduces the number of independent traits. Under one variant of FVT modeling, parameters that characterize the function are estimated and then analyzed as phenotypic trait data (Jaffrézic and Pletcher 2000; Kingsolveret al. 2001; Stinchcombe and Kirkpatrick 2012; Bakeret al. 2018). While there are advantages and limitations to each of several approaches, one advantage to the “parameters as traits” approach is that different parameters have ontogenetically interpretable features. For functions characterizing leaf growth, for example, one can distinguish between the slope (rate of growth), growth duration, and asymptote (final size), and examine the potentially unique genetic architectures, inter-relationships, and environmental sensitivities of these traits.
In additional to the external environment, developmental phenotypes and growth processes can be influenced by a plant’s internal physiological status (Hammeret al. 2005). For example, manipulating photosynthetic efficiency and carbon availability via transgenesis dramatically affects leaf morphology (Raines and Paul 2006; Schneidereitet al. 2006). Recent advances in FVT modeling use hierarchical Bayesian approaches to incorporate genotype-specific cofactors such as maximum photosynthetic capacity (Amax) and thereby factor out genotypic variation in carbon pools available during development and growth (Bakeret al. 2018). Residual genetic variation that remains after statistically factoring out Amax should reflect factors that more directly affect leaf expansion independent of genotypic-specific carbon resources (Bakeret al. 2018). Further, FVT modeling in a Bayesian framework leads to increased estimates of trait heritability compared to frequentist approaches, because the former takes into account the underlying structure of the data and provides for improved error handling (Bakeret al. 2018). Whether Bayesian approaches to FVT estimation also afford concomitant improvements in QTL mapping remains an open question.
Quantitative-genetic analyses of both FVT and other complex traits require high genotypic replication. The challenge of phenotyping throughput of numerous genotypes can be addressed for some traits by remote sensing approaches (Weberet al. 2012). In Brassica rapa, spectroradiometric indices are genetically correlated with leaf-level physiological measurements and aspects of final leaf morphology, such that genotypes with high values of, for instance, the MERIS Terrestrial Chlorophyll Index (Dash and Curran 2004) have high values of Amax and large leaf sizes (Bakeret al. 2018). Whether QTL can be identified that jointly affect spectroradiometric indices as well as physiological traits or leaf FVT remains unexplored. Such QTL could allow high-throughput remote sensing to be used as a proxy for more time-consuming direct measurements of leaf gas-exchange and morphological development, and these QTL could be used in molecular plant breeding programs (Yinet al. 2000). Notably, simulation studies demonstrate that accounting for genotypic variation in gas-exchange can accelerate plant breeding (Hammeret al. 2005).
Data collected over ontogeny and in multiple environments may contribute to realistic models of population dynamics and of crop performance (Chenuet al. 2009). More generally, predicting phenotypes based on genotypes is a focal area of current genomics research, in part because it forms the basis of highly efficient genomic selection strategies for crop improvement (Janninket al. 2010; Bustos-Kortset al. 2016). However, few studies have predicted ontogenetic growth trajectories based on genotypic information (for an exception, see Reymondet al. 2003). Here, we report on improved characterization of developmental and growth processes and their environmental dependencies, which are relevant for both evolutionary ecologists and agronomists. Specifically, we expand upon the data and analyses presented in Baker et al. (2015) by applying new Bayesian FVT estimation routines (developed in Bakeret al. 2018) to existing leaf data, leaf phenotypic data collected in a separate season, and high-throughput spectroradiometric data. We combine ontogenetic and physiological data in hierarchical Bayesian models to estimate leaf FVT in the annual species, Brassica rapa. Using these FVT estimates, we test for phenotypic plasticity expressed in response to season and crowding and ask whether the magnitude of plasticity differs among FVT parameters (e.g., growth rates vs. durations). We then map QTL to assess FVT genomic architecture including QTL-by-environment interactions that underlie plastic responses. We test if Bayesian FVT estimation approaches improve QTL identification compared to frequentist approaches. Finally, we test whether QTL-based (genotypic) models can predict complex, multi-locus, non-linear patterns of leaf developmental phenotypes in B. rapa.
Materials And Methods
Species description
Brassica rapa (Brassicaceae) is an annual to biennial herbaceous crop. We utilize Recombinant Inbred Lines (RILs) generated by crossing the R500 × IMB211 genotypes. R500 is a late-flowering yellow sarson oil seed with relatively large, broad leaves, while IMB211 is derived from a Wisconsin Fast Plant (WFP) and selected for rapid cycling, flowers early, and has relatively small, lanceolate leaves (Bakeret al. 2015). All RILs are expected to be >99% homozygous (as described in Hinata and Prakash 1984; Brock and Weinig 2007; Iniguez-Luyet al. 2009). This experiment includes 119 RILs and genotypes representative of the R500 and IMB211 parents.
Experimental Design and Data Collection
Plant growth:
To test for plastic responses to inter-annual microclimatic variation (year) and crowding, we grew plants during three growing seasons and in two density treatments. Plant growth and experimental design follow protocols described in Baker et al. (2015). Briefly, in 2010, 2011, and 2012, the IMB211 × R500 RILs were germinated in the greenhouse in pots filled with fertilized field soil, and then transplanted into the field at the University of Wyoming Agricultural Experiment Station. The crowded (CR) treatment consisted of 5 plants of the same genotype per 4” peat pot with the central plant designated as a focal individual on which measurements were collected. The uncrowded (UN) treatment consisted of a single plant per pot. Each block consisted of one replicate of each RIL and each parental genotype. Locations of replicates within each block were randomized, and each block was randomly assigned to a treatment (CR or UN). Plants were transplanted into the field with 25 cm between each focal plant. In 2010, 12 UN and 12 CR blocks were transplanted into the field. In 2011, 6 UN and 6 CR blocks were transplanted into the field and in 2012 8 CR and 8 UN blocks were transplanted. All plants were watered to field capacity, and pesticides were applied as needed. Temperature data from 2011 and 2012 were collected (File S1) and used to generate degree days (DD, hereafter) assuming genotypes share the same B. rapa-specific base value of 0.96° as described in Baker et al. (2015) and Vigil et al. (1997).
Physiological data:
Photosynthetic capacity (Amax; μmol m−2 sec−1) was collected from UN and CR plants in 2010 as described in Edwards et al. (2012) genotypic means were calculated as described in and Baker et al. (2015) and below.
Spectroradiometric data:
Spectral reflectance measurements were collected in 2010 and 2011 as described in Baker et al. (2018). The spectroradiometer collected wavelengths from 350 to 2500 nm with a 1.4 nm sampling interval for the visible/NIR spectral region (350 nm – 1000 nm) and 2 nm for the short-wave infrared spectral region (1000 nm – 2500 nm) (Anet al. 2015); when interpolated, there is one spectral data point per nanometer, for a total of 2151 spectral data points per measurement. For each replicate plant, 10 measurements were collected and averaged to produce a single measurement (Anet al. 2015).
Raw spectroradiometric data were used to calculate common remote sensing vegetation indices. These include: mcari1 and mcari2 (Modified Chlorophyll Absorption Ratio Index), mtci (Transformed Chlorophyll Absorption Ratio Index) (Haboudaneet al. 2004), sipi2 (Structure Insensitive Pigment Index) (Haboudaneet al. 2002), tcari (Transformed Chlorophyll Absorption Ratio Index) (Peñuelaset al. 1995), ari1 and ari2 (Anthocyanin Reflectance Index) (Haboudaneet al. 2002), cri (Carotenoid Reflectance Index) (Gitelsonet al. 2001), npci (Normalized Pigment Chlorophyll Index)(Gitelsonet al. 2002), pri2 (photochemical reflectance) (Peñuelaset al. 1994), psri (Plant Senescence Reflectance Index) (Garbulskyet al. 2011) and wi (water index) (Merzlyaket al. 1999). We binned raw spectroradiometric data and calculated the reflected red to far red (R:FR) ratio for each plant as (655nm-665nm)/(725nm-735nm) (Penuelaset al. 1997).
Morphological data:
Leaf lengths and widths (LL and LW, respectively) were recorded on the second epicotylar leaf for all plants in 2011 and 2012. Data collection started at leaf emergence and was conducted 2-3 times per week as described in Baker et al. (2015).
Data analysis
Function-Valued Trait (FVT) Modeling:
FVT modeling for trait estimation used Bayesian approaches that fit logistic growth curves to longitudinal LL and LW data (Equations 1 & 2, respectively) as described in Baker et al. (2018). LW and LL for each individual replicate plant is represented by a minimum of 5 and maximum of 16 sequential measurements. Briefly, we utilized a three-level hierarchical Bayesian model that retains the measurement data structure to account for information across all plants and genetic lines within the population, (including replicate plants within each line) and a global mean parameter.Leaf length and width were modeled independently across treatments and growing seasons. The Function-Valued Trait (FVT) parameters Lmax and r estimate the Leaf maximum size (in mm) and rate of growth, (mm/DD) respectively. Two parameters estimated the duration of growth. The first, d, was calculated as the duration (in Degree Days, DD) of time between germination and 95% of leaf growth. The second, iD, was algebraically extracted from the growth curve and describes when the growth curves reached their inflection point in Degree Days and transitioned from exponentially accelerating to decelerating growth rates.
The hierarchical Bayesian model was implemented using the Bayesian Statistical Modeling Python module PyMC and the model parameters were estimated via MCMC using the Metropolis-Hastings algorithm (Chib and Greenberg 1995; Patil et al. 2010). The MCMC estimations were performed using a single chain to sample 500,000 iterations, which includes the first discarded 440,000 burn-in iterations; the remaining 60,000 iterations were retained. By thinning to 1 iteration in 20, the retained iterations were reduced to 3,000 samples for every FVT parameter from which the posterior distributions were tabulated. All parameters’ trace and auto-correlation plots were examined to ensure that the MCMC chain had adequate mixing and had reached convergence (Bakeret al. 2018). All observed data for each genotype were plotted with two 95% credible interval envelopes. The inner, yellow envelope represents the credible intervals for the model based on the observed data. The outer, green envelopes are computed from the posterior distributions of the model parameter values. The green envelopes in effect correspond to the 95% credible intervals within which one expects any future observations to fall if a new experiment were performed using the same genotype and environment (Kruschke 2015; SAS Institute 2017).
Inclusion of Photosynthetic Capacity as a co-factor:
To account for potential carbon limitation caused by differences in photosynthetic capacity, we included the genotype-specific genotypic means for photosynthetic capacity (Amax) as a co-factor in the prior distributions of the individual plant effects (Bakeret al. 2018).
Phenotypic plasticity:
Prior to all analyses, phenotypic data were subjected to outlier analyses. Any observations more than three standard deviations from the mean were omitted. Visual inspection of histograms and quantile-quantile plots indicated that outlier exclusion yielded distributions closer to the normal distribution and never resulted in increased departure from the normal distribution (Bakeret al. 2017; Bakeret al. 2018). To detect environmental factors that might affect the correspondence between genotype and phenotype, we analyzed phenotypic datasets (File S2) from all years and both density treatments and tested for the main effects of genotype, treatment, and year and all possible interactions using the lme4 and lmerTest packages in the R statistical environment (Bates et al. 2014; Kuznetsovaet al. 2015; R core Team 2015). In these tests, all effects were considered random and block was nested within a year × treatment interaction. Significant main effects of environment (either year or treatment) were considered evidence of phenotypic plasticity, and interactions of treatment × genotype, year × genotype, or the three-way interaction of treatment × year × genotype were considered evidence for genetic variation in phenotypic plasticity.
Best Linear Unbiased Predictions (BLUPs):
Genotypic means were estimated by calculating BLUPs for all leaf FVT and spectral reflectance traits. BLUPs were calculated independently for each year and for UN and CR treatments in R using the lmer function in the lme4 package while controlling for block effects (Bates et al. 2014; Kuznetsovaet al. 2015). The random effects of block (nested within a year × treatment interaction) and genotype, treatment and year (and their two- and three-way interactions) were assessed using a series of chi-square tests that compare models with and without the given random effect (rand function in lmerTest).
QTL mapping:
QTL analyses were performed in R/qtl (Broman and Sen 2009) based on an updated and highly resolved RNA-seq based SNP map with an average distance of 0.7 cM between informative markers (Bakeret al. 2015; Markelzet al. 2017). The scanone function was used to perform an initial round of interval mapping (1cM resolution with estimated genotyping errors of 0.001 using Haley Knott regression) to identify additive QTL. QTL model space was searched using an iterative process (fitqtl, refineqtl, and addqtl) to identify additional QTL while taking into account the effects of QTL identified by scanone and addqtl. All significance thresholds (0.95) were obtained using 10,000 scanone permutations (Bromanet al. 2003; Broman and Sen 2009). QTL and their 1.5LOD confidence intervals are displayed using MapChart2.0 (Voorrips 2002).
We hypothesized that Bayesian methods might better estimate trait values in comparison to frequentist trait estimation, thereby improving phenotype-genotype associations estimated under QTL mapping. To compare QTL identified for Bayesian FVT model parameters with previous frequentist (least squares, LS) FVT model parameters (Baker et al., 2015), we remapped the previously published LS parameters using the updated B. rapa genetic map and the same protocol as described above for the Bayesian parameters. Percent variance explained (PVE) is calculated as PVE = 100 × (1 - 10^(-2 LOD/ n)), where n is the number of genotypes.
QTL-by-environment interactions:
The effects of environment and year on QTL were assessed using a series of linear regression models and two-way ANOVAs in the car package for R (Fox and Weisberg 2011). P-values <0.05 for type III F-values of treatment × QTL maker, year × QTL maker, or the three-way interaction of treatment × year × QTL marker interactions were considered evidence of QTL by environment interactions. To avoid testing the same QTL multiple times, when QTL for the same trait across different treatments and years colocalized (as defined by overlapping 1.5 LOD confidence intervals) and expressed the same direction of effect, only one QTL × environment interaction is reported.
Predictive analyses:
For the purposes of predictive modeling, we chose uncrowded leaf widths (UNLW) from 2012 because we detected the most QTL for this trait. To predict growth phenotypes based on genotypes, we used a resampling approach. For each sample, we randomly excluded 5 genotypes (without replacement). We re-evaluated our Bayesian FVT models, extracted trait parameters, and calculated trait BLUPs as described above. We re-mapped QTL using the same methods as above, except we set the significance threshold for QTL detection to 0.90. Including QTL of marginal statistical significance but with potentially biologically meaningful effects increased the precision with which we could perform phenotypic predictions. We extracted the effect size and direction of each QTL identified for Lmax and r using the effectplot function in R/qtl and constructed predictive models for Lmax and r (Equation 3 and 4, respectively) where i indicates the genotype in question, and
are the population means for Lmax and r, respectively, which are added to the sum of the products of each QTL direction and effect size. We estimated values for r and Lmax parameter for each of the 5 randomly excluded genotypes based strictly on genotypic information and independent of any phenotypic information. We repeated this process 20 times, each time randomly selecting a different set of RILs to predict phenotypes based on genotypes (and to exclude from Bayesian modeling, BLUP estimation, and QTL mapping procedures). This approach allowed us to predict phenotypes based on genotypes for ∼100 different genotypes. Analyses departed slightly from n = 100 because we dropped genotypes when, for example, randomly selected genotypes represented parents of the RIL population and could not be used for mapping or had no genotypic information.
We evaluated the success of our predictive models across environments in two ways. First, we examined the correlations between predicted (based on 2012 UN data) and observed r and Lmax (in 2012 UN and in 2012 CR). Second, we used predicted values for r and Lmax (along with a constant, L0, which estimated the initial value for leaf width) to predict logistic growth curves describing the increase in leaf width over time for each genotype. We plotted predicted growth curves (red line) in conjunction with measured phenotypic data (colored circles) for all replicate plants per genotype. We applied our Bayesian models to the observed phenotypic data (excluding the 5 predicted genotypes) and estimated growth curves (black lines) and credible intervals. We visually compared predicted growth curves to credible intervals surrounding observed phenotypes. When comparing predicted to observed growth curves for 2012 uncrowded data, we considered predictions successful if they fell within the 95% Bayesian credible regions for the model (yellow envelopes, Figure 2) and marginally successful if they fell within the 95% credible limits for future observation (green envelopes, Figure 2). When projecting across treatments (from uncrowded to crowded environments), we considered predictions successful if they fell within the green envelopes. We scored the proportion of successful or unsuccessful predictions for r and Lmax within each sample and assessed whether the proportion of successful predictions was greater than that expected by chance (0.50) using a Z-statistic. Replicate level phenotypic FVT data used in all analyses are available in File S2 and upon request.
Data Availability
Data and code used in these analyses are previously published and can be found in the relevant citations or are included in the supplemental files of this publication including environmental data from the experimental field in 2010, 2011, and 2012 (Online Resource File S1) and replicate level phenotypic data used for analyses (Online Resource File S2).
Results
Phenotypic plasticity
We partitioned variation among genetic and environmental sources. There were significant main effects of block (nested within the treatment-by-year interaction) for all model parameters, indicating plasticity to unmeasured microenvironmental variation. The main effect of genotype was significant for all leaf parameters other than LL_r and LW_r, for which genetic variance was subsumed into environmental interaction terms. We found no evidence of phenotypic plasticity to the crowding treatment for any leaf trait (Table 1). By contrast, there were significant main effects of year on all leaf FVT parameters except leaf length duration of growth (LL_d), leaf width maximum size (LW_Lmax), and duration of growth (LW_d). We also observed evidence for genetic variation in the expression of phenotypic plasticity among years (genotype × year interactions) for all leaf FVT traits (Table 1). Interestingly, for leaf FVT estimating growth rates and durations (d, iD, r) the interaction term was always several fold larger in magnitude than the main effect of genotype as estimated by the test statistic value. In contrast, the main effect of genotype for FVT estimating final leaf size (Lmax) was ∼15-20 fold larger in magnitude than the corresponding genotype × year interaction term (Table 1). The three-way density treatment × year × genotype showed a similar pattern of strong significance for iD, d and r parameters and either non-significance for LW_Lmax or low magnitude in comparison to the main effect of genotype for LL_Lmax (Table 1). Notably, the patterns of plasticity were similar for LL and LW parameters, e.g., the effect of genotype for both LL_r and LW_r is only significant in the two environmental interactions and iD and d both show similarly large effects of the two and three-way interactions involving year. In sum, the results suggest that the trajectory of leaf growth (estimated from iD, d and r) may be modulated by inter-annual environmental variation, but that the final size of leaves (estimated from Lmax) is less so.
We detected significant genetic variation among genotypes for all spectroradiometric indices except ari2 (which was marginally significant, P < 0.01) and cri. We found no evidence for phenotypic plasticity for population mean trait values for any of the indices or R:FR ratio, because the main effects of treatment and year were always non-significant (Table 2). However, we found evidence for genetic variation in phenotypic plasticity across density treatments for mcari, sipi2, tcari, and psri as well as across years for npci. There were no significant three-way interactions of genotype × year × treatment.
Comparison of QTL mapping using Frequentist vs. Bayesian trait estimation
To assess the effectiveness of Frequentist (least square or LS) vs. Bayesian procedures in estimating FVT, we mapped 2012 FVT parameters using genotypic values arising from LS (Bakeret al. 2015) and Bayesian FVT estimation procedures, the same map, and the same mapping procedures as previously described. Bayesian FVT estimation outperformed LS FVT modeling for all metrics considered. Bayesian trait estimation yielded 33 QTL compared to 26 identified via LS approaches. On average, Bayesian QTL tended to have greater LOD scores than QTL mapped for LS-estimated traits (4.53 vs. 4.25, respectively; P = 0.065) and tended to explained a greater percent variance (16.0% vs. 15.8%, respectively; P = 0.067) based on a Wilcox rank-order tests. Bayesian and LS FVT modeling often identified the same QTL, based on overlapping confidence intervals (14 QTL). However, each method also identified unique QTL (with non-overlapping confidence intervals). Bayesian modeling yielded 19 unique QTL (File S3).
Additive QTL:
Having compared the mapping results under the two trait estimation procedures, we proceeded with analyses of QTL based on Bayesian trait estimation. We detected 96 additive QTL across seven of the ten B. rapa chromosomes with no QTL detected on chromosomes four, five and eight (Figure 1 and File S3). However, because many of the 1.5 LOD support limits for individual QTL overlap, an alternative interpretation is that we identified as few as 14 highly pleiotropic QTL. Individual QTL had LOD scores of between 3.12 and 10 representing between 8.07 (UNLL_iD11) and 32.1% (UNLW_d12) of the genotypic variation. Consistent with greater genetic variances for 2011 compared to 2012 leaf FVT traits (Table 1), we detected more QTL for the 2012 than 2011 field seasons (41 vs. 15).
RNA-seq based linkage map of Brassica rapa with QTL and their 1.5 LOD confidence intervals. The 1482 SNP-based markers are relatively evenly dispersed across each of the ten chromosomes. Gaps in markers are indications of low expression associated with centromeres (or ancient centromeres). Positive QTL (with respect to IMB211) are red and negative QTL are blue. Overlapping 1.5 LOD confidence intervals are interpreted as evidence for colocalization of QTL. UN, uncrowded and CR, crowded treatments; LW, leaf width; LL, leaf length; Lmax, maximum estimated leaf size (mm); r, growth rate; d, duration of growth; iD, timing of the switch between accelerating and decelerating growth; 10, 11, and 12 represent QTL identified in 2010, 2011, and 2012 field seasons, respectively. Specific cM positions, percent variance explained, and genomic markers can be found in File S3.
The genetic architecture underlying LL and LW FVT is similar as demonstrated by colocalization of many LL and LW FVT QTL. For instance, QTL jointly affect Lmax for leaf length and width (bottom of chromosome 1, middle of chromosome 2, middle of chromosome 6) as well as d or iD for leaf length and width (middle of chromosome 2, middle of chromosome 3, and top and bottom of chromosome 10; Figure 1 and File S3). QTL frequently colocalized across density treatments, that is, a QTL affecting an LL parameter in the CR treatment tended to also have a significant additive effect in the UN treatment (e.g., for Lmax UN and CR QTL colocalize in the middle of chromosomes 3 and 6, bottom of chromosome 7 and the top of chromosome 10; Figure 1 and File S3).
QTL for spectroradiometric indices often formed clusters within the genome (e.g., QTL on the top of chromosomes 1 and middle of chromosome 9). QTL for spectroradiometric data also colocalize with Lmax FVT QTL (e.g., UNLW_Lmax12, CRLW_Lmax12, UNLW_Lmax11, and UNLL_Lmax11 with mtci, pri1, sipi2, psri, pri2, wi, and ari1 on chromosome 1). Spectroradiometric indices, however, rarely colocalize with FVT parameters describing leaf growth rates or duration (for an exception, see mtci, pri1, ari1, RFR, and wi on chromosome 3, which co-localize with multiple estimates of duration; Figure 1).
QTL-by-environment interactions
Consistent with the limited genotype × treatment interactions, we found no significant QTL × density treatment effects for FVT parameters describing leaf development. An additional field season worth of data allowed us to analyze QTL × year and QTL × treatment × year interactions for leaf FVT parameters. Consistent with the highly significant genotype × year interactions (Table 1), multiple QTL underlying leaf growth rates (r), durations (d), and inflection sizes (iD) exhibited year × QTL interactions (Table 3). By contrast, FVT estimating final leaf sizes (Lmax) did not exhibit any significant QTL × year interactions (Table 3).
For several spectroradiometric QTL, significant QTL × density treatment (2011 only) and QTL × year (uncrowded treatment only) effects were detected (Table 3). Three-way interactions for spectroradiometric traits and environmental variables were not tested because spectroradiometric data were only collected on both treatments in one year.
Predictive modeling
We predicted leaf width growth dynamics (r and Lmax) for randomly chosen genotypes grown in the uncrowded treatment in the 2012 field season. We evaluated our predictions using two methods. First, we compared our predictions to genotypic means estimated from the corresponding phenotypic values (BLUPs) using correlation analyses. Values of r and Lmax predicted from QTL-based genotypic models were significantly genetically correlated with observed trait means for UN 2012 LW_r (r = 0.26, df = 94, P = 0.01) and UN 2012 LW_Lmax (r = 0.61, df = 94, P = 3.9e-11). The relatively low correlation value for LW_r likely reflects the fact that we detected fewer QTL for LW_r and therefore our predicted values for LW_r were less precise than those for Lmax. Second, we asked whether our estimated predictions fell within the credible envelopes surrounding replicate-level phenotypic data. Predicted LW_r and LW_Lmax fell within the yellow credible envelope 87% and 85% of the time, respectively, and these predictions were considered successful (Figure 2 & File S4). For both r and Lmax, our success rates for prediction were significantly greater than chance (i.e., 50%; Z = 3.29, P = 5e-04). Predicted values of r and Lmax fell within the green credible envelope for predicted future phenotypes for a given genotype grown in the same conditions (Figure 2) 97% and 100% of the time, respectively, and these were considered marginally successful.
Predicted and measured phenotypes for genotype 206 in (A) Uncrowded and (B) Crowded treatment during the 2012 growing season. Colored circles represent measured phenotypic data from multiple individuals of genotype 206. The black line is a logistic growth curve fitted to the genotypic mean using Bayesian routines. Any new observation for an individual from genotype 206 grown in the same conditions is predicted to fall within the green 95% credible envelope. Our predicted phenotype (based on QTL data and incorporated into the logistic functions from Equation 3 and 4; red growth curve) falls within the yellow 95% credible envelope for the fitted logistic growth curve.
We asked whether our models based on UN QTL could predict phenotypes expressed across density environments, given the similarity in genetic architecture between the UN and CR treatments (lack of QTL × E in 2012; Table 3). Predicted values of LW_r were uncorrelated with CR observed trait means of r (r = 0.16, df = 94, P = 0.1205), but predicted values of LW_Lmax were significantly correlated with observed CR Lmax means (r = 0.58, df = 94, P = 4.19e-10). We considered predictions across the density treatments successful based on less stringent criteria than within a single environment. Specifically, we used the wider green credible intervals. When evaluating our predictions of r and Lmax based on credible envelopes, our predictions were successful 100% and 98% of the time, respectively. Our successful prediction rates were significantly greater than chance (0.50) for LW_r (Z = 4.9, P < 0.0003) and LW_Lmax (Z = 2.5, P = 0.0052). The high rate of successful predictions despite lack of genetic correlation for LW_r is likely because the credible envelopes are generated from distributions based on observed phenotypic data whereas the correlations are based solely on means of predicted values vs. observed phenotypes.
Discussion
Predicting phenotypes based on genotypes is a central objective of evolutionary developmental biology and a key component of breeding programs (Moczeket al. 2015). However, predictive models require understanding the genetic architecture of phenotypes throughout ontogeny because the trait of interest may be influenced by different combinations of genetic and environmental factors at different times during growth and development (Benjamini and Hochberg 1995; Benjamini and Yekutieli 2001). Quantifying growth trajectories is also important because selection may act throughout ontogeny rather than on final phenotypes alone (Dobzhansky 1956; Moose and Mumm 2008; Bakeret al. 2015). Final phenotypes may also be contingent upon prior environments and developmental events (Diggle 1997; Weinig and Delph 2001; Dechaineet al. 2014). To predict growth phenotypes, we explored the genetic basis and environmental dependencies (plasticity) of non-linear leaf length and width growth trajectories utilizing mathematical modeling to fit logistic growth curves to ontogenetic data in a population of Brassica rapa RILs. We evaluated the effectiveness of Bayesian vs. frequentist approaches to Function-Valued Trait (FVT) estimation by mapping QTL for FVT parameters. Our “parameters as data” approach to FVT modeling allowed us to detect differences in the genetic architecture and environmental sensitivities among growth parameters (i.g. rates vs. durations vs. final sizes; Jaffrézic and Pletcher 2000; Kingsolveret al. 2001; Wu and Lin 2006; Stinchcombeet al. 2010; Bakeret al. 2015; Jianget al. 2015; Zhanget al. 2017). Notably, although growth parameters (r, Lmax, d, iD) showed different genetic architectures, the genetic basis of any single parameter was very similar between leaf length and width. Finally, we built and tested QTL-based models for predicting growth phenotypes from genotypic data.
Phenotypic plasticity of leaf FVT
Environmental inputs can have profound influences on phenotypic expression. Understanding how genotypes plastically respond to the environment over the course of development is a first step in developing cultivars that are optimally suited to their growth conditions (Kang 1997). In our study, spectroradiometric indices did not often express phenotypic plasticity in response to either density or interannual microclimatic variation (i.e., year; Table 3). For leaf FVT, the plastic responses to interannual microclimatic variation were much more pronounced than responses to density. The main effect of crowding was never significant for any leaf FVT, while plasticity to year was often significant. Additionally, we identified genetic variation for phenotypic plasticity (that is, changes in rank-order genotypic values and variances) among all leaf FVT (Tables 1), indicating that different genotypes respond to inter-annual variation in different ways. However, the magnitude of the genotype-by-environment response was several times larger for growth parameters (r, d, and iD) compared to final size (Lmax), indicating that ontogenetic aspects of leaf size (for both length and width) are more plastic than final sizes (Table 1).
Bayesian vs. frequentists Function Valued Trait approaches
To investigate the genomic architecture underlying leaf FVT across environments, we mapped QTL for FVT and asked whether Bayesian approaches to FVT modeling and trait estimation resulted in improved mapping compared to frequentist (least squares) approaches (Baker et al., 2015). When comparing mapping results from Bayesian and frequentist approaches for 2012 data, Bayesian routines yielded higher estimates of FVT trait heritability (Bakeret al. 2018). In side-by-side comparisons in the current study, parameters extracted from Bayesian FVT models mapped a greater number of QTL than those estimated from frequentist FVT models (33 vs. 26), had marginally higher LOD scores (P = 0.065) and tended to explain a larger proportion of phenotypic variance (P = 0.067). We attribute these improvements in QTL detection and resolution to the fact that hierarchical Bayesian models extract global information more efficiently from the data and reduce error propagation compared to frequentist approaches (Charmet 2000; Bakeret al. 2018) thereby improving trait estimation and phenotype-genotype association tests that are the basis of QTL mapping.
Genetic architecture of spectroradiometric indices and leaf length and width FVT
Few studies have compared the genetic architecture of spectroradiometric reflectance patterns, developmental dynamics, and final phenotypes. Our QTL analysis of spectroradiometric indices and a R:FR ratio indicates that these indices are largely genetically distinct from FVT parameters of leaf development. When QTL for spectroradiometric indices do colocalize with QTL for FVT parameters, these tend to be QTL for final sizes rather than parameters that describe growth dynamics (e.g., chromosome 1 in Figure 1), indicating that single time point spectral indices may serve as suitable proxies for final leaf sizes, but not growth dynamics. With regard to leaf length and width FVT, functional annotation of individual genes have identified loci that specifically affect leaf length and have little to no effect on leaf width (Tsukaya 2005). However, quantitative genetic studies often find that LL and LW are tightly genetically correlated and may be pleiotropically regulated (Fujitaet al. 2009; Bakeret al. 2015; Drostet al. 2015). Previously, we found that FVT parameters for leaf length and leaf width (e.g., LW_r and LL_r; LW_d and LL_d) were highly genetically correlated within an environment and showed similar plasticities across environments. In addition to correlations of the same FVT parameter between LL and LW, we found that different FVT parameters were moderately correlated (Bakeret al. 2015; Bakeret al. 2018). Estimates of final sizes (Lmax) were genetically correlated with aspects of growth such as rates (r) and duration (d); however, these correlations were not perfect (Baker et al., 2015). Here, we found that QTL for Bayesian FVT parameters describing r and Lmax rarely colocalized (see LW_r and LW_Lmax from 2012 on chromosome 3 (UN) and 2 (CR) for exceptions; Figure 1). Our results confirm previous studies of the genetic architecture of leaf growth dynamics vs. final size (Baker et al., 2015) and support the general hypothesis that leaf growth and final size have different genetic architectures. The independent genetic control of final size and growth dynamics implies that these may be independently selected upon to increase a plant’s space occupancy or light harvesting ability (reviewed in Wuet al. 2004; Wu and Lin 2006).
Genetic and environmental interactions for leaf FVT
Within a single organ, environmental sensitivities may differ among traits and across developmental time (Wright and McConnaughay 2002; Sugiyama and Gotoh 2010). Robust QTL-by-environment detection is dependent upon large sample sizes and multiple environments. In our RIL population of over 100 lines, we observed QTL-by-environment interactions that corresponded with genotype-by-environment interactions for leaf FVT (Compare Tables 1 and 3). Specifically, we found no QTL-by-density treatment interactions. Two traits (LW_iD and LW_r) had significant QTL-by-year interactions, and these QTL-by-environment interactions likely contribute to the population-level phenotypic plasticity exhibited by LW_iD and LW_r in response to interannual environmental variation (year). We also observed significant QTL-by-year interactions for QTL underlying traits that did not exhibit phenotypic plasticity as assessed by the main effect of year. For example, QTL-by-year interactions for growth duration (d) likely underlie rank order differences in genotypic means and changes in trait variances across growing seasons (compare Tables 1 and 3). We found strong evidence for QTL-by-environment interactions for aspects of growth dynamics (rates, durations, and inflection points), but not for final leaf sizes. Notably, the differing extent of QTL-by-environment interactions among FVT parameters further supports the hypothesis that leaf growth dynamics and final size have independent genetic underpinnings.
Predicting phenotypes based on genotypes (QTL)
Reliably predicting phenotypes is important to understanding evolutionary dynamics, selecting seed sources for ecological restoration, and designing molecular breeding programs. Predictions of fitness and yield may be improved by incorporating information about plant growth (Assefaet al. 2015) and physiology (Amelonget al. 2015). However, most models constructed to predict phenotypes based on genomic data focus on single-time-points such as seedling traits (Cooperet al. 2005; Xuet al. 2011), disease resistance (Paceet al. 2015), or components of yield (Yinet al. 2000; Baoet al. 2015). Models that do incorporate ontogeny tend to treat sequential time points independently (e.g., Reymondet al. 2003) rather than incorporating interdependence of ontogenetic data via FVT models. Further, FVT modeling can describe non-linear growth curves that capture the potentially independent aspects of exponential and asymptotic growth phases. We developed models to predict complete growth curves based on QTL underlying FVT parameters (r and Lmax; Equations 6 & 7). In the same (uncrowded) environment, FVT parameters predicted based on genotypic information were significantly correlated with independently observed FVT phenotypes and components of our predicted logistic growth curves fell within 95% credible intervals for the actual phenotypic data 85% of the time. These success rates are significantly greater than chance and equivalent to predictive models for grain yield that integrate crop growth models and whole genome information (Technowet al. 2015).
We also asked whether our models could provide accurate predictions across environments. Consistent with the lack of significant main effect of density treatment and QTL-by-density interactions, our models successfully predicted phenotypes in 2012 crowded conditions 98-100% of the time. As for other complex traits, future work on predictive modeling for multi-locus non-linear growth traits should focus on expanding beyond additive models to include interactive effects such as epistasis and to explicitly incorporate environmental parameters to account for phenotypic plasticity and gene by environment interactions (Yinet al. 2004; Tardieuet al. 2005; Chenuet al. 2009; Bustos-Kortset al. 2016).
Conclusions
A major goal in evolutionary biology and crop science is understanding the connections between genotypes and phenotypes. Our Bayesian FVT trait estimation approach to leaf growth isolates leaf developmental genetic programs by factoring out endogenous influences on growth and development such as genotype-specific photosynthetic capacities. We detect differences in the environmental sensitivity among traits: the strength of the plastic response was much stronger in growth traits compared to final sizes. We also detect distinct genomic architectures underlying different components of leaf growth curves. The Bayesian approach to FVT modeling provides superior QTL detection compared to previous frequentist approaches. Together, our data indicate that leaf growth dynamics and final sizes have different genomic architectures and different patterns of environmental sensitivity. Genotypic models effectively predicted FVT phenotypes, suggesting that comparatively simple QTL models can capture the non-linearity intrinsic to some FVT in this species; elaboration of these models to include environmental effects and expression data are a topic of ongoing research.
Acknowledgments
University of Wyoming undergraduates E. Gimpel, J. Whipps, K. Anderson, M. Pratt, J. Beckius, C. Blemenshine, S. Cheeney, M. Yorgason, W. Gardner, C. Planche, C. Gifford, L. Lucas, K. Riggs, D. Lorimer, D. Nykodym, and L. Steinken, assisted with data collection and entry. M. Rubin (University of Syracuse) helped with logistics, M. Knapp (Kansas State University) provided temperature data. C. Seals and R. Pendleton facilitated plant growth. The manuscript was enriched by helpful discussions with M. Brock (University of Wyoming), R. J. C. Markelz (University of California, Davis and RevGenomics, Oakland CA), and J. Lovell (HudsonAlpha Institute for Biotechnology, Huntsville, AL). This work is supported by National Science Foundation Grants IOS-1306574 to RLB, IOS-0923752 to CW and SW, and IOS-1025965 to CW.
Footnotes
Supplemental Material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.117.300350/-/DC1
Communicating Editor: R. Wisser
- Received October 11, 2017.
- Accepted February 8, 2018.
- Copyright © 2018 Baker et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.