Increased Genomic Prediction Accuracy in Wheat Breeding Through Spatial Adjustment of Field Trial Data

In crop breeding, the interest of predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed-model using moving-means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models.

Trait-associated markers then become selection targets to assist in molecular breeding programs (Collard et al. 2005;Landjeva et al. 2007;Collard and Mackill 2008;Buerstymayr et al. 2009). However, these approaches have limitations due to the difficulty in detecting significant markers within gene regions that are involved in the expression of complex traits influenced by many genes at different levels (Xu 2003). The most important traits involved in breeding are complex. Therefore, other strategies that take into account thousands of markers at one time in a model to predict complex traits have recently been developed.
Genomic selection (GS) is a recent approach that is being applied in crop breeding to make decisions for advancing germplasm from one generation to the next. GS was first proposed in animal breeding by Meuwissen et al. (2001). The development of high-throughput sequencing platforms, yielding a vast amount of information for each genotype allows the application of GS. For GS to be applicable in commercial breeding, genotyping methods need to be cost-effective. Genotyping-by-sequencing (GBS) is a high-throughput genotyping method that has been shown to be very useful for complex genomes like wheat (Poland et al. 2012a;Poland and Rife 2012). GBS costs are directly linked to the decreasing cost of sequencing driven by global research into developing new low-cost sequencing technologies and platforms. The wheat genome is very large at16 Gb (i.e., five times the human genome) and very complex with 80% repeated regions and 25-30% of its genes duplicated (Bennett and Smith 1976;Dubcovsky et al. 1996;Akhunov et al. 2003). Furthermore, wheat is a hexaploid species with three genomes (A, B, and D) per chromosome (Sarkar and Stebbins 1956;Dvorak and Zhang 1990;Dvorak et al. 1993). GBS uses methylation sensitive enzymes, which results in the elimination of most of the repeated regions and reduces the genome representation to increase the efficiency of sequencing (Elshire et al. 2011;Sonah et al. 2013).
The development of novel statistical approaches for GS is a crucial step, where all the genotypic information is taken into account to be associated with phenotypic data by adjusting the parameters in a prediction model. The parameters of the model are currently being adjusted with linear models (Ridge regression [RR]), Bayesian approaches (Bayes A and Bayes B), and semiparametric strategies (reproducing Kernel Hilbert spaces [RKHS] and neural networks) (Meuwissen et al. 2001;Gianola et al. 2003Gianola et al. , 2011de los Campos et al. 2010;Endelman 2011;Goddard et al. 2011;Poland et al. 2012b). RKHS is defined by a "reproducing kernel," which is a function of the relationship between pairs of genotypes. RKHS is a semiparametric approach, which could be represented as a parametric model by choosing the appropriate kernel . Two kernels that are commonly applied are known as RR and Gaussian (GAUSS). The relationships between genotypes in RR are established by the use of an additive model, and in GAUSS, the relationship between individuals is calculated with Euclidean distances that take into account epistatic interactions (Gianola and Van Kaam 2008;Endelman 2011).
Because genotyping is taking a more routine and accepted approach, improvements in model predictions are focused on precision phenotyping. Physiological differences and environmental conditions that affect the precision of the measured phenotype need to be taken into account to have an accurate GS prediction model. The breeder needs to know with accuracy the fields in which selections will occur; therefore, high-throughput phenotyping technologies are being implemented before planting (characterization of field heterogeneity) and during crop development to help reduce nongenetic variation (Crossa et al. 2006;Cabrera-Bosquet et al. 2012;Masuka et al. 2012;White et al. 2012). Otherwise, traditional methods are applied in which field design takes care of most of the variation and model correction mechanisms take into account the field heterogeneity that produces spatial correlation errors. These spatial trends can be eliminated with postdata treatment. Different strategies exist, including model variance-covariance matrixes, row-columns, and moving-means (Cullis et al. 1998;Peiris et al. 2008;Müller et al. 2010;Leiser et al. 2012).
The objectives of this study were (1) to validate the GBS technology as a tool for genotyping germplasm with complex genomes; and (2) to create an optimized training model for GS with a germplasm to be bred in a Mediterranean climate environment of central Chile, using a diverse set of wheat genotypes. Our study confirms that GBS is an inexpensive, robust, and useful tool to obtain genomewide information for breeding programs that work with complex genomes, such as wheat. Furthermore, we evaluate that spatial adjustment of the phenotypic data in each trial is very important to reduce error in the model and increase prediction accuracy. Here we evaluate spatial variation across the field, while also exploring fundamental variations that take into account environmental and genotypic interactions.

Germplasm and growth conditions
The germplasm consists of 384 advanced lines from two different breeding programs, including 55 lines from the wheat breeding program at Instituto Nacional de Investigación Agropecuaria (INIA) in Chile, 143 from the International Wheat and Maize Improvement Centre (CIMMYT) that were previously selected for adaptiveness to Chilean environments (these lines share common ancestors with the INIA-Chile breeding program), and 186 lines from INIA in Uruguay. The objective with this set of lines was to create a germplasm base to breed for drier areas in Chile and subsequently other countries within the projects involved.  The breeding germplasm was evaluated in the Mediterranean environment Santa Rosa (36°329 S, 71°559 W; 217 m.a.s.l.) under two levels of water supply, mild water stress and fully irrigated, in 2011. In 2012 the lines were evaluated in Santa Rosa under the two levels of water supply and also evaluated in Cauquenes (35°589 S; 72°179 W), a traditionally dry-land agricultural region with lower yield potential. Cauquenes has a granitic soil (Alfisol) with low fertility; the minimum average temperature is 4.7°(July), the maximum is 27°(January) and the long-term average annual precipitation is 695 mm. Santa Rosa has a volcanic soil (Andisol) with adequate fertility for wheat; the minimum average temperature is 3.0°(July), the maximum is 28.6°(January), and long-term average annual precipitation is 1270 mm (del Pozo and Del Canto 1999).
The experimental design was an alpha-lattice with 20 incomplete blocks, with each block containing 20 genotypes. Two replicates were used at both trials of Santa Rosa in 2011 and 2012 and at Cauquenes in 2012. Plots consisted of five rows of 2 m long and 0.2 m distance among rows. Sowing dates were on August 31st and September 7th at Santa Rosa and Cauquenes, respectively, and the sowing rate was 20 g m 2 . Plots were fertilized with 260 kg ha 21 of ammonium phosphate (46% P 2 O 5 and 18% N), 90 kg ha 21 of potassium chloride (60% K 2 O), 200 kg ha 21 of sulpomag (22% K 2 O, 18% MgO, and 22% S), 10 kg ha 21 of boronatrocalcita (11% B), and 3 kg ha 21 of zinc sulfate (35% Zn). During tillering, an extra 80 kg ha 21 of N was applied. Weeds were controlled with MCPA at 750 g a.i. ha 21 + Metsulfuron Metil 8 g a.i. ha 21 . Furrow irrigation was used at Santa Rosa: one irrigation (at tillering) for the mild water stress trial and four irrigations (at tillering, flag leaf emergence, heading, and middle grain filling) of ca. 50 mm each for the fully irrigated trial.
For phenotyping, all lines were evaluated for grain yield (GY), thousand kernel weight (TKW), number of kernels per spike (NKS), and days to heading (DH) in 2011. In 2012, only GY was evaluated. For the yield components (TKW and NKS), 25 spikes were randomly selected from each plot. For GY, the whole plot was harvested. DH  was recorded as the number of days from sowing till 50% of spikes emerged.

SNP identification
Genomic DNA was extracted using DNeasy Plant Maxi Kit (QIA-GEN). Library construction was followed by using the PstI-MspI GBS protocol described by Poland et al. (2012a). This step and sequencing was performed twice. The libraries were made in collaboration with the Kansas State University, Manhattan, Kansas, and the Institut de Biologie Intégrative et des Systèmes at the Université Laval, Quebec, Canada. The sequencing was performed on an Illumina Hi-Sequation 2000 at the DNA core facility at the University of Missouri, Columbia, Missouri, and the McGill Univesity-Génome Quebec Innovation Centre (Montreal, Canada) for each set of libraries. The sequences were analyzed, in relation to base quality and distribution of sequence in different samples, using the Galaxy (http://galaxy.psu. edu/) software.
Single-nucleotide polymorphism (SNP) calls were made using the Tassel Pipeline (TP; http://maizegenetics.net), with modification for nonreference SNP calling by Poland et al. (2012a). The TP handles the sequence information coming from next-generation sequencing. Tags are defined, which are a set of identical sequences, and then the number of sequences per tag are counted. To handle tags with sequencing errors, the parameter to eliminate tags was established at less than 15 sequences. Tags are then defined individually by the lines that it came from. A pairwise alignment between tags to call some set of potential SNPs is then established. The TP has different filters for calling SNPs. In this study inbred lines that are in a highly homozygous state were used; therefore, the "inbreeding coefficient" filter was used and set to 0.9 to eliminate high amounts of heterozygotes. The minor allele frequency filter was set to 0.01, and the minimum locus coverage was set to eliminate SNPs with more than 80% missing data. Once the complete SNP matrix was established (supporting information, File S1), the missing data were imputed using the realized relationship matrix method multivariate normal expectation maximization (MVN-EM) described by Poland et al. (2012b) with the R environment (R Development Core Team 2008) package rrBLUP (Endelman 2011). To further verify that the imputed SNPs were not affected, we correlated the genetic relationship matrix with and without imputed SNPs. Then, to further verify the quality of the imputed SNPs, marker-based kinship matrixes between random subsets of SNPs were calculated and compared with the rrBLUP software package (Endelman 2011).
Genetic and phylogenetic comparisons SNP data from 384 diverse wheat genotypes were used to calculate the kinship (A) matrix of genotypes using the EMMA package (Kang et al. 2008) in R environment (R Development Core Team 2008). The dissimilarity matrix (1 2 similarity matrix) was analyzed by principal coordinate analysis (PCoA) by use of the ape package (Paradis et al. 2004) in R.

Sequence alignments
The SNP tags were BLASTed against the sequence database available from the Synthetic x Opata map by Poland et al. (2012a) (available at http://www.wheatgenetics.org/index.php/download/viewcategory/10synop) using blastn from package NCBI-BLAST+ (Altschul et al. 1990) setting the parameters, maximum target, and number of threads at 1 and percent of identity at 95%. Linkage disequilibrium (LD) between each pair of mapped SNPs was calculated as r^2 using the trio package (Schwender et al. 2012) in R. SNPs were ordered following the bin map order presented in the database used by Poland et al. (2012a). After SNPs were ordered, the LD values were plotted using the LDheatmap package (Shin et al. 2006) in R.

Phenotypic predictions
Phenotypic data were analyzed using the lme4 (Bates 2007) and mvngGrAd (Technow 2012) packages in R. The analysis was performed individually for each condition and year. Two different mixed models and one linear regression model (Table 1), defined as Row-by-Column (RC), Random Complete Block model with moving means (RCB_MVNG), and linear regression model with moving means as covariable (MVNG), were considered to account for spatial correlations. Two of the four models use a covariable to correct for spatial variation in the field. The covariable (x i ) was calculated as the value of phenotypic plot minus mean phenotypic value of neighbors plots, x i = y imean (y 1 , y 2 , y 3 , y 4 , y 5 , y 6 ; Figure 1).
The three mixed models have a general expression as follows: where X is the design matrix for fixed effects b, Z is the design matrix for random effects, and u and e are the residual matrix that follows the distribution e~N(0,s e 2 I). After the analysis of the residuals from each phenotypic model was established, the best linear unbiased predictors (BLUPs) were obtained to calculate genomic predictions by genomic best linear unbiased predictor (GBLUP) using the rrBLUP package with two different kernels, RR and GAUSS (Endelman 2011). The predictions were validated with 100 replications using the cross validation method described by Crossa et al. (2010). The samples were subdivided in seven similar sets. The training population was composed of six of the sets (86% of the samples) and the validation was performed on the remaining set (14% of the samples). For predictions between environments, adjusted data from two environments were used to train predictions in the remaining three environments.

SNP identification
GBS SNPs were identified among sequences tag pairs by allowing one to three mismatches between tags. Two library replicates for the 384 samples were analyzed jointly for SNPs, producing a total of 102,324 SNPs. Similarity matrices calculated with and without imputation shown high correlation (r = 0.990, p , 0.001).

Genetic and phylogenetic comparisons
To test the predictability of the markers in constructing a genetic relationship matrix, the 102,324 SNPs were divided into two randomly assigned identical sets of 51,162 SNPs in each group. Two genetic similarity matrices were constructed independently with each of the sets of SNPs. The Pearson correlation between matrices was 0.997 (p = 0.001). A genetic similarity matrix was calculated with the complete set of markers to perform a principal coordinate analysis. The germplasm was separated in two groups, representing each breeding program (CIMMYT-INIA Chile and INIA Uruguay). The first two principal coordinates explained 12.9% of variation ( Figure 2). n

Sequence alignments
When comparing sequences using BLAST against the Poland et al.
(2012a) GBS-based SNP database, we found that the sequences in common showed a good coverage throughout all 21wheat linkage groups. Of all the SNPs, 13% (13,357) found high-quality matches. Although a good coverage was observed, the D genome presented fewer SNPs than the A and B genomes ( Figure 3). As expected, LD analysis between mapped SNPs indicated high LD between closely linked SNPs along all chromosomes (File S1).

Phenotypic analysis
Phenotypic data were collected under different soil water availability in 2011 and 2012. The traits under study (GY, TKW, NKS, DH) were analyzed adjusting for field design and for spatial variation using linear mixed models. The residuals for the adjusted traits in 2011 were heterogeneous due to spatial correlations (Figure 4 and Figure 5). Other models (RC, RCB_MVNG, and MVNG) were considered to reduce correlations between residuals. The RC model was inadequate to correct the residual heterogeneity, because the same spatial correlation was observed (Figure 4 and Figure 5). The other two (RCB_MVNG and MVNG) models adjusted, which include the moving means as a covariable, presented homogeneous residuals along the field (Figure 4 and Figure 5). In addition, the broad sense heritability was calculated for each model. The greatest heritability value was for the MVNG model for all traits (Table 2). In 2012 the fields had minimal spatial variation; therefore, after we adjusted the phenotypic data with the intended field design, the residuals we observed were homogeneous throughout the field. The heritability for yield was greater for the 2012 trials than for the 2011 trials. Cauquenes, which was a dry-land condition with more drought stress, presented a more pronounced field variation (Table 2) resulting in a lower heritability than the other two fields in 2012.
Phenotypic predictions A GS model using the GBLUP approach was fitted for each trait, using the line BLUPs of the best-fit phenotypic model. In all cases the 86% of genotypes were used to train the genomic model and the predictions were assessed in the other 14%. The predictions were evaluated using 100 cross validations with the training and prediction sets randomly partitioned.
We determined the standard deviation across all correlation for each model. The RR and GAUSS kernels were tested for each GS model. In most of the cases GAUSS performed better than RR. In general, the phenotypic data adjusted with MVNG, resulted in greater prediction accuracies, although high standard errors were observed ( Table 3). The prediction accuracies for 2012 were greater than for 2011 (Table 3 and Table 4).
All possible combinations of the trials were used to test as training sets to observe the prediction accuracy from year to year or from one environment to the next. Because of the high variation from year to year, the best predictions were obtained when adding information in the model that included both years (Table 5).

DISCUSSION
The challenge in wheat breeding is to accelerate the adaptation of germplasm for more efficient and rapid results, if possible, to increase yield, and to adapt to future climate change. Tools are now available that allow taking up this challenge and relate the complex genetic mechanism involved in phenotypic expression in relation with environmental interactions (Heffner et al. 2009).

Genotypic analysis
We detected 102,324 SNPs using two library replicates, which were used to calculate a dissimilarity matrix, which was analyzed by PCoA. As described by Poland et al. (2012a), we didn't expect big differences between dissimilarity matrixes calculates with and without imputation. For PCoA, the lines were grouped by breeding program of origin. This was expected as the lines from the same breeding program should be more similar genetically because they share a common parental background and were bred under similar developmental conditions (Rauf et al. 2010).
When BLASTing against tags in the SynOpDH genetic map, we found that 13% of the tags aligned with high similarity. These tags were used to test LD between SNPs (File S1). High LD groups were identified, which is expected if the genotypes in each sample are true calls. In the same way, the distribution and concentration of SNPs in n  (1210) different chromosomes was in agreement with other assessments of LD in elite wheat breeding lines (Chao et al. 2010). Although 13% does not seem like a high proportion, the resulting number of SNPs is 13,357, which are well distributed and is a high number of SNPs, allowing us to calculate LD across individuals and confirming high correlations between closely linked SNPs per linkage group. This evidence allows us to trust the quality of the SNPs identified through GBS for genomic predictions.

Phenotypic predictions
Before predictions were made, the phenotypic data were adjusted and residuals in the model were analyzed. The analysis showed a strong spatial effect across the field for the 2011 data. The introduction of a spatial correction model enabled a better adjustment in measurements, showing an increment in the heritability of measurements. A greater heritability indicates that a greater proportion of the variance in the experiment is due to a difference between genotypes (Holland et al. 2003). The models that showed a greater heritability also showed an increase in the mean accuracies of genomic predictions. The importance of taking into account appropriate field design prior to the experiment has been demonstrated before (Federer 2003;Federer and Crossa 2012). Strategies for data analysis also have been described in cases in which the variability in the field has not been measured before applying the field design and has made the field design ineffective (Cullis et al. 1998;Piepho and Williams 2010). In this study we found that these postdata adjustments improved the quality of our data. As genotyping becomes less and less of a constraint for developing genomic prediction training sets, proper treatment of phenotypic observations is the key to increasing the accuracy of predictions. There are studies that include sets of environmental variables in training models, which are measured during experiments, with the objective to control these sources of variation and consider genotypic · environmental interactions (Chapman 2008). If the different sources of variation could be measured, together with genotypic · environment interactions, prediction accuracies should improve. Another interest is the ability to predict phenotypes across environments, or from year to year (Burgueño et al. 2012). However, when predicting new lines in previously untested environments, most prediction power is lost. Therefore, it is important to generate a growing database for model training, with corresponding sets of genotypes for the breeding program and target environments and continually increase the number of environments tested. In the present study, greater prediction accuracies were observed when we used data from 2 years to train the model (Table 5). When training the model with data from only one year, the accuracies of predictions were low because the year-to-year correlation was low ( Table 5).
The accuracies of predictions were comparable with other work in wheat, which have used different fingerprinting approaches Heffner et al. 2011), adding confidence to the GBS approach.
Two different statistical models were tested using the GBLUP approach. The greatest prediction accuracies were achieved with a Gaussian kernel. This model considers epistatic effects in addition to additive effects modeled presented in RR models (Gianola and Van Kaam 2008).
This study is part of a long-term objective to adapt wheat germplasm to Mediterranean climate environments of central Chile and subsequently to other regions in South America that are in similar needs. Previous data (I. Matus, unpublished data) and experience suggest that the 384 lines present traits incorporated from CIMMYT, show adaptation to drier areas in Chile, and that both germplasm groups (INIA-Chile and INIA-Uruguay) have shown different degrees of adaptation to these Chilean environments. We believe that this dataset contains the necessary genetic diversity for a germplasm base to start a breeding program guided toward drier areas in Chile.
The development of new genotyping tools has been the framework to the practical implementation of GS. There are many crops with different genomic characteristics that should be considered when identifying the most suited genotyping methodology. In this study, GBS was successfully applied for the genome-wide characterization of wheat breeding lines. GBS is a low-cost approach that can be used to genotype thousands of lines per year in a commercial breeding program.
The challenge in leveraging genomic assisted breeding approaches in applied programs now remains in obtaining high-quality and longterm accumulation of phenotypic data from multiple years and targeted environments. This study showed an increase in prediction accuracy with proper treatment of phenotypic data from field trials. High-throughput and high-precision phenotyping tools are being tested and used that will be well suited for breeding program and increasing predictions. Understanding and predicting the complex interaction between genotype and environment will also be key to select lines based on genotypic information.