Genomic-Enabled Prediction Kernel Models with Random Intercepts for Multi-environment Trials

In this study, we compared the prediction accuracy of the main genotypic effect model (MM) without G×E interactions, the multi-environment single variance G×E deviation model (MDs), and the multi-environment environment-specific variance G×E deviation model (MDe) where the random genetic effects of the lines are modeled with the markers (or pedigree). With the objective of further modeling the genetic residual of the lines, we incorporated the random intercepts of the lines (l) and generated another three models. Each of these 6 models were fitted with a linear kernel method (Genomic Best Linear Unbiased Predictor, GB) and a Gaussian Kernel (GK) method. We compared these 12 model-method combinations with another two multi-environment G×E interactions models with unstructured variance-covariances (MUC) using GB and GK kernels (4 model-method). Thus, we compared the genomic-enabled prediction accuracy of a total of 16 model-method combinations on two maize data sets with positive phenotypic correlations among environments, and on two wheat data sets with complex G×E that includes some negative and close to zero phenotypic correlations among environments. The two models (MDs and MDE with the random intercept of the lines and the GK method) were computationally efficient and gave high prediction accuracy in the two maize data sets. Regarding the more complex G×E wheat data sets, the prediction accuracy of the model-method combination with G×E, MDs and MDe, including the random intercepts of the lines with GK method had important savings in computing time as compared with the G×E interaction multi-environment models with unstructured variance-covariances but with lower genomic prediction accuracy.

); however, these models do not incorporate marker information.
A Bayesian GBLUP regression model for assessing genomic-enabled prediction combining G·E introduces the main effects of environments and lines and the interaction effects of markers and environmental co-variables via random variance-covariance structures (Jarquín et al. 2014). The Bayesian regression model of López-Cruz et al. (2015) is similar to that of Jarquín et al. (2014) with one difference: that genomic values are partitioned into components that are stable across environments (main genomic effects) and others that are environment-specific (genomic G·E) . Although both models assume positive sample correlations among environments and can be fitted using the BGLR package (de los Campos and Pérez-Rodríguez 2016), the advantage of the model of López-Cruz et al. (2015) over the model of Jarquín et al. (2014) is that it can be implemented using both shrinkage methods and variable selection methods and is efficient when applied to sets of environments that have positive correlations because the genetic covariance between any pair of environments is the variance of the main effect, which makes the covariance between pairs of environments positive (López-Cruz et al. 2015). Cuevas et al. (2016) used the Bayesian model of López-Cruz et al. (2015) to compare methods that apply GS models with G·E using a linear kernel (GBLUP) (GB) and a non-linear Gaussian kernel (GK) for single-environment and multi-environment breeding data sets. The authors found the GK models had higher prediction accuracy than the GB models and explained that the GK models captured major and complex marker effects in addition to their interaction effects. Sousa et al. (2017) compared the prediction accuracy of the multienvironment, single variance G·E deviation model (MDs) of Jarquín et al. (2014) with GK (MDs-GK) and the prediction accuracy of the multi-environment environment-specific variance G·E deviation model (MDe) of López-Cruz et al. (2015) with the GK method (MDe-GK). Then, Sousa et al. (2017) compared the models including the GK method with the prediction accuracy of their counterpart models using the GB methods (MDs-GB and MDe-GB). In addition, Sousa et al. (2017) also compared the accuracy of the four previous models with the accuracy of the multi-environment, main genotypic effect (MM) of Jarquín et al. (2014) using the GB and GK methods (MM-GB, and MM-GK). Results show that for grain yield, a notable increase in prediction accuracy of GK over the GB methods ranged from 9 to 49% in one data set and from 34 to 70% in another data set.
In general, the previous linear mixed multi-environment models assumed the environments as fixed or random effects, and lines as random effects incorporating into the model the random slope of the genetic effect of the lines distributed as a normal random variable with zero mean and variance-covariance structure constructed from markers or pedigree; also, the genetic effect (intercept) of the lines can be considered as having a normal distribution with zero mean and constant variance (Mota et al. 2016). The random intercept of the lines is often not included in the model when no exchange of information occurs, assuming the intercepts are independent . However, recent studies have incorporated random intercepts (Mota et al. 2016;Cuevas et al. 2017;Sukumaran et al. 2017;Jarquín et al. 2017) in order to achieve higher genomic-prediction accuracy in cases where lines were observed in some environments but not in others (random cross-validation 2, CV2 of Burgueño et al. 2012); this is because the posterior distribution of the intercept generates a variancecovariance structure that allows exchanging information between the lines of the training and testing sets. When newly developed lines have never been observed (untested) (random cross-validation CV1, Burgueño et al. 2012), models do not improve the prediction accuracy with or without random intercept when compared with the singleenvironment model. One limitation of these multi-environment genomic G·E models for achieving relatively high genomic-enabled predictions is that correlations among environments should be positive. Also, none of the applications of the models of Jarquín et al. (2014), Sukumaran et al. (2017), and Jarquín et al. (2017) compared genomicenabled prediction accuracy with GB kernel vs. GK kernel.
The previous Bayesian regression models of Jarquín et al. (2014) and López-Cruz et al. (2015) use the Hadamard product for modeling G·E and show that the exchange of information between environments is achieved by means of the variance-covariance matrix of the main effects. Thus, the variance component of the main effects measures the stability across environments and the variance component of the specific effects measures the deviations from the main effects due to specific combinations of lines in environments (G·E). This approach has the advantage that it can be used when the number of lines in each environment is the same, but also when there is an unbalanced number of lines in environments, as shown by Sousa et al. (2017).
On the other hand, GBLUP methodology (together with pedigree) can incorporate and model G·E effects, by means of the Kronecker product of the variance-covariance matrices of the genetic relationship between environments and the genomic or pedigree relationship between the lines (Burgueño et al. 2012;Oakey et al. 2016) where the structure of the models allows estimating negative genetic correlations between environments. Based on this, Cuevas et al. (2017) recently compared a Bayesian regression model for the genetic effects described by the Kronecker product of unstructured variance-covariance matrices of genetic correlations between environments and genomic kernels under the GB and GK methods. An extension includes an extra genetic residual component with random intercepts. Results of the analyses of five data sets indicated that including the random intercepts is still beneficial for increasing genomic prediction accuracy when lines have been tested in some environments. However, one drawback of the Bayesian regression models of Cuevas et al. (2017) is the computing time for the iteration required for the Monte Carlo Markov Chain (MCMC) method to achieve the convergence of the posterior and predictive distributions.
Recently Granato et al. (2017) proposed an R package called Bayesian Genomic G·E (BGGE) to obtain a rapid fit of Bayesian mixed linear models with homogeneous error variances for the models of Jarquín et al. (2014), López-Cruz et al. (2015 and also for the models used by Sousa et al. (2017) (MM, MDs, and MDE). The approach of Granato et al. (2017) uses an R library that saves time by using the structure of the block diagonal matrices with additional parameterizations to shorten the iteration time without losing precision.
Based on the above, the main objective of this study was to compute the prediction accuracy of 16 model-method combinations and compare their prediction accuracy for four different data sets (two maize and two wheat multi-environment trials) with an unbalanced number of lines in environments, and different complexity of the G·E interaction. The 16 model-methods comprise the multi-environment, main genotypic effect (MM), the multi-environment, single variance G·E deviation model (MDs) and the multi-environment environment-specific variance G·E deviation model (MDe) with the GB and GK kernel methods and with and without including random intercepts (12 model-methods) plus 4 Bayesian regression models for the genetic effects described by the Kronecker product of unstructured variance-covariance (MUC) matrices of genetic correlations between environments and genomic kernels under the GB and GK methods and their extensions, including an extra genetic residual component with random intercepts. We discuss the advantages and disadvantages of the different model-methods for sets of environments with different G·E characteristics and different degrees of unbalance among lines.

MATERIALS AND METHODS
This study uses four multi-environment plant breeding data sets with different characteristics. Two maize data sets used by Sousa et al. (2017) (HEL and USP) had different numbers of maize hybrids in each environment and positive correlations between environments, whereas the two wheat data sets used by Cuevas et al. (2017) (WHE1 and WHE5) had environments with negative or zero correlations but with the same number of wheat lines in each location.
In this study models 2 and 3 of Cuevas et al. (2017) are renamed as Multi-environment Unstructured Covariance (MUC) and Multi-environment Unstructured Covariance with random intercept vector f (MUCf ), respectively, each fitted with the GB and GK kernel methods. Therefore, 4 additional models are included, MUC-GB, MUC-GK, MUCf-GB, and MUCf-GK. These models were fitted with the MTM package (de los Campos and Grüneberg 2016) and their prediction accuracy was compared with the other 12 model-method combinations.
In the first step, the phenotypic data were fitted according to the experimental design employed for each experiment, and the Best Linear Unbiased Estimates (BLUE) of the lines or hybrids for each location or environments were computed. In the second step, the various genomic models were fitted to perform random cross-validation and compute the prediction accuracy of the 16 model-method combinations.

Experimental data
Maize data set HEL: This maize data set comprises 452 maize hybrids evaluated in 2015 at five sites in Brazil: Nova Mutum (NM) and Sorriso (SO) in the state of Mato Grosso; Pato de Minas (PM) and Ipiaçú (IP) in the state of Minas Gerais; and Sertanópolis (SE) in the state of Paraná. The experimental design was a randomized block with two replicates per genotype and environment. Different numbers of hybrids were planted in each environment. The HEL parent lines were genotyped with an Affymetrix Axiom Maize Genotyping Array of 616 K SNPs with standard quality controls removing markers with a Call Rate $ 0.95.
Maize data set USP: This data set comprises 740 maize hybrids evaluated at Piracicaba and Anhumas, each with two levels of nitrogen (N) fertilization: Ideal N (IN) and Low N (LN) for a total of four artificial environments (P-IN, P-LN, A-IN, and A-IN). The hybrids were evaluated using an augmented block design including two replicated commercial hybrids as checks. There was an imbalance because not all hybrids were evaluated in all locations. Similar to the maize data set HEL, the USP parent lines were genotyped with an Affymetrix Axiom Maize Genotyping Array of 616 K SNPs with standard quality controls removing markers with a Call Rate $ 0.95.
Wheat data set WHE1: A historical set of 599 wheat lines from CIMMYT's Global Wheat Program was evaluated in four mega-environments (Crossa et al. 2010;Cuevas et al. 2016) and genotyped using 1447 Diversity Array Technology (DArT) markers generated by Triticarte Pty. Ltd. (Canberra, Australia; http://www.triticarte.com.au). Markers with a minor allele frequency lower than 0.05 were not included.
Wheat data set WHE5: This data set is described by López-Cruz et al. (2015) and includes 807 wheat lines evaluated in five environments using an alpha-lattice design with three replicates in each environment at CIMMYT's wheat breeding station at Cd. Obregon, Mexico. The environments were three irrigation regimes (0i = zero irrigation, 2i = two irrigations, and 5i = five irrigations), two planting systems (B = bed planting and F = flat planting) and two different planting dates (N = normal and L = late).
Genotypic data consisted of genotyping-by-sequencing (GBS) data, and markers with a minor allele frequency (MAF) lower than 0.05 were removed. After editing the missing markers, a total of 14,217 GBS markers were available for analyzing this data set.
Availability of the phenotypic and genotypic experimental data: Sousa et al. (2017) describe the two maize data sets and Cuevas et al. (2017) give details of the two wheat data sets. The two maize data sets, HEL and USP, can be downloaded from the link http://hdl.handle.net/ 11529/10887, whereas the two wheat data sets can be found at the link http://hdl.handle.net/11529/10710, from where DATASET1. Wheat_GY.Rdata (Wheat data set WHE1) and DATASET5. Wheat_GY.Rdata (Wheat data set WHE5) were obtained.

Statistical models
The components of the 8 basic models are summarized in Table 1 and their full descriptions are given below and in Appendix 1. They include an overall mean (m) and the fixed effects of the environments (other effects can be incorporated) modeled with the incident matrix Z E and one vector of fixed effects b E for each environment. For the first group of six models (MM, MMl MDs, MDsl, MDe, and MDel), it is assumed that their genetic random components g have a normal distribution with mean zero and a variance-covariance structure comprising a known matrix K generated from markers (and computed using the GB or GK methods) multiplied by an unknown scaled parameter (variance component). Also 4 models in this group had different forms for modeling the G·E, MDs and MDe, with a variance-covariance structure constructed by the Hadamard product of the corresponding matrices and incorporating (or not) the random intercepts (l).
A second group of models (MUC) considers that their random components have a normal distribution with zero mean and a variancecovariance structure modeled by the Kronecker product of a matrix with unknown covariances among environments multiplied by a known K (computed using the GB or GK methods) and incorporating (or not) the random intercepts ( f ).
The multi-environment main genotypic effect model (MM): Model MM (1) (Appendix 1) is equivalent to the across-environment model of Jarquín et al. (2014) and when in the distribution of the random genetic effects g is used in model MM, K ¼ XX ' p is used in the covariance (de los Campos et al. 2013;VanRaden 2007VanRaden , 2008; the model is the GBLUP across environments (MM-GB), where X is the standardized matrix of molecular markers for the individuals of order n · p, where p is the number of markers.
However, markers can have a more complex function than the linear GBLUP. For example, the Gaussian kernel (GK) function (Cuevas et al. 2016) is computed as where h ¼ 1 and the median of the distances is used as a scaling factor (Crossa et al. 2010). When in the distribution of the random genetic effects g of the MM model (1) in the covariance the model is the Gaussian kernel across environments (MM-GK) (Sousa et al. 2017).
The genetic variation between lines that is not explained by g in (1) (Appendix 1) can be captured by the random vector l that is considered a random intercept for each line; thus when random effects l are added, model MM becomes model MMl where the random intercepts l $ Nð0; s 2 l IÞ with I being the identity matrix of size n · n, and s 2 l the variance component that indicates the influence of l; the incidence matrix Z g connects the genotypes to the phenotypes. As in MM, the kernel matrix K of the random effect g of model MMl can be fitted with GBLUP (MMl-GB) or with Gaussian kernel (MMl-GK).
The multi-environment single variance genotype 3 environment interaction deviation model (MDs): Model (2) (Appendix 1) (MDs) adds to model (1) the random interaction effect of the environments with the genetic information of the lines (ge ij ). When the random component l is added to model (2), the MDs model becomes MDsl: Each environment matrix K (Appendix 1) of models MDs and MDsl can be fitted with a linear kernel (MDs-GB, MDsl-GB) or a Gaussian kernel (MDs-GK, MDsl-GK).
Multi-environment environment-specific variance genotype 3 environment deviation model (MDe): The environment-specific variance genotype · environment deviation model (MDe) (López-Cruz et al. 2015) differs from MDs on how the interaction component is considered; g is the main genetic effect across environments and g E is the specific genetic effect in each environment. When the random component l is added to (3) (Appendix 1), the MDe model becomes MDel: where matrices K for g and K E for g E of models MDe and MDel can be fitted with a linear kernel (MDe-GB, MDel-GB) or with a Gaussian kernel (MDe-GK, MDel-GK).
Multi-environment With unstructured variance-covariance (MUC): This model considers that there is a genetic correlation between environments that can be modeled with matrices of order m · m (where m denotes the environment) . The MUC is expressed as where y ¼ ðy 1 ; . . . ; y j ; . . . y m Þ 9 is a vector with the observation y j belonging to the j th environment ðj ¼ 1; . . . ; mÞ; each of the same size (nÞ; the random vector u ¼ ðu 1 ; . . . ; u j ; . . . u m Þ 9 is the vector of genetic values, and e ¼ ðe 1 ; . . . ; e j ; . . . e m Þ 9 the vector of random errors both assumed normally distributed with u $ Nð0; U E 5KÞ and e $ Nð0; Σ5IÞ; where 5 is the Kronecker product. The variance-covariance matrix of u is the Kronecker product of one unstructured matrix with information between environments ðU E Þ that needs to be estimated and another known matrix with information between the lines based on K markers ðcomputed using the GB or GK methodsÞ. Then the m · m matrix U E is where the j th diagonal element is the genetic variance s 2 uj within the j th environment, and the off-diagonal elements are the genetic covariances s uju j' between environments j and j'. For a large number of environments, a factor analytical model usually performs better than the unstructured model (Burgueño et al. 2012;Oakey et al. 2016).
n Table 1 Components of the 8 models included in this study. Each of these models is fitted with the linear kernel (GB) and the Gaussian kernel (GK)

Multi-environment With un-structured variance-covariance and random intercepts (MUCf):
The MUC model can be extended by adding an extra variability to account for genetic variance among individuals across environments, that is, by adding the random vector f . Therefore, the extension of the previous random linear model is . . . ; f m Þ 9 with the random vectors f j being independent of u j and normally distributed f $ Nð0; F E 5IÞ. Matrix F E; is unstructured and captures genetic variance-covariance effects between the individuals across environments that were not captured by the U E matrix; matrix F E can be expressed as where the j th diagonal element of the m · m matrix F E is the genetic environmental variance s 2 f j within the j th environment, and the offdiagonal element is the genetic covariance s fjf j' between environments j and j'. Similar to the previous cases, models MUC and MUCf can be fitted using GB or GK kernels to generate the four model-method MUC-GB,MUC-GK, MUCf-GB, MUCf-GK.
Model implementation and random cross-validation for assessing prediction accuracy in the four data sets: For the two maize data sets, models MM-GB, MM-GK, MDs-GB, MDs-GK, MDe-GB, and MDe-GK were fitted with the new software BGGE (Granato et al. 2017). Models MMl-GB, MMl-GK, MDsl-GB, MDsl-GK, MDel-GB, and MDel-GK were also fitted with BGGE with the same random partitions used by Sousa et al. (2017) to make results comparable for random-cross-validation 1 (CV1) and random cross-validation 2 (CV2) (Burgueño et al. 2012). Models MUCf and MUC of Cuevas et al. (2017) were fitted using the software MTM (de los Campos and Grüneberg 2016) with the GB and GK kernel methods and with the same random partitions used for the 12 model-method combinations previously defined for random cross-validations CV1 and CV2. A fivefold random cross-validation was used assigning 80% of the observations to the training sets and 20% to the testing (validation) set. However, most of the results and discussion focus on cross-validation CV2. The two wheat data sets were fitted with the 12 model-method combinations (models MMl-GB, MMl-GK, MDsl-GB, MDsl-GK, MDel-GB, MDel-GK, MM-GB, MM-GK, MDs-GB, MDs-GK, MDe-GB, MDe-GK) using the BGGE software of Granato et al. (2017).
Two random cross-validations (CV1 and CV2) were generated; CV1 attempts to mimic a situation where a set of lines were never evaluated in a set of environments, whereas CV2 mimics a sparse testing scheme where some lines were evaluated in some environments but not in others. Results based on CV2 are shown in the main text, tables and figures. Results of random cross-validation CV1 are given in Tables S1-S4 of Appendix 2. To implement the proposed 12 model-method combinations, 50 random partitions were performed with 80% of the lines used for training and the remaining 20% of the lines used for testing. The metric for measuring the performance of prediction accuracy was the Pearson correlation calculated between the observed and predicted values of the testing sets.

RESULTS
The results are given in four sections, one for each data set. In each section, we provide the results of the variance component estimates and the prediction accuracy for each of the 12 model-method combinations.
Maize data set HEL This maize data set has a total of 452 maize hybrids with a different number in each of the five sites (n IP ¼ 247, n NM ¼ 330, n PM ¼ 452, n SE ¼ 367, n SO ¼ 330). The sample phenotypic correlations among locations are positive with intermediate-to-low values, where location SE has low correlations with all the other locations, and locations NM, IP, and PM show relatively high correlations with the other locations (Table A1, Appendix 3).
Models without the random component l always show a lower residual variance component in the GK models than in the GB models; for example, for model MDs-GK, s 2 = 0.278 and for MDs-GB, s 2 = 0.591 (Table 2). However, when the models include l, these differences become smaller; for example, for MDsl-GK, s 2 = 0.277 and for MDsl-GB, s 2 = 0.368, indicating that for method GB, the random component l explains the variation of the observations better, whereas for GK, including l does not have much influence on the residual. This is also reflected in the small value of s 2 l = 0.013 for MDsl-GK as compared with s 2 l = 0.243 for MDsl-GB. The size of the genetic component, s 2 g , is always much higher for MM-GK, MDs-GK, and MDe-GK than for models with the GB method. For models MMl, MDsl, and MDel, the sum of s 2 g and s 2 l is higher than the component s 2 g for models MM, MDs, and MDe. For example, for model MMl-GB, s 2 g þ s 2 l is 0.429, whereas for model MM-GB, s 2 g is 0.356; MDsl-GB summation s 2 g þ s 2 l is 0.415 vs. MDs-GB with s 2 g = 0.370, and for model MDel-GB s 2 g þ s 2 l = 0.430 vs. MDs-GB with s 2 g = 0.370. The variance explained by the G·E of MDs, s 2 ge , is higher for GK than for GB and slightly higher for models with the random component l than for models without l. The variance components for the specific environments show increases in MDel-GK compared to MDel-GB, and in MDe-GK compared to MDe-GK (Table 2).
Models including the random component l with GK did not improve the prediction accuracy of the locations as compared with the prediction accuracy of models without l with GK (Table 3 and Figure  1); however, models with l had consistently higher prediction accuracies than models with GB. In all cases, MMl showed lower prediction accuracy than models with G·E (MDsl and MDel). Similarly, model MM had lower prediction accuracies than models that incorporate G·E (MDs and MDe). These differences are smaller for locations that had higher sample phenotypic correlations with other locations than for locations with low phenotypic correlations. For example, location NM had prediction accuracies of 0.569, 0.589, and 0.588 for models MMl-GB, MDsl-GB, and MDel-GB, respectively, whereas location SE with low sample phenotypic correlations among locations had prediction accuracies of 0.372, 0.544, and 0.548 for models MMl-GB, MDsl-GB, and MDel-GB, respectively.
All models with kernel GK had higher prediction accuracies (with and without the random component l) than models with kernel GB (Table 3 and Figure 1). However, these differences are lower for models that include the random component l (Table 3). For example, for location SO, the prediction accuracies for models MDsl-GK and MDsl-GB were 0.673 and 0.639, respectively, whereas for MDs-GK and MDs-GB, the mean prediction accuracies were 0.666 and 0.466, respectively. Comparing models with kernel GB, with and without l, the predictions are always higher when the model includes l than when the model excludes l; for example, for location IP, the mean prediction accuracies were 0.778 and 0.683 for MDsl-GB and MDs-GB, respectively (Table 3). Note that the variance component of the random effect l s 2 l was 0.243 for model MDsl-GB (Table 2). Furthermore, model 3 from Cuevas et al. (2017) with the unstructured variance-covariance component f or model 2 without f did not show any clear superiority, in terms of n Table 2 MAIZE HEL data set. Estimated variance components for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs) and environment-specific variance G3E deviation model (MDe) with two kernels, GBLUP (GB) and Gaussian (GK), with l and without l, for grain yield (standard deviation in parentheses)  n Table 3 Maize HEL data set. Mean Pearson's correlation (50 partitions) of each location for random cross-validation CV2, for the multienvironment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses)

MDe-GB
Proposed models with random effects l and f  (Table 3 and Figure 1). Random cross-validation CV1 decreased the prediction accuracy as compared with results achieved for CV2 (Table S1, Appendix 2); the trends and patterns of the prediction accuracy of the locations between models and methods are similar to those found for CV2, including those found for models MUC and MUCf.
In summary, results from maize data HEL indicated that models with the random component l with GK including G·E (MDsl-GK and MDel-GK) show similar mean prediction accuracy as models excluding the random component l. However, this did not occur with GB models where including the random component l increased the prediction accuracy for all 5 locations. Prediction accuracy using GK was always higher than using GB with or without the random component l. Also, the differences between the models with and without l and between GK and GB were smaller for locations that had higher sample phenotypic correlations with other locations. Finally, the differences in prediction accuracy were negligible between the proposed models including G·E with GK and GB and with and without the random effect l and models MUCf and MUC for all locations.

Maize data set USP
This maize data set is comprised of 739 maize hybrids with different numbers of lines in each of the four sites (n P2LN ¼ 731, n P2IN ¼ 732, n A2LN ¼ 731, n A2IN ¼ 737). Locations P-IN and A-IN had relatively high correlations with the other locations, whereas A-LN had low ones (Table A1, Appendix 3). The residual variance components for GK are smaller than those for GB for models MM, MDs and MDe; for instance, MM-GK had s 2 = 0.589 while MM-GB had s 2 = 0.854. Similarly, the residual variance components for MDsl and MDel with GK are lower than for MDsl and MDel with GB. The variance components of the random intercept (s 2 l ) of GK methods are not negligible (as in data set HEL) and are always lower than for the corresponding GB methods ( Table 4).
The estimated genetic variance components s 2 g for GB in models MM, MDs and MDe were 0.214, 0.209, and 0.206, respectively (Table 4), increasing the genetic environmental stability (s 2 g þ s 2 l Þ of models MMl-GB, MDsl-GB, and MDel-GB to 0.511, 0.513, and 0.514, respectively. The specific components for each environment of models MDe-GB and MDel-GB were negligible. The variance component (s 2 ge ) of the G·E models MDs and MDsl for GB and GK was also negligible.
In general, models with l-GB had similar prediction accuracy as models with l-GK, whereas the increase in prediction accuracy of models without l-GK over models with GB is clear. For example, for P-LN, models MDsl-GK and MDsl-GB had prediction accuracies of 0.545 and 0.546, respectively, whereas for MDs-GK and MDs-GB, the prediction accuracies were 0.524 and 0.325 (Table 5 and Figure 2). Models with l-GB showed significant improvement in prediction accuracy compared to models GB without l; for example, for location P-IN, the mean prediction accuracies of MDsl-GB and MDs-GB were 0.591 and 0.368, respectively (due to the influence of s 2 l = 0.349 for model MDsl-GB; see Table 4). All models with GK with the random intercept l and with high values of s 2 l gave higher prediction accuracies than GK models without l. There are no clear differences between model MUCf and the proposed model with the random component l with GK and GB in all the locations. Similar results were found for model MUC when compared to models without l. For this data set, results from CV1 (Table S2, Appendix 2) were all similar and lower than those obtained for CV2.
In summary, results from maize data USP indicate that models with the random component l (MDsl-GK and MDel-GK) show higher mean prediction accuracy than models without l and using the linear kernel GB. The G·E variance component of models MDs and MDsl with GK and GB had negligible s 2 ge , indicating less complex G·E than that found for maize data set HEL. The differences in the mean prediction accuracy between the proposed models with or without the random effect l and models MUCf and MUC are small for models with GK and not clearly superior to the proposed models with GB. Wheat data set WHE1 For this data set, environment E1 had negative correlations with the other environments (E2-E4), whereas environments E2-E4 had high correlations among themselves (Table A1, Appendix 3). Models with GK fitted the WHE1 data better than models with kernel GB (low residual variances of GK models as compared to GB models). Also, models with random component l had lower residual variance components than models without l. As opposed to the previous two maize data sets, where the magnitude of the variance components determines the prediction ability, the presence of environments with negative correlations with other environments makes interpreting the variance components in relation to their predictive ability not as straightforward as in the previous two data sets (Table 6). For example, models MMl and MM with GK and GB had estimates of the random error variance that were much higher ($0.8) than those of the other models; thus the prediction accuracy of these models is expected to be low for at least the environments with negative correlations.
n Table 4 Maize USP data set. Estimated variance components for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs) and environment-specific variance G3E deviation model (MDe) with two kernels, GBLUP (GB) and Gaussian kernel (GK), with l and without l for grain yield (standard deviation in parentheses)  n Table 5 Maize USP data set. Mean Pearson's correlation (50 partitions) of each environment for random cross-validation CV2, for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses)

Mde-GB
Proposed models with random effects l and f  The genetic variance component s 2 g varied for models MM-GB, MDs-GB, and MDe-GB (0.192, 0.219, and 0.414, respectively) as well as for the GK models (0.599, 0.752, and 1.404, respectively). The contribution of l measured in s 2 l was small for MDsl-GK and MDsl-GB (0.101 and 0.107) ( Table 6) and negligible for the other models with l. On the other hand, the G·E interaction variance components s 2 gE for GK and GB are important (MDsl-GK s 2 gE = 1.637, MDsl-GB s 2 gE = 0.42; MDs-GK s 2 gE = 1.349, MDs-GB s 2 gE = 0.349) and much higher than in the two maize data sets. Models MDel-GK and MDel-GB showed high specific variance components for E1 (3.356 and 1.058, respectively) and for E4 (1.147 and 0.3) causing most of the interaction in this data set (these are the environments with the lowest sample correlations with the other environments) and contributed the least to genetic environmental stability.
Models with G·E (MDs and MDe) had mean prediction accuracies higher than MM models with lower mean prediction accuracy in E1 and E4 as compared with E2 and E3 (Table 7 and Figure 3). The exceptions are models MM-GB and MM-GK, which had higher prediction accuracy than models MDs-GB and MDS-GB in E3. Models MDel-GK and MDe-GK had higher prediction accuracy than models MM, MDs and MDe with and without l for GB and GK in all locations, except MDe-GK in E1. However, in all cases and environments, models MUCf and MUC had better prediction accuracies than all 12 genomic model-method combinations (Figure 3). Lower prediction accuracies n Table 6 Wheat WHE1 data set. Estimated variance components for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs) and environment-specific variance G3E deviation model (MDe) with two kernels, GBLUP (GB) and Gaussian (GK), with l and without l for grain yield (standard deviation in parentheses) were found for CV1 (Table S3, Appendix 2) than for CV2; however, the decrease in prediction accuracy of CV1 was lower than for the two wheat data sets. In summary, G·E for this data set is more complex than for the two previous maize data sets. This is expressed by higher values of s 2 gE (given by models MDsl and MDs) compared to those computed for the maize data sets, as well as the higher values of the variance components specific to environments (s 2 1 and s 2 4 ) compared to those computed for other environments in this data set, as well as in the maize data sets. For the 12 model-method combinations, the models with the n Table 7 WHEAT WHE1 data set. Mean Pearson's correlation (50 partitions) of each environment for random cross-validation CV2, for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses)

Mde-GB
Proposed models with random effects l and f Environment Ã  highest prediction accuracy for the environments were MDel and MDe. However, models MUf and MUC had the highest prediction accuracy for each environment and for both methods, GK and GB.

Wheat data set WHE5
This data set has sample phenotypic correlations between environments that are close to zero or negative (Table A1, Appendix 3). Only one high phenotypic correlation was observed between environments 5iBN and 5iFN (0.546). Table 8 shows the high residual variance components of models MMl-GK, MMl-GB, MM-GK and MM-GB, whereas for models incorporating G·E (MDs and MDe with GK and GB and with and without l), the residual variance components were much smaller.
The variance components of the genetic main effects with GB and l were low (0.064 and 0.061 for MDsl-GB and MDel-GB, respectively), indicating low exchange of information between environments. The most influential variance components were related to the G·E, s 2 gE . For example, for models MDs-GB, the variance component s 2 gE is 0.618 and 0.636 for MDsl-GB, whereas it increases to s 2 gE = 1.482 for MDs-GK and to s 2 gE = 1.49 for MDsl-GK (Table 8); this result indicates the importance of G·E interaction. The influence of the random component l in this data set is negligible. The variance components related to specific environments are similar for the five environments and for MDe models with and without random component l.
n Table 8 WHEAT WHE5 data set. Estimated variance components for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs) and environment-specific variance G3E deviation model (MDe) with two kernels, GBLUP (GB) and Gaussian kernel (GK), with l and without l (Sousa et al. 2017), for grain yield (standard deviation in parentheses)  n Table 9 WHEAT WHE5 data set. Mean Pearson's correlation (50 partitions) of each environment for random cross-validation CV2, for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses)

Mde-GB
Proposed models with random effects l and f Among the 12 model-method combinations, the best predictive models were MDel-GK and MDe-GK in all locations (Table 9, Figure 4). However, models MDsl-GK and MDs-GK also had relatively high prediction accuracies that were very similar to those of models MDel-GK and MDe-GK. Similar results were found for models with linear kernel GB (Table 9). Models with the random intercept l showed no increase in prediction accuracy (values of s 2 l close to zero) as compared to models without l.
The comparison of the prediction accuracy of these 12 modelmethod combinations with the mean prediction accuracy of models MUCf and MUC (Figure 4) indicated the higher mean prediction accuracy of MUCf and MUC over the mean prediction accuracy of the proposed models with (or without) the random effect l. For this data set, the prediction accuracies of CV1 were similar to those found under CV2 (Table S4, Appendix 2).
In summary, the complex G·E interaction in this data set is expressed by the large variance component s 2 ge . Models with random component l did not increase the prediction accuracy of the corresponding models without l (reflected in their values of s 2 l close to zero). Of the 12 modelmethod combinations, models MDel-GK and MDe-GK gave the highest prediction accuracies. However, the best predictive models overall and for each environment were MUf and MUC.

DISCUSSION
Effect of random component l From a statistical perspective, the mixed models can better explain the variation among lines in environments (G·E) by considering two factors: environments and lines. The environmental effects (b E ) are considered as fixed effects with the relationship Z E b E ; however, the effects of the lines are considered random in Z g g þ Z g l for model MMl. The g is the common random effect of each line derived from the markers and l is considered the random intercept for each line. If we make the transformation g Ã ¼ Z g g, as in López-Cruz et al. (2015), then g Ã $ Nð0, s 2 g Z g KZ 9 g Þ; where matrix Z g KZ 9 g comprises submatrices (or blocks) where the submatrices off the block diagonal generated the exchange of information between environments with positive correlations. As discussed by López-Cruz et al. (2015), this exchange of information is not effective when there are negative correlations between sites (or environments) due to the fact that they are based on s 2 g . Similarly, if l Ã ¼ Z g l, then l Ã $ Nð0, s 2 l Z g IZ 9 g Þ and the exchange of information occurred in the submatrices off the block diagonal between the environments with positive correlations and when s 2 l is not zero. On the other hand, in models MDsl and MDel, the component l has influence only when there is exchange of information across environments and the G·E is simple; otherwise, as in the WHE5 data set, the contribution of l is negligible when the G·E is complex.
The random effects l are independent and identically distributed (iid) thus do not have the possibility of exchanging of information from tested lines to untested lines and therefore do not have any estimate of these values if no evaluation data on a line exists (CV1). Then, when trying to predict values of untested lines, only available information between lines come from the g part of the model. In a number of cases, substantial variation for the l effects were found suggesting that the additive part of the model (g) is not capturing the total genetic value very well. In these cases, since usually the GK method did as well as the GB with l model, there is a major advantage to the GK method in that it can better predict untested genotypes since the marker information is being used in a way that captures more of the genetic variation. On the other hand, if the breeder is concerned about gain from selection following intermating and generating a new population, the breeder should only be selecting based on the additive breeding values and realizing that the breeding values are not the complete genotypic value (commercial value), such that response to selection after intermating will be less than expected based on total genetic variance.

Effects of including G3E interaction
In general, results show that when GBLUP is used for prediction under random cross-validation CV2, models MDsl-GB and MDel-GB that incorporate G·E had higher prediction accuracy than models MDs-GB and MDe-GB also with G·E. This improvement depends on s 2 l and the magnitude of the correlations between environments. For maize data sets (HEL and USP) with positive sample correlations between environments, models MDsl-GB and MDel-GB had higher prediction accuracy than models MDs-GB and MDe-GB, whereas in wheat data set WHE1, models MDsl and MDel had better prediction accuracy than models MDs and MDe only in environments with positive correlations. Finally, for environments in wheat data set WHE5 with negligible s 2 l , the accuracy of models MDsl-GB and MDel-GB did not improve much over that of models MDs-GB and MDe-GB without l.

Effects of including the Gaussian kernel
In general, models MDs and MDe with the Gaussian kernel (GK) had higher prediction accuracy than models with GB, although these differences were smaller for models MDsl and MDel. When GK models were better than GB models, results show that s 2 l was negligible for GK models and when the prediction accuracy of MDsl and MDel was only slightly superior to that of models MDs and MDe (as in maize data set HEL). On the contrary, when using GK, the prediction accuracy was not better than when using GB, as in the case of maize data set USP; then the contribution of s 2 l was important and the prediction accuracy of MDsl and MDel was superior to that of their counterparts MDs and MDe. These results indicate that models with random intercepts are useful when used with the linear kernel (GB) but not when used with the Gaussian kernel (GK). This is because the GK method without l explains most of the genetic variance (additive and epistasis effects) between lines with negligible genetic residuals that are not picked up by the l.
The effect of the sample covariance among environments The behavior of the covariance between observations of the ith line in the jth and j'th environments explains some of the results obtained in the four data sets. The covariance between y ij and y ij ' of models MM, MDs and MDe is the same; it is determined by the genetic variance component s 2 g . It would be expected that the estimate of s 2 g would be proportional to the sample covariance of the observations. This only occurred when the sample covariances were positive because s 2 g can take only positive values; when the sample covariances between some environments are negative, this distorts the estimations of the genetic variance component (s 2 g ) and therefore affects the prediction accuracy of the unobserved phenotypes of the lines in the testing set.
On the other hand, when the sample covariance between y ij and y ij ' of models MMl, MDsl and MDel is determined by the summation s 2 g + s 2 l , then the higher s 2 l , the higher the estimated sample covariance (association) of the lines in environments and, therefore, the higher the prediction accuracy compared with those achieved by models MM, MDs and MDe (without the random effect lÞ. Again, the presence of negative sample covariances distorts the behavior of the estimated genetic variance components and this negatively affects the prediction accuracy of these models.
Models With G3E With the Kronecker product vs. models With G3E With the Hadamard product Less restrictive G·E genomic-enabled prediction models that allow any covariance value between environments had better prediction accuracy than models with more restrictive assumptions at the level of association between lines in environments affecting the estimation of the genetic variance components. Less restrictive models consider variance-covariance matrices represented by the Kronecker product of the variances and covariances of the environmental and genetic values (with the linear or non-linear kernels constructed with the markers) (Burgueño et al. 2012;Cuevas et al. 2017). When a random intercept (f Þ is added to these models based on the Kronecker product , the genomic-enabled prediction accuracy increased for random cross-validation CV2 and for environments with negative sample covariance. These advantages of the G·E genomic-enabled prediction models using the Kronecker product for defining variance-covariance environmental matrices with negative or zero environmental relationship over the Hadamard product defined by models MDsl and MDel are less when sample covariances between environments are all positive. The disadvantages of models with Kronecker products are that defining and measuring environmental stability is not clear, plus they demand higher computing resources compared to G·E genomic-enabled prediction models using the Hadamard product.

Required computing time for fitting the models
We performed all the analyses in an Ubuntu Linux server with 256 GB of RAM and 32 CPUs core. To compare the computing time, we counted the mean computing time in seconds for fitting one random partition for random cross-validation CV1 for the maize data set HELIX with the same number of 50 partitions and the same number of iterations in the model. For the models with G·E without l or f, the mean computing time for one random partition was 290, 319, and 3110 for models MDs, MDe, and MUC, respectively. For models with G·E with random intercept l or f, the mean computing time for one random partition was 489, 541, and 4938 for models MDsl, MDel, and MUCf, respectively. The differences in computing time between models MDs and MDe are low, but for model MUC, the required mean computing time needed to fit the model increased 10 times for one random partition.
Advantages and disadvantages of the proposed models In general, G·E genomic-enabled prediction models MDsl and MDel had similar prediction accuracy and, in both cases, environmental stability and G·E can be assessed and measured. Furthermore, in models MDsl and MDel, when the sample correlation among environments is positive, their prediction accuracy is similar or slightly higher than the accuracy achieved with the more flexible Kronecker product models (Burgueño et al. 2012;Cuevas et al. 2017) for the variance-covariance matrices. The advantage of models MDsl and MDel with the Hadamard product for the variance-covariance is that they can perform highly dimensional matrix operations very fast and, therefore, save time when fitting these models. The BGGE software developed by Granato et al. (2017) is indeed an example of this efficiency for fitting models MDsl and MDel by means of the Hadamard product.
When the main objective is prediction accuracy, we recommend checking for sample covariance (or correlations) between environments before using MDsl and MDel G·E genomic-enabled prediction models. Models MDsl, MDel, MDs and MDe are recommended when the sample correlations are positive and not close to zero. We also recommend fitting models MDsl, and MDel to the training set and estimating the variance component of the random intercept s 2 l ; if it is negligible, only models MDs and MDe should be used. When the number of lines in each environment is not the same, models MDsl, MDel, MDs, and MDe can be efficiently fitted with the BGGE software, whereas models MUCf and MUC of Cuevas et al. (2017) with an unbalanced number of lines in each environments require intensive computational resources.

CONCLUSIONS
Results indicate that when the sample phenotypic correlations between environments were intermediate to moderate (HEL, USP), models with G·E with random intercept l (MDsl, MDelÞ and Gaussian kernel (GK) had the advantages of other models without their disadvantages. These models allow: (i) finding regions of the chromosomes with environmental stability (Jarquín et al. 2014;López-Cruz et al. 2015), (ii) the fitted computing time is fast (Granato et al. 2017), and (iii) increasing the prediction accuracy in the CV2 to a level of the Gaussian kernels of Cuevas et al. (2016) and Sousa et al. (2017) or other more flexible models such as those used by Burgueño et al. (2012) and Cuevas et al. (2017). For sample low or negative phenotypic correlations like in data sets WHE1, WHE5, the prediction accuracy of model MUCf with GK of Cuevas et al. (2017) is the one that should be used.
Including the random intercept l for each line made it possible to capture some extra genetic variability. Models MDs and MDe assessed the complexity of the genomic G·E present in the two maize data sets (with all environments with positive correlations) by means of the Hadamard product between markers and environments as in models from Jarquín et al. (2014) (MM, and MDs) and López-Cruz et al. (2015) (MDe). For the two maize data sets with positive sample correlations among environments, the Hadamard models MM, MDs and MDe with l had similar prediction accuracies as models MUCf and MUC that use a Kronecker product for assessing G·E. The advantage of models MMl, MDsl, and MDel over models MUCf and MUC is shorter computing time when the number of lines in different environments is very unbalanced, as in the case of the two maize data sets.
For the two wheat data sets, the number of lines in each environment is the same. However, in view of the fact that the sample correlation among environments is not positive for all pair-wise environment combinations, using models MM, MDs and MDe with or without l is less favorable than using models MUCf and MUCwith a Kronecker product for modeling G·E. The reduced prediction accuracy of the Hadamard product models vs. the Kronecker product models indicated the flexibility of models MUCf and MUC for assessing complex G·E multi-environment data sets. Regardless of: (i) whether l is included or not, and (ii) the type of data set at hand (with more or less complex G·E) and the balanced or unbalanced data structure, the prediction accuracy of the Gaussian kernel was better than the prediction accuracy of the linear kernel GBLUP for all four data sets.

APPENDIX 1
The multi-environment main genotypic effect model (MM) The multi-environment model (MM) considers the fixed effects of environments (b E ), as well as the random genetic effects across environments (g) where y ¼ ðy 1 ; . . . ; y j ; . . . y m Þ 9 is a vector with the observations y j of the j th environment ðj ¼ 1; . . . ; mÞ; each of size n j , such that one line in one environment represents the y ij observation of the i th line ði ¼ 1; . . . ; n j Þ in the j th environment. The scalar m is a general mean and the vector 1 is of size P m j¼1 n j · 1: The fixed effects of the environment for the data used in this study are modeled with the incidence matrix of the environments Z E of order P m j¼1 n j · m, where the parameters to be estimated are the intercept for each environment (b E ) with the vector b E of order m · 1. Incorporating other fixed effects into the model is straightforward.
The random vector of genetic effects g follows a multivariate normal distribution with mean zero and a covariance matrix K, that is, g $ Nð0; s 2 g KÞ, where the vector g of order n · 1 represents the genetic random effects across all environments for each line; and the kernel matrix K is a symmetric semidefinite positive matrix constructed with molecular markers of order n · n. If the number of lines is the same in each environment, then n ¼ n 1 ¼ . . . ¼ n j ¼ . . . ¼ n m ; otherwise, when there are different numbers of lines in each environment, n represents the number of unique lines included in the model in some environments. The incidence matrix Z g connects genotypes with phenotypes for each environment, with order P m j¼1 n j · n. Variance component s 2 g is the genetic variance of the lines across all environments and represents the sensitivity or environmental stability. Finally, the random errors are assumed to be homoscedastic and independent, e $ Nð0; s 2 IÞ, where s 2 is the error variance.
The multi-environment single variance genotype 3 environment interaction deviation model (MDs) This model adds to the MM model the random interaction effects of the environments with the genetic information of the lines (ge ij ) (Sousa et al., 2017;Jarquín et al., 2014): The vector of random effects G·E interaction, ge, has a multivariate normal distribution, ge $ Nð0; ½Z g KZ g 9∘½Z E Z E 9s 2 ge ), where (∘Þ is the Hadamard product operator, and s 2 ge is the variance component of the G·E interaction. Matrix ½Z g KZ g 9∘½Z E Z E 9 is a block diagonal constructed with the matrices K (K 1 . . . K j . . .K m Þ for each environment; therefore, there is no exchange (borrowing) of information between environments: n Table S1. Maize HEL data set. Mean Pearson's correlation (50 partitions) of each location for random cross-validation CV1, for the multienvironment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses) Proposed models with random effects l and f n Table S2. Maize USP data set. Mean Pearson's correlation (50 partitions) of each location for random cross-validation CV1, for the multienvironment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses) Proposed models with random effects l and f  n Table S4. Wheat data set WHE5. Mean Pearson's correlation (50 partitions) of each location for random cross-validation CV1, for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses) Proposed models with random effects l and f Environment Ã MMl-GK n Table S3. Wheat data set WHE1. Mean Pearson's correlation (50 partitions) of each location for random cross-validation CV1, for the multi-environment models, main genotypic effect model (MM), single variance G3E deviation model (MDs), environment-specific variance G3E deviation model (MDe), multi-environment unstructured covariance models (MUC and MUCf) with two kernels, GBLUP (GB) and Gaussian kernel (GK) for grain yield with the proposed random effect l and without the random effect l (standard deviation in parentheses)

MDel-GK
Proposed models with random effects l and f  In WHE5 data set, environments are described by a sequence of codes: 0i, 2i and 5i denote the number of irrigations; B/F denotes whether the planting system was 'bed' (B) or 'flat' (F); N/H denotes whether planting date was normal (N) or late (H, simulating heat).