## Abstract

Animal models are generalized linear mixed models used in evolutionary biology and animal breeding to identify the genetic part of traits. Integrated Nested Laplace Approximation (INLA) is a methodology for making fast, nonsampling-based Bayesian inference for hierarchical Gaussian Markov models. In this article, we demonstrate that the INLA methodology can be used for many versions of Bayesian animal models. We analyze animal models for both synthetic case studies and house sparrow (*Passer domesticus*) population case studies with Gaussian, binomial, and Poisson likelihoods using INLA. Inference results are compared with results using Markov Chain Monte Carlo methods. For model choice we use difference in deviance information criteria (DIC). We suggest and show how to evaluate differences in DIC by comparing them with sampling results from simulation studies. We also introduce an R package, AnimalINLA, for easy and fast inference for Bayesian Animal models using INLA.

To estimate the additive genetic variance (and thus the heritability) of different kinds of traits, biologists and animal breeders often use a generalized linear mixed model (GLMM), called an *animal model*. In an animal model individual *i*’s trait *y _{i}* has a genetic part,

*u*. The value

_{i}*u*is known as the breeding value of individual

_{i}*i*. From the assumption that the breeding value is the sum of effects of many genes and from the central limit theorem, the breeding values are assumed to have a Gaussian distribution with a dependence structure given by the pedigree. Since early 1980s, animal breeders have successfully used a frequentist approach with restricted maximum likelihood (REML), to for example increase meat or milk yield in cattle (Simm 1998). However, inference with REML is not trivial for GLMM models. Models for non-Gaussian traits especially are challenging in regard to uncertainty in breeding values and other parameter estimates (Tempelman and Gianola 1994; Sorensen and Gianola 2002; Bolker

*et al.*2009; Fong

*et al.*2010). A popular approach is best linear unbiased prediction (BLUP) (Henderson 1950) for calculating breeding values (Wilson

*et al.*2009). However, BLUP ignores all the uncertainty associated with the estimation and are not suitable for hypothesis testing in evolutionary questions (Postma 2006; Wilson

*et al.*2009; Hadfield

*et al.*2010). Another approach is to perform modeling in a Bayesian framework. All parameters are then considered random variables, and it is (in theory) straightforward to account for all uncertainty jointly in parameter estimates. This solves the problems making inference for non-Gaussian traits (Tempelman and Gianola 1994; Fong

*et al.*2010) and accounting for estimation uncertainty in the breeding values. Further, Bayesian modeling also solves many of the issues regarding analysis of breeding values discussed in Postma (2006), Wilson

*et al.*(2009), and Hadfield

*et al.*(2010) as both breeding values and functions of breeding values (

*e.g.*, mean breeding values over hatch years) are considered random variables, and hence both uncertainty and dependencies are accounted for. This flexibility of the Bayesian framework has made Bayesian animal models increasingly popular. They have been used in animal breeding since early 1990s, whereas they only recently have been introduced to evolutionary biology (Kruuk

*et al.*2008; O’Hara

*et al.*2008; Ovaskainen

*et al.*2008; Hadfield 2010; Steinsland and Jensen 2010). See Gianola and Fernando (1986) and Sorensen and Gianola (2002) for a discussion of Bayesian animal models.

Except in a few special cases, Bayesian models do not have closed form analytic expression for quantities of typical interest, *e.g.*, posterior means. Hence, numerical approximations are needed. The traditional approximation procedure for Bayesian models is Markov Chain Monte Carlo (MCMC) (Sorensen and Gianola 2002, Van De Wiel *et al.* 2013). MCMC is a very flexible methodology that can be used to make inference for any Bayesian model, and we can get posterior estimates for any random variable or parameter; marginally, jointly, or functions of them. Setting up a good MCMC algorithm (quick convergence, good mixing, and computationally fast) and evaluating it (convergence and mixing) is challenging for a nonspecialist. Recently, this has improved for animal models as there are now packages available for doing inference for these models with MCMC in R (MCMCglmm; Hadfield 2010) and in BUGS (Lunn *et al.* 2000).

For hierarchical latent Gaussian Markov random field models, a nonsampling-based numerical approximation procedure, the Integrated Nested Laplace Approximation (INLA) has recently been introduced (Rue *et al.* 2009). Using INLA we can calculate marginal posteriors for all parameters and each random effect, as well as the posterior for linear combinations of random effects. INLA is based on direct numerical integration instead of simulations. Rue and Martino (2007) show for several models and datasets that INLA is much faster than MCMC and more accurate for a given computation budget. Faster inference encourages applied researchers to explore more models. Furthermore, this also opens new opportunities to do simulation studies, which for example can be used to explore identifiability issues and to set up tests of specific hypotheses. To perform model selection between GLMMs is not a trivial task (Skrondal and Rabe-Hesketh 2004), and using difference in deviance information criterion (DIC) has been questioned (Fong *et al.* 2010). We suggest using a simulation study to evaluate whether DIC is an appropriate measure for model selection. Fast simulation and inference methods are essential for simulation studies to be computationally feasible. INLA has been used in several fields of statistics, *e.g.*, survival analysis (Martino *et al.* 2011), for spatial GLMM (Eidsvik *et al.* 2009) and in disease mapping (Roos and Held 2011, Schrödle *et al.* 2011). This paper contributes to easier and faster Bayesian inference for both Gaussian and several non-Gaussian animal models by demonstrating that these models fit the INLA-framework and by providing an R-package, AnimalINLA, for doing the inference.

In the section *Materials and Methods*, we introduce the data used in the case studies. Then we briefly revise relevant requirements for using INLA and the possibilities INLA gives, and fully specify the animal models we use. We also present a framework for simulation based testing of the ability of difference in DIC to choose between models with and without genetic effects. Next, results from the synthetic case studies and the house sparrow case studies are presented. Inference is carried out using INLA and for some cases results are compared with MCMC. The article ends with a *Discussion* and *Conclusion*, where the results as well as opportunities and limitations of the INLA framework in quantitative genetics are discussed.

## Materials and Methods

### Data

For the case studies we use data from a natural metapopulation of house sparrow (*Passer domesticus*) on six islands off the coast of Helgeland, Northern Norway (66°N, 13°E). From adults and juveniles (*i.e.*, birds born the same summer) a small blood sample was collected and from adults several morphological traits were measured (including bill depth). The blood samples were used to determine genetic parenthood, and a genetic pedigree for the birds on the study islands could be established. This study system has many qualities for providing data on morphology and fitness-related traits as more than 90% of all birds on the six main study islands were individually ringed. Intensive observation and capture protocols each year gave good estimates of the lifespan of individual birds (a bird was considered dead when it was no longer captured or observed). For a more thorough description of the field work, study area and genetic parenthood analyses, see (Ringsby *et al.* 2002; Jensen *et al.* 2008; Pärn *et al.* 2009) and references therein.

For all case studies we used 1993 to 2002 as our study period, and we used the same pedigree, which consisted of the *n _{p}* = 3574 individuals that were present on the six main study islands in this period. The pedigree spanned up to seven generations. For our case studies we used individual data on (1) bill depth, (2) breeding season success, and (3) average reproductive intensity (ARI). For all birds, sex, hatch year, and hatch island were available. The animal model implicitly assumes that missing phenotype observations are missing at random, and hence we (implicitly) have assumed this. However, if the process behind the missing data are connected to the trait of interest, this could result in biased inference of additive genetic variance (Hadfield 2008).

In case study one we considered bill depth. This trait was measured each year (*i.e.*, age) for most individuals over the course of a lifetime. The proportion of individuals that have more than one measurement was 0.4, ranging from two to nine measurements in total. Bill depths were approximately Gaussian distributed (see Supporting Information, Figure S1), and we have measurements for *n _{d}* = 1025 birds. Many individuals in the pedigree had missing data for this trait because the bird did not survive until it was 1 year of age. We standardized the data to have mean 0 and variance 1.

In case study two, we considered breeding season success. If at least one of the offspring of an adult bird produced in a given breeding season survived until recruitment, we defined its breeding season a success. A recruit is an offspring that survives up to adult age, *i.e.*, 1 year of age in the house sparrow. Otherwise the breeding season was a failure. The breeding season could be a failure either because the bird did not produce any offspring, or because all its offspring died before recruitment. The data consist of pairs of values (*n _{i}*,

*y*), where

_{i}*n*is the number of breeding seasons individual

_{i}*i*had during the study period (

*i.e.*, it was alive and adult) and

*y*is the number of successful breeding seasons,

_{i}*y*≤

_{i}*n*. Individuals that died before their first breeding season (did not recruit) or that emigrated to an island not among the six main study islands have no data. There are

_{i}*n*= 1182 individuals with data. Of these approximately 71% did not have any successful breeding seasons.

_{d}In case study three, we considered data on ARI, *i.e.*, the average number of recruits an individual produced over its lifetime. Data take the form (*n _{i}*,

*y*) where

_{i}*n*is identical to

_{i}*n*in case study two, and

_{i}*y*is the total number of recruits produced in the study period. For this trait we had data for the same

_{i}*n*= 1182 individuals as in case study two.

_{d}*y*ranged from 0 to 10, with mean 0.64. 71% produced no recruits, and about 46% of the 344 individuals that produced one or more recruits produced only one. The datasets are available in File S5.

_{i}### Latent Gaussian models and INLA

In this section we give a brief introduction to latent Gaussian models and how INLAs can be used to make approximations for posterior marginals for these models. In general, latent Gaussian models are hierarchical models in which we assume a *n _{p}*-dimensional latent field

**to be point-wise observed through**

*x**n*≤

_{d}*n*data

_{p}**, . The latent field**

*y**x*includes both random and fixed effects and is assumed to have a Gaussian density conditional on hyperparameters

**:**

*θ*_{1}**|**

*x***∼ (0,**

*θ*_{1}

**Q**^{−1}(

**)).**

*θ*_{1}The data ** y** are assumed to be conditionally independent given the latent field

**and, possibly, some additional hyperparameters**

*x**θ*

_{2}. The model definition is completed by assigning a prior density to the hyperparameters

**= {**

*θ*

*θ*_{1},

*θ*_{2}}. In addition, some linear constraints of the form

**=**

*Bx**e*, where the matrix

**has rank**

*B**k*, may be imposed (Rue

*et al.*2009).

INLA provides a recipe for computing in a fast and accurate way, approximations to marginal posterior densities for the hyperparameters and for the latent variables . Such approximations are based on a smart use of Laplace or other related analytical approximations and of numerical integration schemes. As a by-product of the main computations INLA can also compute the DIC (Spiegelhalter *et al.* 2002). DIC is calculated as the expectation of the deviance over the posterior distribution (*E*^{x}^{,}**^{θ}**(

*D*(

**,**

*θ***))) plus the effective number of parameters (**

*x**p*), DIC =

_{D}*E*

^{x}^{,}

*(*

^{θ}*D*(

**,**

*θ***)) +**

*x**p*. For the class of latent Gaussian models INLA is constructed for the deviance can be expressed as , where the sum is over all observations. The posterior mean of the deviance can therefore be calculated using INLA (Rue

_{D}*et al.*2009).

*p*is approximated to

_{D}*n*−

*tr*{

**(**

*Q**)*

**θ**^{m}***(**

*Q**)*

**θ**^{m}^{−1}}, where

*n*is the number of observations and

*denotes the posterior median of*

**θ**^{m}*π*(

**|**

*θ***). It is worth to note that another common definition of DIC is to use the posterior mean (instead of median) for the parameters in the deviance. This definition is used in MCMCglmm.**

*y*For the INLA methodology to work in a fast and efficient way, latent Gaussian models have to satisfy some additional properties. First, the latent Gaussian model ** x**, often of large dimension, admits conditional independence properties. That is, it is a Gaussian Markov random field (GMRF) with a sparse precision matrix

*Q*(Rue and Held 2005). The efficiency of INLA relies, in fact, on efficient algorithms for sparse matrices. Second, because INLA needs to integrate over the hyperparameter space, the dimension of non-Gaussian

**should not be too large, say ≤ 14, due to the numerical integration scheme and optimization methods used. Finally, each data point**

*θ**y*depends on the latent Gaussian field only through the linear predictor

_{i}*η*=

_{i}*g*(

*μ*) where

_{i}*g*(⋅) is a known link function and

*μ*=

_{i}*E*(

*y*|

_{i}**,**

*x***),**

*θ**i.e.*,

*π*(

*y*|

_{i}**,**

*x***) =**

*θ**π*(

*y*|

_{i}*η*,

_{i}**). INLA presents several advantages over MCMC based inference: it provides accurate results in just a fraction of the time needed by smart MCMC algorithms, and it does not require convergence diagnostics. Moreover, the R-INLA package (available at www.r-inla.org) makes inference from GRMF models using the INLA methodology easy.**

*θ*### Animal models

In this section we show that animal models are latent GMRF models, which fits into the INLA framework (see the section *Latent Gaussian models and INLA*). Moreover, we describe in detail the different versions of animal models for which we are interested. Because the focus of this article is to emphasize INLA as a method of inference for animal models, the models in the case studies and synthetic studies are kept simple. For a more in depth introduction to animal models, see (Sorensen and Gianola 2002).

In general, an animal model is a generalized linear mixed model; the observed trait *y _{i}*,

*i*= 1, …,

*n*belongs to an exponential familywhere the expected value

_{d}*μ*=

_{i}*E*(

*Y*) is linked to a linear predictor

_{i}*η*through a known link function

_{i}*g*(⋅), so that

*g*(

*μ*) =

_{i}*η*. The linear predictor

_{i}*η*accounts for the effects of various covariates and the breeding value in an additive way; (1)where is an intercept, are

_{i}*fixed effects*,

*u*individual

_{i}*i*’s breeding value,

*ε*its residual effect, and a known incidence vector. The

_{i}*fixed effects*(in a frequentist’s framework) account for group-specific effects such as

*e.g.*, sex, year of birth, and locality or subpopulation. In a Bayesian framework all parameters are treated as random variables, but out of convenience we refer to

**s as fixed effects. The breeding values are genetically linked random effects also known as additive genetic effects. The residual effects are unstructured Gaussian random effects, often named the environmental effect in quantitative genetics. We assign a Gaussian prior to**

*β***: , where**

*β***is the identity matrix. The residual effects are , where is often referred to as environmental variance.**

*I*The breeding values for the population, , are assumed to have a dependency structure corresponding to the pedigreewhere ** A** is the relationship matrix and is the additive genetic variance (see

*e.g.*, Lynch and Walsh 1998, Sorensen and Gianola 2002). A GMRF is a multivariate Gaussian model with a conditional independence structure, also known as a Markov structure. The pedigree imposes a Markov structure. If we are interested in individual

*i*’s breeding value, and we know its parents, offspring and the other parent(s) of its offspring, other individuals’ breeding value do not give us any extra information. Because of the fact that the breeding values forms a GMRF, the inverse of the relationship matrix,

**, is a sparse matrix (Steinsland and Jensen 2010).**

*A*^{−1}**can be calculated from the pedigree (Quaas 1976).**

*A*^{−1}Note that there might be more individuals in the pedigree than individuals with observations, *n _{d}* ≤

*n*, and we have assumed an indexing such that

_{p}*u*corresponds to

_{i}*y*. The hyperparameters are assigned inverse gamma priors. Furthermore, to avoid identification problems we include a common intercept and constrain the levels of each factors to sum to zero (see Steinsland and Jensen 2010).

_{i}The animal model as described previously is a latent GMRF model where the latent field is ** x** = (

**,**

*β***) and the hyperparameter vector**

*u**θ*includes the variances and, possibly, the parameters in the likelihood function. The precision matrix for the latent field

*x*is sparse because the inverse of

**is sparse. Moreover, the likelihood of each data point depends on the latent field only through the linear predictor**

*A**η*defined in equation 1. Therefore, INLA can be applied to the animal model.

_{i}In our analyses we might be interested in marginal posterior for individual breeding values, *u _{i}*, fixed effects

**, the additive genetic variance , the residual variance , the heritability**

*β**h*

^{2}, or to evaluate the model using DIC. The heritability is loosely speaking the proportion of the variability the genes account for in a phenotypic trait. Precise definitions of heritability are given in subsequent subsections. In addition, it might be interesting to look at linear combinations of breeding values , where

*w*are weights, for example the mean of breeding values for different cohorts.

_{i}#### Animal model for Gaussian data:

For many continuous traits, such as the bill depth of house sparrows, it is natural to assume a Gaussian likelihood with an identity link function, *η _{i}* =

*μ*. The animal model can then be written as: , where the linear predictor is modeled as in (1) and the variance is the variance of the residual effects. The model can be formulated in the INLA framework with likelihood and latent field .

_{i}In many datasets there are repeated measurements for some individuals. A common modeling approach is then to include an individual specific random effect *ind _{i}* for each individual. Let

*y*denote measurement

_{ij}*j*of individual

*i*. The likelihood is , and the latent field . This redefines the variance interpretation, and we can interpret as a special environmental variance (variation of the individuals’ trait values through life), and as the measurement error or unexplained (environmental) variance (Lynch and Walsh 1998).

In general, and in the Gaussian case, the narrow sense heritability is defined as the proportion of the phenotypic variance that is caused by additive genetic variance (Lynch and Walsh 1998) (2)Although it is easy to use the INLA methodology to compute posterior marginals of hyperparameters, posteriors for functions of more than one hyperparameter, *e.g.*, *h*^{2}, become computationally inconvenient. This can be solved by reformulating the model in the INLA framework (see File S1 for this model formulation, and how to use it).

#### Animal model for binomial data:

With binomial data, the animal model is defined as: *y _{i}* ~ Bin(

*n*,

_{i}*p*)

_{i}*i*= 1, …,

*n*, where

_{d}*n*is the numer of trials and

_{i}*p*is the probability of success. Moreover, we assume a logit link function, so that the linear predictor is defined as: . The linear predictor is then modeled as in (1). In the binary case (

_{i}*n*= 1) the variance of the non-structured random effect is confounded with the link, and is not identifiable (Sorensen and Gianola 2002) because the individual effects are already accounted for through the link and the likelihood. Therefore, we omit

_{i}*ε*from the linear predictor and use this linear predictor for all binomial models

For binomial data, it is not immediately obvious how to define the heritability of the trait. The most common definition is derived from the idea that there exists a latent (unobserved) continuous trait called liability *l _{i}* such that we register a success if

*l*< 0 and a failure if

_{i}*l*> 0 (Dempster and Lerner 1950). The definition of heritability depends also on the type of the link function and in the case of the logistic function it is (4)were is the variance of a logistic variable (see Vazquez

_{i}*et al.*2009). Note that the heritability on the latent scale does not correspond to the proportion of explained variance in the phenotype,

*e.g.*, the binomial data. For a discussion on heritability for non-Gaussian traits, see (Dempster and Lerner 1950, Visscher

*et al.*2008). The binomial animal model is a latent Gaussian model with only one non-Gaussian hyperparameter, . The heritability, as defined in (4), is a function of only one random variable, , and can therefore easily be calculated from ’s marginal posterior distribution.

#### Animal model for (zero-inflated) Poisson data:

Count data are often modeled as Poisson distributed: *y _{i}* ~ Poisson (

*μ*) with

_{i}*μ*=

_{i}*E*, where

_{i}λ_{i}*E*is the known exposure,

_{i}*e.g.*, the lifetime, and

*λ*is the intensity,

_{i}*e.g.*, the annual reproductive success. We assume an exponential link function

*η*= log(

_{i}*λ*), and model the linear predictor

_{i}**as in (3).**

*η*Datasets which are almost Poisson, but have too many zero-observations, often occur. Then a zero-inflated Poisson (ZIP) distribution might be useful. ZIP models are a mixture of a Poisson distribution and a distribution with point mass one at zero. There are several versions of zero-inflated Poisson, we will use *ZIP*(*p*, *μ _{i}*) defined as: Prob(

*y*|…) =

*p*× 1

_{[}

_{y}_{=0]}+ (1 −

*p*) × Poisson(

*y*;

*μ*), where 1

_{i}_{[}

_{y}_{= 0]}is an indicator function and Poisson(

*y*;

*μ*) indicates the Poisson likelihood with mean

_{i}*μ*, and

_{i}*p*is the proportion of extra zeros. Poisson and zero-inflated Poisson animal models are latent Gaussian fields with hyperparameter vectors and , respectively.

In the Poisson case it has been proposed that the heritability on the log scale can be defined as (Foulley *et al.* 1987, Matos *et al.* 1997, Vazquez *et al.* 2009) (5)where *λ* is the average intensity; .

The heritability (5) is then a function of one hyper-parameter and the random variable *λ* which is a linear combination of functions of predictors *η _{i}*. Such a quantity is (at least currently) not possible to compute using R-INLA. An approximated estimate of

*h*

^{2}can be computed by using a point estimate for

*λ*together with the marginal posterior of . The point estimate can either be calculated directly from data, or by plugging in point estimates for the predictors

**. With this model we calculate the heritability of the intensity,**

*η**e.g.*, ARI, although the data are individual lifetime reproductive success. If the heritability of lifetime reproductive success (LRS) is of interest, this can be estimated by setting the exposure

*E*= 1 (and only using individuals that are uncensored at either end of the study period).

_{i}#### Simulation-based evaluation of DIC:

We often want to check whether an additive genetic effect should be included in the model or not. In a classical approach to hypothesis testing, this corresponds to a null hypothesis of no genetic effects (no heritability), and an alternative hypothesis of some genetic effect (heritability). This can be seen as a model selection between a model without genetic effects and an animal model. That is, in our formulation from the section *Animal Models*, between a model with some likelihood and latent field, (6)or

To evaluate the ability of the difference in DIC to correctly choose between these models we suggest the use of simulations. We use the pedigree, explanatory variables, priors, and missing structure of the dataset and a model we want to compare. To find an estimate of the probability of type I error, concluding that there are genetic effects when in truth there is none, we sample *S* new data sets from a model without genetic effects, *i.e.*, with (6), and impose the same missing structure as in the original data set. For each of these *S* data sets we fit both models, *i.e.*, with and without genetic effects, and find the difference in DIC, ΔDIC = DIC model (6) − DIC model (7). The resulting *S* values of ΔDIC is an approximation to the sampling distribution of ΔDIC under the null hypothesis. If we use the recommended limit of ΔDIC > 10 to reject the null-hypothesis, we can find an estimate of the corresponding significance level from the proportion of the *S* ΔDIC values larger than 10. We can also chose a significance level *α*, and find the corresponding limit of ΔDIC from the simulations.

The other important quantity regarding hypothesis tests is the power function of the test, *i.e.*, the probability the *H*_{0} is rejected when there are some genetic effects. To estimate the power for a specific value of or *h*^{2}, we follow the same simulation approach, except that simulations are now done from a model with genetic effects, *i.e.*, from (7) with as chosen. The proportion of these *S* Δ*DIC* values larger than our chosen limit is an estimate of the power for a given .

## Results

### Synthetic case studies

In this section we illustrate the INLA methodology using a series of synthetic case studies for the models (described in the section *Animal models*). We report here results for the Gaussian and the Binomial model. For corresponding results for the Poisson model see Table S1.

To make our simulated data set as realistic as possible we do the following: first, we simulate data based on the pedigree of the house sparrow dataset with *n _{p}* = 3574 individuals (as described in the section

*Data*). Second, we replicate in the simulated data set the same missing data structure that we find in the house sparrow data set. Inference is done using the AnimalINLA package. See File S3 for R code. As priors for , , and we use inverse gamma distribution InvGamma(

*a*,

*b*) (parametrized such that it has mean and variance ). InvGamma(0.5, 0.5) for and and InvGamma(

*e*,

*e*

^{−10}) for .

#### Synthetic Gaussian case study:

In our first experiment we simulated Gaussian data from: (8) (9)where , and ** A^{−1}** is computed from the house sparrow pedigree.

We simulate data for *β*_{0} = 0 and values of and between 0 and 1 such that . Moreover, we assume as missing all measurements that are missing for bill depth in the house sparrow data set. We follow Steinsland and Jensen (2010) and fit the model assuming a sum to zero constraint on the breeding values, . In this experiment we are interested in the variance parameters and use the model formulation described in the section *Animal model for Gaussian data* (*i.e.*, MF1 in File S1).

Figure 1A shows the estimated posterior mean together with standard deviations, and the 95% credible interval (CI) for and . The posterior means are quite close to the true values of and , with small standard deviations and 95% CIs that contain the true value. However, for small values for (<0.1) the posterior mean of the genetic variance seems to be systematically different from the parameter value used in the simulations. We will refer to this as a systematic error. It is caused by influential priors (discussed in the house sparrow case studies). As the likelihood is Gaussian, the Laplace approximation is exact and hence its accuracy comes from the numerical integration scheme.

For each simulated data set we also fit a model without genetic effect, hence where the model in (8) and (9) simplifies to: (10) (11)We now test, using ΔDIC whether a model with or without genetic affects will be chosen. The ΔDIC results are presented as stars (∗) in Figure 1B, and we find that using a limit ΔDIC > 10 we chose the animal model for .

To evaluate the ability of ΔDIC to choose between models, we use the simulation methodology suggested in the section *Simulation-based evaluation of DIC*. We use the same sets of as in the synthetic case study and simulated *S* = 100 synthetic data sets for each parameter sets. Boxplots of the corresponding ΔDIC are found in Figure 1B, whereas the power of the test for the different sets of are given in Figure 1A. We first notice that the significance level (the power for ) is approximately 0.18. The power rapidly increases, and for it is 0.77, and already for it is 0.98. Hence, for Gaussian data with pedigree and missing structure as the house sparrow case study using difference in DIC as model selection criteria we have a good chance of identifying additive genetic variance above 0.1.

#### Synthetic binomial case study:

Binomial data can be challenging to analyze, especially when the number of trials *n _{i}* is very low (Fong

*et al.*2010). To analyze the performance of INLA for binomial data we have carried out different simulation studies and compared the estimates obtained with INLA with those obtained using MCMC (MCMCglmm; Hadfield 2010). We simulate data from the model

*y*|

_{i}*p*~ Bin(

_{i}*n*,

_{i}*p*) with a logit link function

_{i}*η*= logit(

_{i}*p*) = α +

_{i}*u*. Where , and

_{i}**is computed from the house sparrow pedigree. We simulate data for**

*A*^{−1}*α*= 0 and values of such that the corresponding heritability, computed as in equation (4), varies between 0 and 1.

In our first experiment we let *n _{i}* = 1, ∀

*i*= 1, …,

*n*, hence we have binary data for all the individuals in the pedigree. This case is, in general, particularly difficult, because with no replicates for any of the individuals the genetic variance is not identifiable (Sorensen and Gianola 2002). When we look at the posterior estimate for , we see that the performance of INLA is quite bad (see Figure 2A). Looking at Figure 2A, we find that the posterior of the heritability differs between MCMC and INLA, MCMC follows the true value, INLA does not. INLA is based on a Gaussian approximation of the log-likelihood functions which, in this case, has a very non-Gaussian distribution, and the Laplace approximation is poor. Although MCMC follows the true value is has quite large credible intervals. The dependency structure induced by the house sparrow pedigree is not strong enough to allow for a precise estimation of the genetic variance. In practice, it is almost impossible to distinguish between cases with high and low heritability of the binary trait. We further find that for small heritability the estimates have systematic errors, as in the Gaussian case.

_{p}The performance of INLA improves very quickly with increasing number of trials. In our second experiment we let *n _{i}* = 2, ∀

*i*= 1, …,

*n*, hence we have two trials for each individual in the pedigree. In this case, the presence of replicated measures makes it possible to estimate the genetic variance more accurately. Figure 2B shows that the posterior means computed by INLA are very close to those computed using MCMC and close to the true value of

_{p}*h*

^{2}. We still see a small systematic error for small values of

*h*

^{2}but not such that it would be problematic in a real data scenario. Even better estimates are obtained in the third experiment where the number of trials

*n*changes from individual to individual in the pedigree and is randomly sampled between 1 and 9 (see Figure 2C).

_{i}In the last experiment the number of trials *n _{i}* is as in the house sparrow breeding season success data set (see the section

*Data*). Moreover, in the simulated data we reproduce the same missing structure as in the real data set. In this experiment the number of trials is sampled uniformly between 1 and 9, and there are 1182 individuals with data. That is, for more than 65% of the individuals in the pedigree the trait under consideration was not recorded. Results shown in Figure 2D, are similar to those for the two previous cases. The estimates seem to be rather accurate with a small systematic error for very small values of the heritability. Moreover, results from INLA agree well with those from MCMC. In this experiment we have larger CI around the posterior mean when compared to the one in Figure 2C. This is attributable to the presence of missing data.

To test prior sensitivity, we performed a sensitivity analysis for all three likelihoods and for both no heritability (*h*^{2} = 0), and high heritabilities. Each dataset was analyzed with five different priors for and, when relevant, ; InvGamma(*a*, *b*) with a = b = 0.0001, 0.01, 0.5, 1, 10. The sensitivity analyses are described in File S2 and results are shown in Figure S3. The general findings are that for low heritabilities the results are very prior sensitive, while for higher heritabilities only extremely informative priors (*a* = *b* = 10) change the posterior considerably.

### House sparrow case studies

In this section we analyze the data introduced in the section *Data* using the animal models in the section *Animal models*. To perform inference we use INLA, described in the section *Latent Gaussian models and INLA*. We have three case studies; bill depth, breeding season success, and ARI. For each case study, we first do model comparison using DIC to choose which fixed effects (sex, hatch year, and hatch island) and random effect (additive genetic effect) to include in our model. For the best model we do further analysis according to the chosen model and the case study. This includes estimating parameters, heritability and mean breeding values for each cohort.

For all models we use these priors: , and (when needed) . To choose the best model, we start with the full model and remove one variable at the time in a stepwise manner. In each step all nested models are examined, but only the one with lowest DIC (*i.e.*, the best one at each step) is reported in Table 1.

### Bill depth

Bill depth is a Gaussian trait, and we use the animal model described in the section *Animal model for Gaussian data*. We first looked at bill depth when first (time) measured and age of the individual at that time. The full model can be written as; *η _{i}* =

*β*

_{0}+

*β*

_{sex(}

_{i}_{)}+

*β*

_{year(}

_{i}_{)}+

*β*

_{island(}

_{i}_{)}+

*β*

_{age}

*x*

_{age}(

*i*) +

*u*, where

_{i}*sex*,

*year*, and

*island*are group-level factors and

*x*

_{age}(

*i*) is a linear covariate with same prior as for

**. The results from the model selection procedure are presented in Table 1. We see that the full model without age as a linear effect turns out to be the best; (12)Our further analyses for bill depth are based on this model.**

*β*We find the marginal posterior distribution for the variances; has posterior mean 0.24 (SD 0.05) and 95% CI 0.15−0.35. For we get a posterior mean 0.47 (SD 0.05) with 95% CI 0.39−0.58. We also calculate the marginal posterior of *h*^{2} using MF2 (see File S1); mean 0.35 (SD = 0.07) with 95% CI 0.21−0.48. The posteriors for , , and *h*^{2} are plotted in Figure 3, A and B.

To explore interesting features and evolutionary processes that may influence our study system we investigate trends in the breeding values over years (Sorensen *et al.* 1994), by estimating linear combinations of breeding values. We find the posterior distribution of average breeding values for each hatch year *year* (*i.e.*, cohort); , where *n _{year}* is the number of individuals with hatch year

*year*, and the sum is over all these individuals. We fit the model specified in equation 12 (which includes hatch year as a factor), and the results are given in Figure 3C (posterior of

*β*) and Figure 3D (posteriors of mean breeding values for each cohort). We also calculate the posterior of the difference in average breeding values between the first (1993) and last (2002) cohorts in the study;

_{year}*a*=

_{diff}*a*

_{1993}−

*a*

_{2002}, which gave a posterior mean −0.025 (SD 0.068) and 95% CI −0.161 – 0.108. The posterior marginal of the difference is given in Figure S5. From Figure 3, C and D we see that almost all the differences in the phenotype seems to be explained by the year specific fixed factor

*β*, and from Figure 3D there seems to be no trends in the breeding values, and hence no microevolution is going on. This is further supported by the posterior of the difference in breeding values between 1993 and 2002 cohorts

_{year}*a*

_{diff}|

*y*, where 0 is well within a 95% credible interval.

We also model an individual specific random effect *ind _{i}*, to include repeated measurement

*j*for individual

*i*, with latent field

*η*=

_{i}*β*

_{0}+

*β*

_{sex(}

_{i}_{)}+

*β*

_{year(}

_{i}_{)}+

*β*

_{island(}

_{i}_{)}+

*u*+

_{i}*ind*, where

_{i}*ind*is a independent random effect and . When including individual as a random effect, we find that the marginal posterior distribution for the variances; has posterior mean 0.25 (SD .05) and 95% CI 0.17−0.37. For we get a posterior mean 0.42 (SD 0.02) with 95% CI 0.38−0.46. For we get the posterior mean 0.13 (SD 0.04) and 95% CI 0.07−0.21.

_{i}### Breeding season success

Breeding season success is the number of breeding seasons that is a success, *i.e.*, results in at least one recruit. These data are in nature binomial, and are analyzed using the animal model in the section *Animal model for Binomial data*. We start with the full model: (13)where the elements are described under equation 12. Results from the model selection procedure are reported in Table 1. We find that the best model does not include linear additive genetic effects, and hence that the inherited part of breeding season success probably is very close to zero.

However, if we use the full model (equation 13) to estimate we get posterior mean 0.13, SD 0.05, and 95% CI 0.07−0.24. Furthermore, using (4) gives posterior heritability with mean 0.04 (SD 0.01) and 95% CI 0.02−0.07. These estimates are similar to those from the synthetic dataset when heritability is equal or close to zero in the section *Synthetic binomial case study* (Figure 2).

### Average reproductive intensity

ARI for an individual is the average number of recruits it produces during its lifetime. Thus, we are modeling data on LRS in such a way that lifetime is controlled for and the likelihood of any estimate of heritability will be for the ARI. This is count data, and we analyzed this trait by using the animal model in the section *Animal model for (zero-inflated) Poisson data* with *E _{i}* =

*n*, where

_{i}*n*is the number of breeding seasons individual

_{i}*i*was alive during the study period. Because of the large number of zeros, we suspected that we needed a model that accounts for overdispersion. Therefore, we first fitted the full model as in equation 13 with two different likelihood models; Poisson (DIC = 2421.465) and zero-inflated Poisson (DIC = 2275.140). Because zero-inflated Poisson gave lowest DIC, we proceeded with this likelihood when choosing which fixed and random effects to include in the model. Also the histogram of LRS divided by lifespan indicated a zero-inflated Poisson distribution (see Figure S6). The model with lowest DIC is the full model (equation 13), although very close in DIC to the model without additive genetic effects (Table 1). We proceeded with the full model in our analysis of ARI. Hence, the results suggest that annual reproductive intensity might be heritable. Accordingly, the posterior for is 0.11 (SD 0.03) with 95% CI 0.06−0.18. To obtain the posterior of

*h*

^{2}defined as in (5), we plugged in the point estimate . This gives a posterior mean of the heritability of 0.03 (SD 0.01), 95% CI 0.02−0.05. Posterior distributions for and

*h*

^{2}are given in Figure 4.

The models used in the case studies can be extended. For example when modeling breeding season success, we might want year as a specific explanatory variable. This can be done by modeling it as repeated binary trait (each breeding season). When one uses binomial and Poisson likelihoods overdispersion is often a challenge. Common solutions are to include a random effect to the latent field or to use beta-binomial and negative binomial likelihoods, respectively, instead. All these options are available in INLA and R-codes for how to implement a random effect for Gaussian, binomal and Poisson likelihoods are presented in File S4. For small values of we observed systematic errors and prior sensitivity in all case studies. Priors for variances are discussed in Gelman (2006), and this topic should be further investigated, but it is outside the scope of this work.

### Computation time

To compare the computation time used by INLA and MCMC, inference for the Gaussian case study and for the synthetic Gaussian case study for a large pedigree was performed with both MCMCglmm and INLA. All computation times reported are for a dual-core 2.67-GHz laptop. We visually inspect the posteriors of and of INLA and MCMCglmm for an increasing number of iterations ((10.000, 100.000, 200.000) for the Gaussian case study and (10.000, 100.000, 500.000) for the synthetic Gaussian case study). MCMCglmm gave the same estimates as INLA (see Figure S2 and Figure S4). For the Gaussian case study, the computation time for INLA was 7 sec for both model formulations for Gaussian data. For 200,000 iterations MCMCglmm used 17 min to achieve about the same accuracy as INLA (the Monte Carlo error is, however, still visible).

To demonstrate the fast inference of INLA, we created a large pedigree from the existing house sparrow pedigree by merging 28 identical pedigrees and simulated data based on this pedigree with *n _{p}* = 100,072 individuals. We simulated Gaussian data for

*β*

_{0}= 0 and for and as in section

*Synthetic Gaussian case study*, with data for all individuals as in the pedigree. The computation time for INLA was 7.4 min. To achieve approximately the same accuracy as INLA, 500,000 iterations in MCMCglmm were needed; this took 17.9 hr, which is 145 times longer than INLA. As shown in Figure S4, we found that even for 500,000 iterations, the Monte Carlo error was still visible.

## Discussion

We compared inference obtained for animal models by using INLA and MCMC. The general conclusion is that INLA is a fast and accurate approximation method that gives us the opportunity to perform simulation studies to explore models and identifiability issues. However, INLA is less flexible than MCMC methods, and we experienced this in case study three (ARI-data, Poisson likelihood), for which we were not able to calculate the heritability as defined in (5) using INLA. Although an approximated estimate could be obtained, in the Gaussian case heritability estimates can be obtained with INLA by using a tailored reparametrization. In the synthetic case study of binary traits, we experienced that INLA performed poorly for the additive genetic variance (for the pedigree we have used). Hence, we recommend that one should not use INLA for a binary animal model unless a simulation study suggests that INLA gives correct results for the pedigree and missing data structure of the case in question.

Our study of average breeding value for bill depth (Figure 3D) did not indicate any change across cohorts. The posteriors of the levels of the factor *year*; (*β _{year}*, Figure 3C) suggests that the observed phenotypic change is influenced by changes in the environment. Note that the estimates of linear combinations we obtain here take into account dependencies and uncertainties of breeding values and parameters. Hence, they do not suffer from the same systematic errors as when using regression on BLUP estimates obtained from REML-based analyses as discussed in Wilson

*et al.*(2009) and Hadfield

*et al.*(2010).

When we extend the animal model for bill depth to account for repeated measurements and use individual as a random effect, we find that decreases. Some of the variance can now be explained by the individual effects, and the residual error can be interpreted as a measurement error and/or individual changes in bill depth during a house sparrow's lifetime. The additive genetic variance did however remain the same.

The evolutionary process of selection may act on multiple (genetically linked) traits simultaneously (Lande and Arnold 1983). Hence, genetic correlations may impose constraints on the evolution of a given trait. A multivariate animal model which incorporates multiple traits simultaneously to estimate the genetic correlation between traits (*i.e.*, the additive genetic variance-covariance matrix) is therefore often of interest (Meyer 1991; Lynch and Walsh 1998; Kruuk 2004; Jensen *et al.* 2008). This extension of the animal model is in principle possible using the INLA methodology, but is limited by the number of hyperparameters and is not yet implemented in the software.

A growing interest in quantitative genetics is for instance the use of genome-based data, such as high-density single-nucleotide polymorphism, in combination with a pedigree to be able to do better predictions of phenotypes (*e.g.*, Goddard and Hayes 2009). The model by Yang and Tempelman (2012) uses pedigree information together with a single-nucleotide polymorphism model that takes into account the within chromosome dependency. With some adjustments it will fit the INLA framework, but this requires further research.

Both breeding season success and annual reproductive success are traits closely related to fitness. Fitness related traits have previously been found to be largely influenced by the environment and thus have low heritability (Merilä and Sheldon 2000). In our study the breeding season success was not found to be heritable, and annual reproductive success had very low heritability. Consequently, our results coincide with other studies in natural populations (see Jones 1987; Merilä and Sheldon 2000), finding low heritability for fitness-related traits. Note however that the environmental variance of fitness related traits usually is large, and that this will result in a low heritability of such traits even if they have some additive genetic variance (*e.g.*, Foerster *et al.* 2007). Consequently, despite their low heritability these traits may still have evolutionary potential (Hansen *et al.* 2011).

In this work we demonstrated that INLA provides a suitable methodology for performing inference for a range of animal models. We have showed that because of the fast and accurate inference of INLA, we are able to use simulation studies for large pedigrees to examine different models and demonstrated the applicability of examining the difference in DIC as a method for model selection when using simulation studies. In our case studies we considered models with additive genetic effects (breeding values ** u**), individual effects (environmental effects

**), and several fixed effects. These case studies required animal models with Gaussian, Binomial (with logit link), Poisson, and zero-inflated Poisson (with log link) likelihoods. Animal models might have a range of likelihoods. The R-INLA software also support different zero-inflated Gaussian and Binomial likelihoods, survival models (exponential, Weibull, and Cox likelihoods), Student’s**

*ε**t*, and skew-normal likelihoods (see www.r-inla.org). It is also straightforward to make inference with INLA for animal models extended with other additive random effects, such as maternal effects or litter effects, as well as covariates.

The R-package AnimalINLA has been developed for performing inference using INLA for animal models with likelihoods applied in this paper. It can be downloaded at www.r-inla.org. This package includes functionality for calculating the inverse of the relationship matrix ** A** from a pedigree and simulation of data from pedigree. Furthermore, there are tailored functions for finding posteriors for , , the heritability for Gaussian, binomial and Poisson likelihoods and linear combinations such as . These functions use R-INLA with suitable default settings. The R-INLA code is also included to give a good starting point to users who wants to make modifications,

*e.g.*, other likelihoods or more random effects. Through providing easy to use software which gives results fast we hope Bayesian animal models become accessible to a wider audience of biologists and animal breeders.

## Acknowledgments

We thank Håvard Rue for customizing the R-INLA and Håvard Rue, Gregor Gorjanc, and Andrew Finley for fruitful discussions. We also thank colleagues at the Centre for Biodiversity Dynamics for help with collecting the house sparrow data. Part of the work was performed while A.M.H. and I.S. visited the Statistical and Applied Mathematical Sciences Institute. This study was supported by Norwegian Research Council (Storforsk, Strategic University Program in Conservation Biology, and grants no. 191847 and 221956).

## Footnotes

Supporting information is available online at http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.113.006700/-/DC1

*Communicating editor: D.-J. De Koning*

- Received January 30, 2013.
- Accepted May 13, 2013.

- Copyright © 2013 Holand
*et al.*

This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.