Immune-Related Functions of the Hivep Gene Family in East African Cichlid Fishes

Immune-related genes are often characterized by adaptive protein evolution. Selection on immune genes can be particularly strong when hosts encounter novel parasites, for instance, after the colonization of a new habitat or upon the exploitation of vacant ecological niches in an adaptive radiation. We examined a set of new candidate immune genes in East African cichlid fishes. More specifically, we studied the signatures of selection in five paralogs of the human immunodeficiency virus type I enhancer-binding protein (Hivep) gene family, tested their involvement in the immune defense, and related our results to explosive speciation and adaptive radiation events in cichlids. We found signatures of long-term positive selection in four Hivep paralogs and lineage-specific positive selection in Hivep3b in two radiating cichlid lineages. Exposure of the cichlid Astatotilapia burtoni to a vaccination with Vibrio anguillarum bacteria resulted in a positive correlation between immune response parameters and expression levels of three Hivep loci. This work provides the first evidence for a role of Hivep paralogs in teleost immune defense and links the signatures of positive selection to host–pathogen interactions within an adaptive radiation.

in lakes) (Scharsack et al. 2007), which is furthermore supported by a correlation between major histocompatibility complex (MHC) genotype and foraging habitat in benthic and limnetic ecomorphs (Matthews et al. 2010). Given the recognized evolutionary importance of the immune system (Janeway et al. 2001;Rodríguez et al. 2012) and the range of available functional and theoretical knowledge, the next step would be to assess to which degree immune genes contribute to, or even trigger, macroevolutionary events such as divergence, rapid speciation, and adaptive radiation.
East African cichlid fishes are a classic example of adaptive radiation (Schluter 2000). Because of their great numbers of closely related endemic species and their high levels of phenotypic and ecological diversity, cichlids are an important model system to study the genetic basis of diversification, adaption, and speciation (Kornfield and Smith 2000;Kocher 2004;Seehausen 2006;Salzburger 2009;Santos and Salzburger 2012). Previous studies of cichlid adaptive radiations have mainly focused on the understanding of ecologically important traits (and their genetic basis), such as the feeding apparatus (Terai et al. 2002b;Albertson et al. 2005;Fraser et al. 2008;Fraser et al. 2009;Muschick et al. 2011;Muschick et al. 2012), as well as on sexually selected traits such as coloration and pigmentation (Terai et al. 2002a;Terai et al. 2003;Salzburger et al. 2007;Roberts et al. 2009). Fewer studies have addressed the evolution of the immune system or, more generally, physiology in relation to diversification and rapid speciation in cichlids (Blais et al. 2007;Gerrard and Meyer 2007;Dijkstra et al. 2011). Dijkstra et al. (2011), for example, showed that divergence in coloration is accompanied by differentiation in immune function in Lake Victoria cichlids, and divergence in alleles of the MHC has previously been proposed as trigger of speciation in Lake Malawian cichlids through MHC-mediated mate choice (Blais et al. 2007). Several genes related to the immune system, including MHC loci, have been found to show signs of positive selection in East African cichlids (Gerrard and Meyer 2007), suggesting a role for immune genes during cichlid adaptive radiations.
In this study, we focused on the function and molecular characterization of a novel family of immune genes in (cichlid) fishes, which have been implicated to have immunological and developmental functions in mammals and insects (Seeler et al. 1994;Wu et al. 1996;Torres-Vazquez et al. 2001). In a previous study that focused on a candidate gene family for neural crest-derived structures in cichlids (i.e., the endothelin family of ligands and receptors), we detected strong signatures of positive selection in a gene adjacent to one of the focal loci, the zinc finger protein human immunodeficiency virus type I enhancer-binding protein 1 (Hivep1) (Diepeveen and Salzburger 2011). Hivep1 is a transcription factor with functions in a variety of biological and developmental processes, e.g., HIV-1 gene expression (Maekawa et al. 1989;Muchardt et al. 1992;Seeler et al. 1994), in the Decapentaplegic signaling pathway important for cell fate specification during embryogenesis (Grieder et al. 1995;Dai et al. 2000;Marty et al. 2000;Torres-Vazquez et al. 2001), in V(D)J recombination of immunoglobulins (Wu et al. 1993;Wu et al. 1996), and in MHC enhancer binding (Baldwin et al. 1990;William et al. 1995). Although a single copy of this gene is found in Drosophila, mammals are typically characterized by three copies (Hicar et al. 2001;Dürr et al. 2004). Teleost fish, however, possess up to five duplicates (see Braasch et al. 2009), which is in accordance with the 3R hypothesis of a teleost-specific genome duplication event after the 2R duplications in the vertebrate linage (Sidow 1996;Taylor et al. 2003;Meyer and Van De Peer 2005;Volff 2005).
The goal of the current study was threefold. First, we characterized the signatures of selection (i.e., d N /d S ratios) in the five Hivep paralogs in 40 East African cichlid fish species to determine whether adaptive protein evolution is commonly observed in the Hivep gene family. To this end, we performed phylogenetic analyses of the Hivep loci and estimated d N /d S ratios on both codon sites and in individual cichlid lineages. Second, we examined the role of the Hivep paralogs in the immune defense in the cichlid Astatotilapia burtoni. We evaluated the functional connection between Hivep expression levels and several cellular immune parameters after an experimental vaccination with Vibrio anguillarum bacteria. This fish pathogen was chosen to simulate a novel immune challenge, as the host was expected to be immunologically naïve against these Vibrio bacteria. Finally, we examined putative pleiotropic developmental functions through analyses of cisregulatory regions to obtain insights into other functions of the Hivep paralogs in teleosts that could be linked to the observed signatures of adaptive protein evolution and related our findings to the explosive speciation events in East African cichlid fishes.

MATERIALS AND METHODS
Sampling, DNA and RNA extraction, and housing conditions Samples for the DNA analyses were collected during two expeditions to Lake Tanganyika in 2007 and 2008 using a standard operating procedure described by Muschick et al. (2012). In total, 40 different cichlid species from 14 different lineages, including all major cichlid lineages in East Africa (so-called tribes) (Muschick et al. 2012) were examined (Supporting Information, Table S1). RNA for the gene expression assays was extracted from gill, brain, and liver tissue of adult A. burtoni (laboratory strain, both sexes; see Experimental Vaccination section). DNA and RNA extractions were performed as described elsewhere (Diepeveen and Salzburger 2011), with one exception: the tissue homogenization during the RNA extraction was performed on a BeadBeater (FastPrep-24; MP). Animals being part of the experimental vaccination study were kept under standard conditions (12 hr light, 12 hr dark, 25°) in the animal facilities at the Zoological Institute in Basel before transportation to the Helmholtz Centre for Ocean Research Kiel, where they were kept under the following conditions: 14 hr light; 10 hr dark; and 25°for $38 hr before the start of the experimental vaccination.
Subsequent PCR and sequencing reactions were performed as described elsewhere (Diepeveen and Salzburger 2011). PCR products were visualized with GelRed (Biotium) on a 1.5% agarose gel. Sequences were aligned and visually inspected using Codon Code Aligner 3.7.1 (CodonCode Corporation) and exon/intron boundaries were determined based on homology with the obtained other teleost sequences. Total sequenced regions (TSR), protein-coding regions, and concatenated (TSRs of all five loci) data sets were constructed. All generated cichlid Hivep sequences have been deposited into GenBank (GenBank KF049218-KF049416) (Table S1).
Phylogenetic analyses and tests for selection pressure Phylogenetic analyses and tests for selection pressure were performed as described elsewhere (Diepeveen and Salzburger 2011;Diepeveen et al. 2013). In short, models of nucleotide substitution were chosen based on likelihood ratio tests (LRTs) conducted in jModeltest 0.1.1 (Guindon and Gascuel 2003;Posada 2008) and used in maximum likelihood searches in PAUP Ã (Swofford 2002) and Bayesian Inference in MrBayes 3.2 (Huelsenbeck and Ronquist 2001;Ronquist and Huelsenbeck 2003) for each individual paralog and for the concatenated dataset. Bootstrap analyses with 100 replicates were performed in PAUP Ã and MrBayes was run for 10,000,000 generations. Tylochromis polylepis and/or Oreochromis tanganicae were used as the outgroup in these analyses (Salzburger et al. 2002). The consensus tree based on the concatenated dataset was used as a common input tree in the subsequent analyses.
Both site and branch-site models, as implemented in Codeml, Phylogenetic Analysis by Maximum Likelihood (PAML) 4.2 (Yang 1997;Yang 2007), were used to test for selection pressure. The nonsynonymous/synonymous substitution rate ratio, v or d N /d S , the proportion of sites assigned to an v category, the p 0,1,2 , and the p and q parameters of the b distribution were estimated for all five datasets under different models. LRTs of the following model comparisons were performed to detect sites under positive selection: M1a (nearly neutral) with M2a (positive selection); M7 (b) with M8 (b and v s $1); and M8a (b and v s = 1) with M8. The comparison between M0 (one ratio) and M3 (discrete) was used as a test of variable v among sites. Posterior probabilities for site classes were calculated with the Bayes empirical Bayes (BEB) (Yang et al. 2005). Next, LRTs between the null model (v s = 1) and the alternative model (v s $1) were performed to determine if focal, or foreground, lineages evolved under non-neutral evolution. These foreground branches were chosen based on the results from the phylogenetic and PAML analyses.
Subsequent sliding window analyses (window size = 20) were performed with TreeSAAP (selection on amino acid properties) 3.2 (Woolley et al. 2003) for the four loci for which positively selected sites were observed with the PAML analyses (Hivep1, Hivep2b, Hivep3a, and Hivep3b). Amino acids were categorized based on 31 physicochemical properties to identify regions of positive selection. Selection on amino acids was subsequently screened for positive destabilizing selection by means of categorizing the substitutions into eight categories (categories 1-8) based on the magnitude of radicality (i.e., 1 is the most conservative amino acid substitutions and 8 is the most radical). The three highest categories (6-8; P # 0.001) were used as indicative of radical amino acid substitutions. Next, these substitutions were analyzed with the program SIFT (sorting intolerant from tolerant) (Ng and Henikoff 2003) to screen for possible effect on protein function.
Analyzing cis-regulatory regions The five Hivep sequences from A. burtoni were compared with the obtained teleost sequences of O. niloticus, O. latipes, T rubripes, T. nigroviridis, and D. rerio with mVISTA (Mayor et al. 2000;Frazer et al. 2004). Sequences were globally aligned with Shuffle-LAGAN (Brudno et al. 2003) and the minimum sequence similarity was set to 50%. Intragenic conserved noncoding elements were predicted and analyzed with rVISTA (Loots et al. 2002) to identify potential transcription factor binding sites.

Experimental vaccination and immune response measurements
To examine the expression patterns of the Hivep paralogs after an experimental vaccination, we exposed adult cichlid fish of the species A. burtoni to Vibrio bacteria following Roth et al. (2011). V. anguillarum was physically isolated from the stomach of freshly caught broad-nosed pipefish (Syngnathus typhle) . Strain confirmation was performed via sequencing of the 16S rRNA, recA, and pyrH loci (GenBank reference numbers provided in Roth et al. 2012). On day 1 of the experiment, fish of both sexes were randomly assigned to either the experimental treatment (12 individuals) or the control treatment (11 individuals) and injected intraperitoneally with either 50 ml heatkilled (60 min at 65°) V. anguillarum (phylotype S6M4; 10 6 cells/ml dissolved in phosphate-buffered saline (PBS), i.e., experimental treatment) or 50 ml PBS (i.e., control treatment), respectively, according to the methods of Roth et al. (2012). Fish were tagged subcutaneously with visible implant elastomer tags (Northwest Marine Technology) according to treatment and kept in a single aquarium system. After $21 hr of exposure, fish were killed with MS222 and weight and standard length were noted as in Birrer et al. (2012). Blood was collected from the caudal vessel in heparinized capillaries (Na-heparinized; Brand GMBH + Co. KG), followed by extraction of the head kidneys and spleen, which were forced through a 40-mm nylon sieve to prepare cell suspensions for subsequent cellular immune measurements. All steps were performed on ice. Cells were washed twice with RPMI medium (10 min, 600 rpm, 4°) and resuspended in a final volume of 450 ml.
The number of lymphocytes and monocytes (as proxies for immune response in the form of inflammation and/or stress to the treatment) were measured in blood, head kidneys, and spleen tissues by means of flow cytometry (FACSCalibur; Becton Dickinson) with pre-assessed cichlid-specific settings for each tissue type. The proportions of monocytes, lymphocytes, and the lymphocyte/monocyte ratio were calculated. Furthermore, the activity of the relative number of lymphocytes in the G 2-M and synthesis (S) phases of the proliferation cycle compared to the relative number of lymphocytes in the G 0-1 phase was measured by killing cells in ethanol and subsequent labeling of the DNA with propidium iodide (Sigma Aldrich) as described by Roth et al. (2011). Lymphocytes were identified by their characteristic FSC/SSC pattern (i.e., cell volume and inner complexity). Proliferating cells in the G 2-M phase were distinguished from G 0-1 and S phase cells by a more intense red fluorescence because of their higher DNA content. To test whether the obtained data were normally distributed, D'Agnostino and Pearson omnibus normality tests as implemented in GraphPad Prism version 5.0a for Mac OS X (http://www.graphpad.com) were conducted. Outliers with values outside 2 SDs from the mean were removed (i.e., up to three individuals per treatment group and tissue type).
The experiment was performed according to current national legislation of the Ministerium für Landwirtschaft, Umwelt und ländliche Räume des Landes Schleswig-Holstein (project entitled "Effects of global change on the immunological interaction of pipefish and cichlids with their natural bacteria communities"). One fish from the control treatment died during the experiment.

Gene expression assays and analyses
Gill, brain, and liver tissues of the 22 experimental animals were extracted and directly stored in RNA later (Invitrogen). RNA extraction and reverse-transcriptase were conducted as described elsewhere (Diepeveen and Salzburger 2011). Subsequent gene expression analyses were performed by means of quantitative PCR on a BioMark HD system at the Genetic Diversity Centre of the ETH Zurich, following the manufacturer's protocol. Levels of gene expression were measured in 48.48 dynamic array integrated fluidic circuits with EvaGreen DNA binding dye. Primers were designed and tested for the five focal Hivep loci, two housekeeping loci [i.e., elongation factor 1 (EF1) and ribosomal protein SA 3 (RpSA3)] (Colombo et al. 2013) and four control loci with demonstrated immune-related functions [i.e., allograft inflammatory factor 1 (AIF1), anti-inflammatory response (Watano et al. 2001); coagulation factor II receptor-like 1 (F2RL1), inflammation and immunity (Rothmeier and Ruf 2012); interleukin 10 (IL10), immunosuppression (Jankovic and Trinchieri 2007); Toll-like receptor 5 (TLR5), pathogen recognition (Janeway and Medzhitov 2002;Akira et al. 2006)] (Table S3). Data were visualized, amplification plots were checked, and outliers were removed with the Fluidigm Real-Time PCR analysis software version 3.1 (Fluidigm). Further comparative analyses were performed with the qBase PLUS2 software package (Biogazelle). EF1 and RpSA3 were used as reference targets for the multiple reference gene normalization approach as implemented in qBase PLUS2 software (Biogazelle). Three different positive controls were included in this study; RNA was extracted from whole nonexperimental A. burtoni juveniles and two mixes composed of nine samples (gill, liver, and brain tissues of three randomly chosen individuals) each for the control group and the experimental group separately. Variation between PCR replicates and deviation of normalization factors were checked and outliers with values outside 2 SDs from the mean were removed. Data were controlled for inter-run variation.
Unpaired t tests were performed for the control immune genes expression levels between the control and experimental treatment groups with the qBase PLUS2 software (Biogazelle). To test for a correlation between the expression of Hivep paralogs [i.e., quality-controlled and normalized relative quantities (CNRQ, here just RQ)] and the immune response measurements, Pearson correlations were calculated in GraphPad Prism in the control and experimental treatment groups.

Phylogenetic analyses of cichlid Hivep sequences
To examine the molecular evolutionary history of the Hivep paralogs, we performed maximum likelihood and Bayesian Inference phylogenetic analyses based on the total sequenced region per locus, and the concatenated dataset including all loci. The phylogenetic topologies of the obtained partial cichlid Hivep gene sequences and the concatenated dataset of 13.5 kb are displayed in Figure 1, A-F. Generally, the observed topologies of the gene trees, and the concatenated tree in particular, correspond with the available species trees (Salzburger et al. 2002;Takahashi 2003;Salzburger and Meyer 2004;Clabaut et al. 2005;Salzburger et al. 2005;Muschick et al. 2012), with T. polylepis, O. tanganicae, B. graueri, B. microlepis, and T. nigrifrons as most basal species, followed by the Lamprologini, the Eretmodini, and the species belonging to the "C-lineage" (Salzburger et al. 2002;Clabaut et al. 2005;Day et al. 2008) (Figure 1). As previously observed (Diepeveen and Salzburger 2011), E. cyanostictus was found at a different position within the C-lineage, whereas this species has been commonly resolved outside the C-lineage in previous studies (Salzburger et al. 2002;Clabaut et al. 2005;Day et al. 2008). Also, the relationships between the individual lineages of the C-lineage altered between the individual gene trees. Long branches were observed for T. polylepis, the lamprologines, individual lamprologine species, different branches within the ectodines in several gene trees, and for the haplochromines in Hivep2b and T. nigrifrons in the Hivep3a gene tree.

Selection pressure on sites and branches
To investigate signatures of selection pressure in the Hivep paralogs, we used site and branch-site models (Yang 1997(Yang , 2007 to obtain, e.g., nonsynonymous/synonymous substitution rate ratios (d N /d S ). The maximum likelihood parameter estimations for v, p 0,1,2 and p, and q under different evolutionary models can be found in Table 1 for all five Hivep loci. Estimations of v under the one ratio model (M0) suggest that the Hivep genes evolved under purifying selection, with v ranging from 0.093 (Hivep2a) to 0.303 (Hivep1). A small proportion of sites, 1.4% (Hivep2a) to 11.8% (Hivep3b), was estimated to have evolved neutrally (v = 1) under the neutral model (M1a). By using models that allow v to vary among sites (M2a, M3, and M8), up to 4.3% of sites were detected to have evolved with v . 1 in Hivep1, Hivep2b, Hivep3a, and Hivep3b, with more than 89.4% of sites evolving under purifying selection.
Likelihood ratio tests of several model comparisons (Table 2) were performed to detect positively selected amino acids. This approach resulted in the rejection of the null models in the M1a vs. M2a, M7 vs. M8, and M8a vs. M8 comparisons for all loci except Hivep2a. Positively selected sites were detected with the BEB for Hivep1 (16 sites), Hivep2b (2 sites), Hivep3a (18 sites), and Hivep3b (13 sites).
LRTs of the branch-site analyses were performed to test whether focal lineages evolved under non-neutral evolution. Significant LRTs were only observed for Hivep3b (Table 3), indicating that although the v ratios do vary among sites for three of the other four Hivep loci, they do not seem to vary significantly among lineages. For Hivep3b, the following two branches were observed with v . 1: the derived lineages (excluding the five basal species; P , 0.001) and the haplochromines (P = 0.031) ( Table 3 and Figure 1).

Sliding windows and amino acid substitution characteristics
To visualize regions with elevated d N /d S values and to connect such regions with the physiochemical properties of the respective amino acid substitutions, we performed sliding window analyses. The sliding window plots of Hivep1, Hivep2b, Hivep3a, and Hivep3b are depicted in Figure 2 .50% are shown, respectively, above and below the branches. Cichlid lineage names and a color key for the six cichlid lineages with more than one species included in this study are provided in the gray box in (A). Abbreviations of species names consist of the first three characters of the genus name followed by the first three characters of the species name (Table S1 shows full species names). Branch lengths of T. polylepis were shortened by 50% in all phylogenies and for T. nigrifrons in (E). z-scores for Hivep2b, and the most numerous regions with a z-score $ 3.09 observed for Hivep3a. Interestingly, not all of these retrieved regions of positive selection correspond with the obtained positively selected sites as identified by the PAML analyses and vice versa. Relative few regions of positive selection are observed in the ZAS domains that contain the zinc fingers that bind specific DNA motifs. Notable exceptions are the ZAS-N domain of Hivep2b and the ZAS-C domain of Hivep3b (Figure 2); the latter is furthermore characterized by a positively selected site identified by the PAML analyses. Most commonly observed positively selected amino acid properties among the four paralogs affect the alpha-helical tendencies, the compressibility, the equilibrium constant (ionization of COOH), and the surrounding hydrophobicity. The SIFT analyses of the observed substitutions to screen for possible effect on protein function showed that all substitutions are tolerant and thus have no predicted damaging effect on function (data not shown).
Analyzing cis-regulatory regions We investigated noncoding regions within the Hivep paralogs for potential cis-regulatory elements to determine possible binding sites for transcription factors, indicative of putative functional involvement in signaling pathways. Vistaplots of Hivep1, Hivep2a, Hivep2b, and Hivep3b are depicted in Figure S1, A-D. Because of a limited number of retrieved teleost sequences for Hivep3a, the Vistaplot was not informative for this locus and was therefore excluded from further analyses. For all four analyzed loci, conserved noncoding elements (CNEs) were observed in A. burtoni. Interestingly, a similar pattern of two CNEs surrounding a single exon was observed in all loci (arrows in Figure S1). Although this pattern seems common among teleost fish for Hivep1 and Hivep2a, for Hivep2b and Hivep3b this pattern seems to be restricted to cichlid fishes (O. niloticus is the reference sequence in these analyses). A third cichlid-specific CNE was observed in a subsequent intron in both Hivep1 and Hivep2a, whereas for Hivep3b two more cichlid-specific CNEs were identified.
Because the particular pattern of two CNEs surrounding an exon was observed in all four analyzed loci, the subsequent search for potential transcription factor binding sites was mainly focused on these regions to determine any overlap in possible function of these regions. The analyses resulted in similar hits among Hivep paralogs and suggested a possible association between the Hivep paralogs and different types of signaling pathways involved in, e.g., sex determination [androgen receptor (AR); pre-B-cell leukemia transcription factor 1 (PBX1); sexdetermining region Y protein (SRY)], immune system [B-cell lymphoma 6 protein (BCL6); H2.0-like homeobox protein (HB24); signal transducer and activator of transcription1,3,5a (STAT1,3,5a)], developmental patterning [homeobox protein Hox-A3 (Hoxa3); homeobox protein MSX-1 (Msx1)], and several members of the paired box protein Pax (PAX) and bone morphogenetic protein (BMP) pathways.

Experimental vaccination and immune response measurements
We performed experimental vaccinations to test whether the Hivep paralogs are involved in an induced immune response. The experimental vaccination was realized by exposure to heat-killed V. anguillarum for $21 hr, following the methods of Birrer et al. (2012). Several immune response measurements were performed to determine induction of immune defense dynamics. The lymphocyte/monocyte ratio and the relative number of lymphocytes in the G 2-M and S phases of the proliferation cycle were measured in blood, spleen, and head kidney (Figure 3). Data were normally distributed. The experimental treatment resulted in an elevated lymphocyte/monocyte ratio in blood n  (P = 0.008; unpaired t test) and spleen (P = 0.018), indicative of a higher proportion of cells from the adaptive immune system (i.e., immune response). A higher proportion of lymphocytes in the S and G 2-M phases was found in the head kidney of the experimental group (P = 0.005), indicative of induced lymphocyte proliferation.

Gene expression assays
We measured the expression levels of four control immune loci, AIF1, F2RL1, IL10, and TLR5 in liver and gill tissues. These relative expression levels are depicted in Figure 4. For AIF1 and TLR5, we found significantly higher levels of relative expression in liver (P = 0.014 and P , 0.001; unpaired t test) and gills (P , 0.001 and P = 0.006) in the experimental treatment group.
To analyze the effect of Vibrio exposure on the expression levels of the Hivep paralogs in detail, we assessed their expression levels in relation to an immune response parameter (i.e., lymphocyte/monocyte ratio) per treatment group (i.e., control and experimental). Four correlations were significant between the lymphocyte/monocyte ratio of the spleen and the relative expression of Hivep1 (liver; Pearson r = 0.798, P = 0.018), Hivep1 (gills; Pearson r = 0.794, P = 0.011), Hivep2a (Pearson r = 0.745, P = 0.021), and Hivep3b (Pearson r = 0.852, P = 0.007) ( Figure 5). In these cases, the expression level of the Hivep paralogs thus correlates positively with the level of the immune response parameter.

DISCUSSION
In this study, we examined the molecular evolutionary history of the Hivep gene family members in relation to their presumed immunerelated function in a renowned model system for evolutionary biology, the East African cichlid fishes. We performed comparative phylogenetic n n analyses and detailed screens of d N /d S ratios, analyzed putative cisregulatory regions within the loci, and, in particular, investigated the expression levels of the Hivep paralogs after an experimental vaccination with V. anguillarum. We show, for the first time to our knowledge, that the Hivep paralogs play a putative role in the response to vaccination in fish, and that they are characterized by signatures of long-term positive selection. Our findings regarding the Hivep paralogs indicate that they are important candidate genes for immune-related functions in teleost fish and suggest broader implications in relation to speciation events, such as the adaptive radiations in East African cichlid fishes.

Exposure to V. anguillarum causes an immune response in A. burtoni
To test whether the exposure to a vaccination with heat-killed V. anguillarum resulted in an upregulation of the cellular fish immune response, we measured lymphocyte/monocyte ratios, the proportions of proliferating lymphocytes, and the expression levels of four control immune genes with demonstrated functions in the inflammatory response and immunity (Watano et al. 2001;Janeway and Medzhitov 2002;Akira et al. 2006;Jankovic and Trinchieri 2007;Rothmeier and Ruf 2012) (see Materials and Methods section). Consistent with an elevated immune response upon Vibrio vaccination, we found an increased lymphocyte production in the head kidney, the organ where clonal lymphocyte production takes place (Rombout et al. 2005;Abdel-Aziz et al. 2010). We also found a higher proportion of lymphocytes both in blood and spleen, indicating lymphocyte migration toward peripheral organs. Although lymphocytes are transported via blood, the spleen is the major lymphoid tissue associated with the clearance of blood-borne antigens (Press and Evensen 1999;Whyte 2007). Finally, the significant upregulation of both AIF1 and TLR5 in the Vibrio-exposed group indicates activation of the immune system (Watano et al. 2001;Janeway and Medzhitov 2002;Akira et al. 2006). However, we did not find a significant upregulation for two other immune genes with demonstrated functions in the immune response,  Figure S2 for details). Black diamonds (♦) represent positively selected amino acid sites obtained by the PAML analyses and red circles around them represent positively selected radical nonsynonymous substitutions (category 6-8).
F2RL1 and IL10 (Déry et al. 1998;Savan et al. 2003;Rothmeier and Ruf 2012). These were possibly missed by the choice of our measuring time point (21 hr after vaccination), as suggested by Savan et al. (2003), who found elevated IL10 expression in carp liver tissue after LPS stimulation only within the first 6 hr of incubation.
Hivep expression levels correlate with cellular immune response parameters in an East African cichlid fish Although several functions of the Hivep paralogs have been demonstrated in the fruitfly, Xenopus frog, human, and mouse (Wu 2002), the Hivep paralogs have, so far, not been examined in teleost fishes. We tested, for the first time, whether there is a correlation between immune response parameters and the expression level of the Hivep paralogs in fish as an indicator of putative function(s) in the immune response. Although our study does not determine the exact function of the Hivep paralogs within the immune response, the positive correlations between the lymphocyte/monocyte ratio and the expression levels of three Hivep paralogs indicate that the expression of -at least-Hivep1, Hivep2a, and Hivep3b is upregulated upon the experimental vaccination. This implies that Hivep paralogs play a role during the immune response of fish. These results provide, to our knowledge, the first indication of an immunological function of the Hivep paralogs in teleost fish, which is congruent with preliminary findings in pipefish (O. Roth, personal communication). The Hivep gene family thus offers a potential novel family of immune genes for teleost fish that, when their functions are characterized in more detail, can be used in future immunological screens.

Other functional implications
The experimental vaccination did not lead to upregulated expression levels of all five Hivep paralogs. We found no correlation between the expression levels of Hivep2b and Hivep3a and the immune response measurements. These paralogs either have immunological functions beyond the scope of our experimental design or are not involved in the immune response in teleost fishes. Previously, it had been shown that Hivep genes are involved in functions other than the immune system in insects and vertebrates, e.g., in murine osteoclastogenesis (Liu et al. 2011), in BMP/Dpp signaling (Grieder et al. 1995;Dai et al. 2000;Marty et al. 2000;Torres-Vazquez et al. 2001;Yao 2006;Saita et al. 2007;Yin et al. 2010), and in the development of the nervous system (Campbell and Levitt 2003;Takagi et al. 2006). Interestingly, several of the potential transcription factor binding sites identified within the observed CNEs correspond with these known functions of Hivep paralogs. For instance, we found multiple hits for components of the BMP signaling pathway, as well as several other developmental patterning loci, suggesting a putative role of the Hivep paralogs in developmental patterning and bone formation in cichlid fishes. Hivep paralogs have been found to play a role in the specification of Drosophila wing and halter discs (Torres-Vazquez et al. 2001), multiple dpp-dependent patterning events of both Drosophila ectoderm and mesoderm (Arora et al. 1995), and male tail patterning in Caenorhabditis elegans (Liang et al. 2007). The roles of Hivep in the BMP pathway, possibly through alternative splicing (Hicar et al. 2001;Hong and Wu 2005;Yin et al. 2010), together with several indications of functions in appendage specification and patterning make them candidate genes for fin patterning and anal fin egg-spot formation, a sexually selected trait involved in courtship and spawning behavior and intrasexual communication of haplochromine cichlid species (Wickler 1962;Hert 1989;Salzburger et al. 2005;Salzburger 2009;Theis et al. 2012). Future detailed expression and functional analyses should elucidate whether the Hivep paralogs are involved the development of egg spots in haplochromine cichlid fishes.
Implications of positive selection on Hivep paralogs, immune genes, and speciation events Positive selection, or adaptive sequence evolution, is the hallmark of evolutionary change and molecular adaptation. By comparing the nonsynonymous substitution rate (d N ) to the synonymous substitution rate (d S ) of protein coding genes, the selection regime (i.e., neutral, purifying or positive) per amino acid can be inferred (Yang and Bielawski 2000). This method is widely used and has led to the identification of many cases of positive selection (Yang and Bielawski 2000). Genes involved in (evading) defensive systems or immunity are typically found with signatures of positive selection (Endo et al. 1996;Yang and Bielawski 2000;Schlenke and Begun 2003;Nielsen 2005;Nielsen et al. 2005;Biswas and Akey 2006;Yang 2006;Jiggins and Kim 2007;Montoya-Burgos 2011). As discussed, several functions within the immune response have been described for the Hivep paralogs in other species, and our detailed inferences of the d N /d S ratios provide evidence for positive selection acting on four out of five Hivep paralogs. Interestingly, signs of positive selection have been found before in vertebrate Hivep paralogs. Hivep2 has been found with a signature of positive selection in Tetraodon (Montoya-Burgos 2011) and the cow lineage (Toll-Riera et al. 2011), whereas Hivep3 showed signs of positive selection in the human lineage (Vamathevan et al. 2008). At least for the human Hivep paralog, it has been suggested that the immune function is the cause for the signature of positive selection. Together, these results indicate that it is likely that the immune-related functions of the Hivep paralogs are the cause for the elevated d N /d S ratios observed across vertebrate lineages, including the 14 cichlid lineages examined here.  Positively selected genes are typically only loosely connected to reproductive isolation in Drosophila (Wu and Ting 2004). This is in contrast to the vertebrate MHC loci, known for their signatures of positive selection (Hughes and Nei 1988;Hughes and Nei 1989;Yang and Bielawski 2000;Montoya-Burgos 2011), which have been proposed as pleiotropic speciation genes (Eizaguirre et al. 2009). Because these genes are involved in adaptation to novel habitats in response to different pathogenic communities and assortative mating via mate choice, their pleiotropic effects are hypothesized to induce and accelerate speciation (Eizaguirre et al. 2009). Our work shows that several of the Hivep paralogs also have putative pleiotropic roles in immune defense and an important sexually selected trait-the anal fin egg-spot in East African haplochromine cichlids-subject to mate choice (Hert 1989;Couldridge 2002; but see Theis et al. 2012). Mate choice for the most attractive male anal fin could thus select a certain Hivep genotype and thereby facilitate adaptation to pathogenic environments at the same time. Similar to the MHC loci, the Hivep paralogs might have played important roles during the explosive speciation events of cichlid fishes and therefore are exciting new putative speciation genes.
Hivep3b: Selective patterns in haplochromines and other derived cichlid lineages That we found evidence of lineage-specific positive selection acting on Hivep3b indicates that this locus underwent adaptive protein evolution in both the derived cichlid lineages, including the lamprologines, eretmodines, and the C-lineage, and the most species-rich cichlid lineage, the haplochromines. Adaptive protein evolution underlies the adaptive evolution of traits and is thus ultimately responsible for species divergence and evolutionary innovation (Yang 2006). Interestingly, the elevated d N /d S ratios were observed in lineages that are characterized by explosive speciation and diversification events Day et al. 2008), which can be seen as further support for the hypothesis that the pleiotropic functions of the Hivep paralogs -Hivep3b specifically-can be linked to speciation events. During such events genes could have been recruited to perform altered functions to generate novel or modified traits, which ultimately may have played a role in the divergence between species. A lineage-specific amino acid substitution in Hivep3b was observed for all the haplochromines (position 112 G / R) and the derived lineages (position 87 E / D), as well as several substitutions in a subset of the species belonging to these lineages. Functional analyses are now needed to test whether these substitutions have a fitness advantage for these species and, above all, their function in these cichlid lineages.

ACKNOWLEDGMENTS
We thank past and current members of the Salzburger group for their contribution to sampling during fieldwork; Brigitte Aeschbach and Nicolas Boileau for assistance during laboratory work; and Verena Klein, Anne Beemelmanns, and Susanne Landis for help with the fish dissections and the experimental vaccinations. We also thank Jörn Scharsack for support in establishing the immune parameter measurements. The Inflammation Center at the University of Kiel kindly provided access to the flow cytometer facilities to perform immune measurements. Gene expression data analyzed in this article were generated in the Genetic Diversity Centre of ETH Zurich. We express our gratitude to Louis Du Pasquier, Christian Michel, Christophe Eizaguirre, Joost Raeymaekers, two anonymous reviewers, and the Editor for their constructive comments and suggestions. We also thank the BROAD Institute for sharing unpublished cichlid genome sequence data with the community. This work was supported by the European Research Council (starting grant ''INTERGENADAPT'' to W.S.), the Swiss National Science Foundation (grant 3100A0_122458 to W.S.), the Freiwillige Akademische Gesellschaft Basel (dissertation support grant to E.T.D.), the Volkswagenstiftung (grant to O.R.), and the German Research Foundation (grant to O.R. and W.S.).