Linkage and Physical Mapping of Sex Region on LG23 of Nile Tilapia (Oreochromis niloticus)

Evidence supports that sex determination (SD) in tilapia is controlled by major genetic factors that may interact with minor genetic as well as environmental factors, thus implying that SD should be analyzed as a quantitative trait. Quantitative trait loci (QTL) for SD in Oreochromis niloticus were previously detected on linkage groups (LG) 1 and 23. Twenty-one short single repeats (SSR) of >12 TGs and one single nucleotide polymorphism were identified using the unpublished tilapia genome sequence on LG23. All markers showed two segregating alleles in a mapping family that was obtained by a cross between O. niloticus male (XY) and sex-reversed female (ΔXY) yielding 29 females (XX) and 61 males (XY and YY). Interval mapping analysis mapped the QTL peak between SSR markers ARO172 and ARO177 with a maximum F value of 78.7 (P < 7.6 × 10−14). Twelve adjacent markers found in this region were homozygous in females and either homozygous for the alternative allele or heterozygous in males. This segment was defined as the sex region (SR). The SR encompasses 1.5 Mbp on a single tilapia scaffold (no. 101) harboring 51 annotated genes. Among 10 candidate genes for SD that were tested for gene expression, anti-Müllerian hormone (Amh), which is located in the center of the SR, showed the highest overexpression in male vs. female embryos at 3 to 7 days postfertilization.

sex region linkage mapping physical mapping Oreochromis niloticus microsatellite markers Sex determination (SD) can be controlled by one or more genetic factors, environment or their interactions, involved SD factors located on sex chromosomes and/or on either autosomes (Bull 1981). The sex chromosomes are characterized by both morphologically undifferentiated and differentiated homologs, in simple and multiple systems with male or female heterogamety. Studies on organisms with differentiated sex chromosomes, male heterogametic (mammals and fly) and female heterogametic (birds and reptiles), have shown interesting similarities between the two systems (XY/XX and ZW/ZZ) of sex determination (Ellegren 2011). Teleost species are an interesting model for SD research, with a variety of SD systems and capability of producing viable hybrids between closely related species having different SD systems (Mank et al. 2006). Teleost fish have diverged from land vertebrates more than 450 million years ago, after they diverged from birds and mammals, but before they diverged from each other (Bellott et al. 2010). Moreover, these fish species experienced whole-genome duplication (ancestral tetraploidy), followed by variable-rate reduction of ploidy that significantly complicates the identification of orthologs (Kasahara et al. 2007). Bellott et al. (2010) compared zebrafish, Tetraodon, pufferfish, and medaka genomes to mammalian X or avian Z chromosome and reported that most orthologs to Z and X genes occupy separate portions of each fish genome. Different aspects of tilapia SD have been explored because tilapias are an important aquaculture commodity (Devlin and Nagahama 2002). Their commercial production relies on all-male monosex culture, which so far has proved difficult to maintain in large-scale production facilities (Cnaani and Levavi-Sivan 2009). A better understanding of the genetic basis of SD in tilapia is needed to overcome these difficulties.
The differences in SD mechanisms among closely related tilapia species and the influence of the environment (Baroiller et al. 2009) suggest that SD should be analyzed as a quantitative trait using a markers-based QTL approach. Various sex-linked markers have been identified in O. niloticus and O. aureus (Lee et al. 2003(Lee et al. , 2004Shirak et al. 2002Shirak et al. , 2006Eshel et al. 2010) and mapped to different LG. In purebred O. niloticus and O. niloticus · O. aureus hybrids, the QTL were detected on LG1, LG23 (Lee et al. 2003;Eshel et al. 2010) and on LG3 (Lee et al. 2005), respectively. The SD QTL on LG23 was mapped within a confidence interval (CI) of 16-21 cM (Eshel et al. 2010; Figure  1B), which harbors the genes Amh and Dmrta2 that are involved in the vertebrate SD cascade (Shirak et al. 2006, Figure 1A). The first assembly version of the unpublished tilapia genome, consisting of 5900 scaffolds, was recently released (Accession no. PRJNA59571). Using this information enabled us to refine the confidence interval of SD QTL on LG23 and find positional candidate genes for the sex masterkey regulators in the Swansea stock of O. niloticus.
The critical period for SD in tilapias is 0-18 days postfertilization (dpf). During this period, embryos are sensitive to androgens, estrogens, and precursors of steroids through immersion and dietary exposure (Devlin and Nagahama 2002). In a more recent study, Ijiri et al. (2008) demonstrated that differential expression of genes in XX and XY gonads of O. niloticus during the period of 9-10 dpf is critical for differentiation of primordial germ cells (PGC) into either ovary or testis. Rougeot et al. (2008) applied temperature treatment on allfemale population embryos until hatching (3-4 dpf) and showed $20% phenotypic sex change of females to males. Furthermore, Palti et al. (2002) demonstrated by using genetic markers that sex-specific mortality occurs shortly after hatching. On the basis of these findings, we hypothesized that master regulation genes initiating the SD cascade should be expressed before the detectable differences in the PGCs.
Recent studies revealed that the major genes involved in the SD pathway are common to mammals and fish (Schartl 2004). Moreover, the upstream genes on the SD cascade may vary among organisms, but downstream components tend to be conserved (Charlesworth and Mank 2010). To study the onset of the SD cascade at early stages in embryonic development, we selected eight genes (Lhx9, Amh, Foxl2, Cyp19a, Dmrt1, DAX1, Sox9a, and Sox9b) with a known role in the SD pathway of other organisms (Birk et al. 2000;Shirak et al. 2006;Ijiri et al. 2008) and examined their expression from gastrula to late larval stage (2-9 dpf) in O. niloticus monosex progeny of XY males and XX females. Additionally, we analyzed two genes previously mapped to the SD region: Sox14 (Cnaani et al. 2007) and ELAVL1 (A. Shirak, unpublished data) with ambiguous similarity to known genes on the SD pathway. Gene expression profiles represent the primary level of integration between environmental factors and the genome, providing the basis for phenotypes, such as morphology and behavior. Therefore, we examined differences in candidate genes expression between genders during early embryonic development for initiation of the SD cascade in tilapia.
The objectives of this study were to refine the sex region on LG23 using both linkage and physical mapping and to identify candidate genes for SD in the region with differential expression at early embryonic development.

MATERIALS AND METHODS
Breeding of O. niloticus (Swansea stock) families used for this study was performed at the Agricultural Research Organization, Israel.

Mapping family
The inheritance of gender in a cross between O. niloticus male (XY) and a sex-reversed neofemale (DXY) that yielded 29 females (XX) and 61 males (XY and YY) was validated by segregation of the sex-linked marker UNH898 (Eshel et al. 2010).

Monosex groups
To obtain all-female (XX) and all-male (XY) progenies, eggs of a single O. niloticus female (XX) were divided into two groups, and each group was artificially fertilized with either milt of a sex-reversed male (DXX) or milt of a genetically modified male (YY). Sex was determined at age of three months by gonadal squash of at least 100 individuals per each full-sib group (Mair et al. 1997).
n Table 2 Primers for genes analyzed by qPCR repeats of .12 TG in scaffold 7 (6,500,000-9,485,422 bp), in scaffold 29 (3,291,196-5,141,938 bp), and in the entire scaffold 101. We entered the sequence of 200 bp upstream and downstream of the TG repeats core to Primer3 software and developed 21 novel SSR markers.
Polymorphism of these markers was tested in parents of the mapping family. To develop genetic markers in the vicinity of UNH216, we mapped in our family the marker UNH2190, which was derived from the Malawi cichlids hybrid Metriaclima zebra · Labeotropheus fueleborni and was mapped adjacent to UNH216 (Albertson et al. 2003).

Development of the SNP marker:
On the basis of partial cds sequence (GI: 93115149) of O. mossambicus ELAVL1 (embryonic lethal, abnormal vision, Drosophila-like 1), we identified the SNP polymorphism A/G (Table 1) at nucleotide 391 in our mapping family.
DNA extraction and genotyping of SSR and SNP markers: DNA was isolated from fin samples by the "salting out" high-throughput procedure (Zilberman et al. 2006). The concentration of the DNA was quantified with NanoDrop spectrometer (NanoDrop Technologies, DE), and each DNA sample was diluted to a final concentration of $10 ng/ml. PCR amplification was performed in a total volume of 10 ml with Super-Therm Taq DNA polymerase (JMR Holding, London), mixture of 2 mM dNTPs of each nucleotide, and primer concentration of 10 pmol/ml (Metabion GmbH, Germany). PCR conditions were 3 min at 94°; 40 sec at 94°, 40 sec at 61.5°, 1 min at 72°for 30 cycles and 10 min at 72°. The mapping family was amplified for SSR markers, and genes with primers taken from NCBI database or designed based on scaffold sequence (detailed list in Table 1), where one primer in each pair was 59 end-labeled by HEX, TET, or FAM fluorescent dyes (Operon Technologies, Alameda, CA). Size calling of PCR products was determined using ABI GeneMapper software version 4.0 (Applied Biosystems, Foster City, CA) after electrophoresis in a capillary gel on ABI-3130 apparatus. Sequencing and SNaPshot reaction for genotyping of SNP markers were also carried out on ABI-3130 according to the manufacture instructions using the primers specified in Table 1. (For additional data, see supporting information, File S1.) Linkage and interval mapping: The linkage map for LG23 using segregating markers in our mapping family was reconstructed by CRIMAP software (http://linkage.rockefeller.edu/soft/crimap/). The interval mapping was based on a nonlinear regression using the method of Knott et al. (1996), with the program developed by Spelman et al. (1996). The test statistic and locus effects were evaluated at 1 cM intervals. The 95% confidence intervals (CI) for the QTL location and effect were determined by generation of 200 bootstrap samples.
Identification of genes and annotation: Annotation of genes positioned in the SR was performed by combining three bioinformatics resources: (1)  RNA extraction and qPCR: A pool of 20-30 embryos from each gender were placed in RNAlater reagent (Qiagen) to stabilize the RNA and then stored at 220°until RNA extraction. Total RNA was Figure 2 Determination of boundaries of the SR on scaffold 101 based on genotyping data of SSRs for selected individuals. Heterozygous genotypes of females contributed to the reduction of the SR interval delimited by arrows. The two homozygous genotypes are denoted "11" and "22"; the heterozygous is denoted as "12." Females have the 11 genotype for all markers within the SR, whereas males have either 12 or 22. The genotypes' segments corresponding to gender for each individual are denoted with shading. SR, sex region; SSRs, short single repeats.
n Table 3 Level of normalized relative expression 6 SD and statistical significance of sex-specific differences for gene candidates for SD in embryos at 2 to 9 dpf Gene expression in SD-associated organs: To retrieve available gene expression data for all genes embedded within the SR on LG23 we used the "Gene Atlas" expression data for mammals at BioGPS (Su et al. 2004;Wu et al. 2009; http://biogps.org/#goto=welcome). Differential expression was determined for individual genes in organs relevant for the SD pathways in tilapia such as brain, testis and pituitary.

Mapping new markers on LG23
Two alleles were found for each of the novel 21 SSR markers in the parents of the mapping family. These markers were designated as ARO markers ( Table 1) that were physically mapped to scaffolds 7, 101, and 29 and linkage mapped to an interval of 30 cM of LG23 ( Figure 1C).

Linkage and physical mapping of the QTL on LG23
In Figure 1, the QTL interval mapping for SD on LG23 is presented based on the reference mapping family ( Figure 1A) and O. niloticus families ( Figure 1B). In the current study, 33 genetic markers were analyzed, including the new SSR markers added ( Figure 1C). Interval mapping analysis mapped the SD QTL region to 13-40 cM with a maximum F value of 78.7 (P , 7.6 · 10 214 ) at 22 cM ( Figure 1D). This region was localized to scaffolds 7, 101, and 29. Physical mapping of the scaffolds with the newly developed markers narrowed down the SR to scaffold 101 between markers GM597 and ARO124 from 990,577 to 2,468,000 bp ( Figures 1E and 2). The Amh gene is located between these markers. The scaffolds relating to LG23 and the physical map of markers are given in Figure 1E based on the unpublished tilapia genome sequence. The SR on scaffold 101 was inferred from genotypes for SSRs of selected individuals ( Figure 2); 12 adjacent markers found in this region were homozygous in females and either homozygous for the alternative allele or heterozygous in males. This segment was defined as sex region. Markers flanking the SR were heterozygous in females, thus reducing the SR interval to 1.5 Mbp between GM597 and ARO124. The boundaries of the SR are marked by arrows.
Gene expression at early developmental stages Expression analysis of 10 SD-related genes and two genes mapped within the SD QTL on LG23 was performed for embryos of known type (XX or XY) during 2-9 dpf. No significant differences between genders were found for cyp19a, Dmrt1, or Dax1. Significant sex-differential expression was detected for the remaining 7 genes as presented in Table 3. Figure 3A presents the continued elevation in significance for gender-specific differences for Amh expression from 3 dpf (P = 0.03) to 7 dpf (P # 0.01). The Y axis indicates normalized expression values, whereas each bar along the X axis indicates a sample. This gene showed the highest sex-differential expression among all 10 tested genes. Significant sex difference was found for Lhx9 expression (P = 0.0002) at 2 dpf, equivalent to the developmental stage of segmentation (Fujimura and Okada 2007) but not later in the embryonic development. Likewise, sex-differential expression was detected for ELAVL1 (P , 0.01) at age of 2 dpf but was attenuated at 5, 7, and 9 dpf (P # 0.05) ( Figure 3B). Gender-specific expression differences for the other 4 genes (Foxl2, Sox9a, Sox9b, and Sox14) were detected at later developmental stages (6-9 pdf).
Characterization of genes positioned within the SR Fifty-one genes were identified within the SR and are presented in Table 4. Thirty-nine genes had expression data in a variety of 91 tissues of mammals in the BioGPS database. We focused on three SD-related organs that are relevant in the tilapia SD cascade: brain, testis, and pituitary. Interestingly, 17 out of the 39 genes showed overexpression in the brain; expression of 15 of these genes exceeded the median expression by over 3-fold. Thirteen genes were found relevant to SD following a literature survey. After removing 3 genes with no expression data, 4 out of the remaining 10 genes showed overexpression in at least one SD-associated organ (Notch2, PIAS4, ZBTB7, or CELF5).

Comparative mapping
Comparative analysis of the genes positioned within the SR detected high level of orthology between tilapia and six different species. Within ,1.3 Mbp region of stickleback, Tetraodon,fugo,zebrafish,medaka,and human,40,39,29,29,29, and 21 orthologous genes, Figure 3 Normalized relative expression of Amh (A) and ELAVL1 (B) for allmale (gray) and all-female (white) pools at 2-9 dpf. Deviation bars represent standard errors and asterisks represent levels of significance for sex-specific expression differences: Ã P # 0.05, ÃÃ P # 0.01, and ÃÃÃ P # 0.001. dpf, days postfertilization. A is presented in the lower part of Figure 3. respectively, have been found in the same order. GO term enrichment analysis of these genes with DAVID software (Huang et al. 2009) yielded 4 genes, Notch2, ELL, Amh, and TCF, which are involved in biological processes of cell differentiation, cellular, and anatomical structure development based on zebrafish background of 8389 genes.

DISCUSSION
Different SD systems with remarkable variation have been observed in teleosts (Volff and Schartl 2002). Evidence supports that sex determination (SD) in tilapia is controlled by major genetic factors that may interact with minor genetic as well as environmental factors, thus implying that SD should be analyzed as a quantitative trait.
n Table 4 SD-related data for annotated genes in the SR on scaffold 101 between 990,577 and 2,468,000 bp QTL for SD in Oreochromis niloticus were previously detected on LG1 and LG23 (Lee et al. 2003;Cnaani et al. 2004;Eshel et al. 2010). In the present study, interval mapping analysis using 33 markers on LG23 detected the QTL peak between two adjacent genetic markers: ARO172 and ARO177. However, the confidence interval was still rather large between 5 cM (Eshel et al. 2010;156 individuals) and 30 cM in the current study. Thus, mapping QTL to confidence interval , 5 cM is not a viable option using genetic markers and segregating families of moderate size (Ron and Weller 2007). However, using physical mapping based on the unpublished tilapia genome sequence, all 26 markers in the QTL were physically mapped to three scaffolds on LG23. Furthermore, recombinations in two females were used to identify the boundaries of the SR between markers GM597 and ARO124 on a single scaffold (no. 101; Figure 2). This explains the lack of power of the interval mapping that is based on bootstrap analysis of a family of 90 individuals of which only two are informative. Figure 2 demonstrates the distinct contrast of genotypes for markers between genders in the specific sex region. The absence of recombination along a region of 12 genetic markers may reflect the moderate size of the family, but it also conforms to the theory that the evolution of sex chromosomes involves suppressed recombination between homologous chromosomes to maintain sex-related gene blocks (Bergero and Charlesworth 2009). The SR encompasses 1.5 Mbp harboring 51 annotated genes. Our assumption that the SR harbors sex-related or male-determining genes is strengthened by the conservation of this region in other teleost fish. Out of 51 genes that were positioned within the SR, 40 and 39 orthologous genes have been found within ,1.3 Mbp region of stickleback and Tetraodon, respectively. Information from the literature indicates the putative role of 13 out of the 51 genes in SD: 10 genes in transcriptional processes related to SD and 3 in gonadal development and function (Table 4).
We examined expression of genes in the SD pathways at early developmental stages of tilapia. Previous studies on SD-related gene expression in tilapia focused on brain, PGS, and gonads (Ijiri et al. 2008;Poonlaphdecha et al. 2011). The results from our study on expression data of 10 candidate genes indicate that the onset of the SD cascade begins at 2 dpf at the gastrulation stage, based on overexpression of Lhx9 and ELAVL1 in females. Lhx9 was found to be essential for mouse gonad formation (Birk et al. 2000). ELAVL1 is a member of CELF proteins implicated in cell-specific and developmentally regulated alternative splicing (Ladd et al. 2001). Additional SD-related genes were Sox9, which is necessary and sufficient to cause testicular differentiation in mammals (Vidal et al. 2001). Likewise, Foxl2 plays a role in ovarian sex differentiation and has been suggested to function as a repressor of the male pathway during ovarian development prophase (Ottolenghi et al. 2005). Significant differences in expression between genders for Sox9a, Sox9b, Foxl2, and Sox14 genes were detected in later stages of embryonic development and may indicate their downstream role in the SD cascade. We detected higher expression of SOX9 in females than in males at 7 dpf, in contrast to the results of Ijiri et al. (2008) of higher expression in male gonads at 37-70 days posthatching. Foxl2 was also highly expressed in females at 8 dpf as was previously reported (Ijiri et al. 2008). Among 10 ten candidate genes, Amh, which is located in the center of the SR, showed the highest expression in male vs. female embryos. Our observation was supported by Poonlaphdecha et al. (2011) who reported on dimorphic expression of Amh between genders in adult gonads and brains as well as in embryo heads at 10 and 15 dpf. GO term enrichment analysis detected 4 genes, including Amh, that are involved in biological processes of cell differentiation, cellular development, and anatomical structure development. Genes playing a role in SD initia-tion with dimorphic expression between genders may be considered as candidate genes and should be further investigated.
To test the role of Amh and other candidate genes in SD of tilapia, targeted strategies could be considered, such as (i) mutant detection in candidate genes, as performed in zebrafish (Demarest et al. 2011); (ii) gene silencing using siRNA technology, as applied in the giant freshwater prawn (Ventura et al. 2009); and (iii) transgenesis using the Tol2 system, which was demonstrated for Nile tilapia (Fujimura and Kocher 2011). Large-scale experiments might involve (i) genomic mutagenesis together with sex reversal, phenotypic mutant screening, and sequence analysis, as was applied in a medaka SD study (Otake et al. 2006); and (ii) a whole-transcriptome scan for gene expression at early embryonic development to identify the key regulators of SD. A complete computational approach was pursued to design a 44k features microarray (O. Eshel, unpublished data) based on the unpublished tilapia genome sequence annotation and EST libraries (Lee et al. 2010) for construction of the full tilapia gene list.

ACKNOWLEDGMENTS
We thank Berta Levavi-Sivan for providing us the DXX males developed in her laboratory. We thank the Broad Institute Genome Sequencing Platform and Genome Sequencing and Analysis Program, Federica Di Palma, and Kerstin Lindblad-Toh for making the unpublished genome sequence data for Oreochromis niloticus available. This work is a contribution from the Institute of Animal Science, Agricultural Research Organization, Bet Dagan, Israel, No. 592/11. This research was supported by Grant IS-3995-07 from the United States-Israel Binational Agricultural Research and Development (BARD) Fund and by Grant No. 801/11 from the Israeli Science Foundation.