Segregation of a Spontaneous Klrd1 (CD94) Mutation in DBA/2 Mouse Substrains

Current model DBA/2J (D2J) mice lack CD94 expression due to a deletion spanning the last coding exon of the Klrd1 gene that occurred in the mid- to late 1980s. In contrast, DBA/2JRj (D2Rj) mice, crosses derived from DBA/2J before 1984, and C57BL/6J (B6) mice lack the deletion and have normal CD94 expression. For example, BXD lines (BXD1–32) generated in the 1970s by crossing B6 and D2J do not segregate for the exonic deletion and have high expression, whereas BXD lines 33 and greater, which were generated after 1990, segregate for the deletion and have highly variable Klrd1 expression. We performed quantitative trait locus analysis of Klrd1 expression using BXD lines generated at different times and found that the expression difference in Klrd1 in the later BXD set is driven by a strong cis-acting expression quantitative trait locus. Although the Klrd1/CD94 locus is essential for mousepox resistance, the genetic variation among D2 substrains and the later set of BXD strains is not associated with susceptibility to the influenza A virus PR8 strain. Substrains with nearly identical genetic backgrounds that segregate functional variants such as the Klrd1 deletion are useful genetic tools for investigating biological function.

Read alignment and post-alignment processing. For Illumina reads, sequencing reads from each lane were trimmed of low-quality bases and aligned to the C57BL/6J reference genome (mm10) using the Burrows-Wheeler Aligner (LI and DURBIN 2010) (version 6.1) with the parameter "-q 15". Quality scores were recalibrated at the lane level using the Genome Analysis Toolkit (MCKENNA et al. 2010) (GATK version 2.7) 'TableRecalibration'. All lanes from the same library were then merged into a single BAM file using Picard tools (version 1.8, http://picard.sourceforge.net/). PCR duplicates were flagged at the library level using Picard 'MarkDuplicates'. The BAM files representing each library were merged to create a single BAM file containing all the DBA/2J sequences. Finally, GATK 'IndelRealigner' was used to realign reads near indels from the Mouse Genome Project (KEANE et al. 2011) as well as near potential indels predicted by GATK.
For SOLiD reads, 5500xl SOLiD mate-paired reads were aligned using Life Technologies' proprietary Lifescope software (version 2.1, http://www.lifetechnologies.com/us/en/home/technical-resources/softwaredownloads/lifescope-genomic-analysis-software.html). Reads with a mapping quality of less than 10 were discarded. Aligned reads from different lanes were merged into a single BAM file, and duplicate reads were removed using Picard tools.
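The mapping-quality filter described above can be illustrated with a minimal sketch. It assumes the alignments are available as SAM-format text lines, in which the fifth tab-separated column holds the mapping quality (MAPQ). The function names and the toy records are hypothetical and are not part of the pipeline used in this study.

```python
def passes_mapq(sam_line: str, min_mapq: int = 10) -> bool:
    """Return True if a SAM alignment record has MAPQ >= min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) >= min_mapq  # column 5 of a SAM record is MAPQ


def filter_sam(lines, min_mapq: int = 10):
    """Yield header lines unchanged, plus alignment records that pass the filter."""
    for line in lines:
        if line.startswith("@") or passes_mapq(line, min_mapq):
            yield line


# Toy records with hypothetical read names and coordinates.
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t0\tchr6\t1000\t37\t50M\t*\t0\t0\t*\t*",
    "read2\t0\tchr6\t2000\t3\t50M\t*\t0\t0\t*\t*",   # MAPQ 3: discarded
    "read3\t16\tchr6\t3000\t10\t50M\t*\t0\t0\t*\t*",  # MAPQ 10: kept
]
kept = list(filter_sam(example))
```

Reads with a mapping quality of exactly 10 are retained here, matching the stated rule that only reads with a quality of less than 10 were discarded.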
Structural variant discovery and annotation. For Illumina reads, structural variants were identified using three different methods: discordant mate-pair analysis (BreakDancer (CHEN et al. 2009)), read-depth analysis (CND (SIMPSON et al. 2010)), and split-read analysis (Pindel (YE et al. 2009)). BreakDancerMax was run using the following parameters: '-c 3 -m 10000000 -q 25 -r 3 -h -f'. CND was run using the default settings. Pindel was run using the following settings: '-e 3 -f 1000 -sb -ss 1 -G'. In-house Python scripts were used to annotate structural variants and the affected genes.
For SOLiD reads, the Lifescope large indel variation detection module was used to call structural variants. This module identifies large indels from 100 bp to 100 kb with high confidence. The CND tool was run using the default settings. In-house Python scripts were used to annotate structural variants and the affected genes.
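The in-house annotation scripts are not reproduced here. As a rough illustration of the kind of step involved, the following sketch reports which annotated features (for example, exons) overlap each called structural variant. It assumes 1-based, inclusive coordinates; all names and coordinates are hypothetical placeholders, not the actual Klrd1 deletion breakpoints.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """Two intervals overlap if they share a chromosome and at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def annotate(variants, features):
    """Map each variant name to the names of the features it overlaps."""
    return {v.name: [f.name for f in features if overlaps(v, f)] for v in variants}


# Hypothetical exon annotation and deletion calls.
exons = [
    Interval("chr6", 1000, 1200, "GeneA_exon1"),
    Interval("chr6", 5000, 5300, "GeneA_exon2"),
    Interval("chr7", 5000, 5300, "GeneB_exon1"),
]
deletions = [
    Interval("chr6", 4900, 5400, "del_1"),  # spans GeneA_exon2
    Interval("chr6", 2000, 3000, "del_2"),  # intronic, no exon affected
]
hits = annotate(deletions, exons)
```

This linear scan is adequate for a handful of variants; for genome-wide call sets an interval tree or sorted sweep would be the more usual choice.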

Sequence Analysis of DBA/2JRj (D2Rj)
Illumina Sequencing-HZI, Braunschweig. Library preparation for whole genome sequencing was done using the TruSeq DNA Sample Prep Kit (Illumina Inc., San Diego, CA, USA, Cat: FC-121-2001) following the manufacturer's instructions. Briefly, 1 µg of genomic DNA was fragmented using a Covaris S2 (Covaris, MA, USA) to approximately 400 bp. The double-stranded DNA fragments with 3′ or 5′ overhangs were converted to blunt ends using End Repair Mix 2 in preparation for adenylation and adaptor ligation. The resulting fragments were purified using AMPure XP beads, followed by adenylation of the 3′ ends. Adapters were then ligated to the adenylated fragments. The resulting ligation product was again purified from an agarose gel. Enrichment of DNA fragments was done with 10 PCR cycles. The final library was purified using AMPure XP beads. The quality of the amplified libraries was validated using an Agilent Bioanalyzer HS Chip (Agilent Technologies) following the manufacturer's instructions. Cluster generation was performed with cBot (Illumina) using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina). Sequencing was done on the HiSeq 2500 (Illumina) using the TruSeq Rapid SBS Kit v3-HS (Illumina) for 100 cycles in both directions in high output mode. Image analysis and base calling were performed using the Illumina pipeline v1.8, resulting in approximately 143 million reads and 14.5 Gbases of sequence information. The average coverage of the murine genome (mm9; reference genome: 2.6 Gbases) was 5.5 X. Alignment of short reads was performed with the Genomic Workbench 5.5 (Qiagen, Hilden, Germany). Of the total reads, 99.25% could be mapped to the murine reference genome (mm9). The results can be found in the European Nucleotide Archive (EMBL-EBI) under accession number ERS538351.
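The reported yield and coverage can be cross-checked with simple arithmetic. The sketch below assumes that the read count refers to individual 100-base reads and that average coverage is total sequenced bases divided by reference genome size; the variable and function names are illustrative only.

```python
def mean_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Average depth = total sequenced bases / reference genome size."""
    return sequenced_bases / genome_size


reads = 143e6            # approximately 143 million reads (from the text)
read_length = 100        # 100 cycles per read (from the text)
sequenced_bases = 14.5e9 # 14.5 Gbases reported
genome_size = 2.6e9      # 2.6 Gbases reference genome (mm9)

# Yield implied by read count and read length (about 14.3 Gbases),
# close to the reported 14.5 Gbases given that both figures are approximate.
bases_from_reads = reads * read_length

# Depth implied by the reported yield (roughly 5.6-fold).
coverage = mean_coverage(sequenced_bases, genome_size)

# Depth if only the mapped fraction (99.25%) is counted.
mapped_coverage = mean_coverage(sequenced_bases * 0.9925, genome_size)
```

Both estimates agree with the stated average coverage of about 5.5 X to within rounding of the inputs.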

Array expression data and eQTL mapping
For differential cis-eQTL and expression analyses between early and late epochs (Figure 4), we used an Affymetrix dataset consisting of spleen gene expression data for 81 genetically diverse BXD strains (GeneNetwork accession GN283). We performed robust multi-array average (RMA) preprocessing, rescaled values to the log2 scale, and stabilized the variance across samples. On this scale, expression levels below 6 are close to background noise. We used GeneNetwork.org, a web-based tool, to map expression variation in the Klrd1 gene.
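Mapping was carried out in GeneNetwork, whose implementation is not reproduced here. To illustrate the underlying idea of a cis-eQTL test, the following is a minimal sketch of single-marker regression: strains are grouped by their genotype (B or D) at a marker, and a LOD score compares a model with genotype-specific means against a single common mean. The strain genotypes and expression values are invented toy data, not values from GN283.

```python
import math


def lod_score(genotypes, expression):
    """LOD = (n/2) * log10(RSS_null / RSS_genotype) for one marker."""
    n = len(expression)
    grand_mean = sum(expression) / n
    rss_null = sum((y - grand_mean) ** 2 for y in expression)

    rss_geno = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, expression) if g == allele]
        group_mean = sum(group) / len(group)
        rss_geno += sum((y - group_mean) ** 2 for y in group)

    return (n / 2) * math.log10(rss_null / rss_geno)


# Toy log2 expression values for eight hypothetical strains.
expression = [10.1, 9.8, 10.3, 9.9, 6.2, 6.5, 5.9, 6.4]

# Hypothetical marker near the gene: genotype tracks expression.
cis_marker = ["B", "B", "B", "B", "D", "D", "D", "D"]
# Hypothetical unlinked marker: genotype is unrelated to expression.
unlinked_marker = ["B", "D", "B", "D", "B", "D", "B", "D"]

lod_cis = lod_score(cis_marker, expression)
lod_unlinked = lod_score(unlinked_marker, expression)
```

In this toy example the marker that tracks expression yields a much larger LOD score than the unlinked marker, which is the pattern expected when a deletion at the gene itself drives the expression difference.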