Abstract
The proPO system regulates melanization in arthropods. However, the genes that are involved in the proPO system in housefly Musca domestica remain unclear. Thus, this study analyzed the combined transcriptome obtained from M. domestica larvae, pupae, and adults that were either normal or bacteria-challenged by an Escherichia coli and Staphylococcus aureus mixture. A total of 54,821,138 clean reads (4.93 Gb) were yielded by Illumina sequencing, which were de novo assembled into 89,842 unigenes. Of the 89,842 unigenes, based on a similarity search with known genes in other insects, 24 putative genes related to the proPO system were identified. Eight of the identified genes encoded for peptidoglycan recognition receptors, two encoded for prophenoloxidases, three encoded for prophenoloxidase-activating enzymes, and 11 encoded for serine proteinase inhibitors. The expression levels of these identified genes were investigated by qRT-PCR assay, which were consistent with expected activation process of the proPO system, and their activation functions were confirmed by the measurement of phenoloxidase activity in bacteria-infected larvae after proPO antibody blockage, suggesting these candidate genes might have potentially different roles in the activation of proPO system. Collectively, this study has provided the comprehensive transcriptomic data of an insect and some fundamental basis toward achieving understanding of the activation mechanisms and immune functions of the proPO system in M. domestica.
- prophenoloxidase-activating system
- gene identification
- transcriptomic analysis
- Musca domestica
- genetics of immunity
The housefly Musca domestica is a worldwide insect vector that can transport numerous pathogenic organisms, including parasites, viruses, bacteria, and even antibiotic-resistant bacteria (Tan et al. 1997; Sasaki et al. 2000; Davari et al. 2012; Schuster et al. 2013; Wei et al. 2013). These pathogens can cause more than 100 serious diseases in human and animals, such as salmonellosis, typhoid fever, cholera, infantile diarrhea, and amoebic dysentery (Scott et al. 2014). In addition to the public health threat, the housefly can suppress milk and egg production in livestock and poultry farming, as well as reduce food conversion. Confusingly, this species can resist infections and maintain its growing prosperity even living in an environment full of pathogens. However, until recently, little has been known about the molecular mechanism of housefly immune response to these pathogens (Li et al. 2013; Liu et al. 2012; Scott et al. 2014; Tang et al. 2014).
Insects rely on their innate immune system as a defense against pathogens because they lack an acquired immune system (Kingsolver and Hardy 2012). The prophenoloxidase-activating system (proPO system) can produce melanin within a few minutes after pathogen invasion and participate in host innate immune responses, including killing, eliminating, or inhibiting invading pathogens (Rao et al. 2010; Qian et al. 2013). The proPO system has been extensively investigated in various insect species, such as Drosophila (An et al. 2013), Anopheles (An et al. 2011a), Tenebrio molitor (Tindwa et al. 2013), Manduca sexta (An et al. 2011b; Wang et al. 2011, 2014), and Bombyx mori (Chen et al. 2014), and the activation cascade has also been preliminarily summarized as follows (Cerenius and Söderhäll 2004; Cerenius et al. 2008, 2010). Generally, invaders are recognized by pattern recognition proteins of the host, such as peptidoglycan recognition proteins (PGRPs) or β-1,3-glucan recognition proteins, and then a cascade reaction of serine proteases is initiated in which many serine proteases are involved, including prophenoloxidase (proPO), prophenoloxidase-activating enzymes (PAPs), serine protease inhibitors (Serpins), and serine protease homologs (SPHs). Once activated, proPO is released into the plasma and converted into phenoloxidase (PO) via restrictive proteolysis. PO is the last and most important component of the proPO system, which oxidizes phenol into benzoquinone that is then polymerized into insoluble melanin by nonenzymatic reactions. The melanin is deposited at the injury site or on the invading pathogens to induce the blackening and healing of wound (Tindwa et al. 2013).
Studies on the contribution of melanization to the survival of dipterans obtained variable results. Two reports on Drosophila (Leclerc et al. 2006) and Anopheles (Schnitger et al. 2007) revealed that the proPO system exhibits no bactericidal activity, but recent work has demonstrated that Drosophila requires proPO activation to survive microbial infections (Binggeli et al. 2014). Therefore, it is crucial to identify the genes related to the proPO system in other dipterans, such as M. domestica for elucidating the activation mechanisms and immune functions of this system.
Although there were a few reports about PO (Sun et al. 2008), PO inhibitors (Tsukamoto et al. 1992), and proPO sequence (AAR84669) in the past, current knowledge of this proPO system is limited compared with antimicrobial peptides and pattern recognition proteins in M. domestica (Wang et al. 2006; Fu et al. 2009; Ai et al. 2013; Sun et al. 2014). In addition, no report is available on the PGRPs, PAPs, and Serpins of M. domestica.
We previously constructed a suppression subtractive hybridization (SSH) cDNA library of M. domestica larvae and identified a few cDNA segments of PAP and proPO (Li et al. 2010). However, the SSH library yielded limited genomic resources (Qiu et al. 2012); therefore, the main components and the activation mechanisms of the proPO system in M. domestica remain unclear.
For the past few years, the high-throughput technology RNA-Seq has been used to produce millions of short cDNA reads and cost-effectively assemble transcriptomes for nonmodel organisms with unknown genomes (Grabherr et al. 2011). This technology has opened a door for numerous and substantial studies on gene discovery. RNA-Seq has also been used for M. domestica transcriptomic analyses using insecticide-resistant adult flies (Li et al. 2013) or 3-d-old bacteria-infected larvae (Liu et al. 2012) or different developmental stages, including normal eggs, pupae, adults, and bacteria-challenged and nonchallenged larvae (Tang et al. 2014). Nevertheless, the key genes involved in the proPO system in M. domestica have yet to be identified.
In this study, we mixed 15 kinds of RNA samples from normal and bacteria-challenged (E. coli and Staphylococcus aureus mixture) larvae, pupae, and adults and conducted the RNA-Seq using Illumina paired-end sequencing. An enormous amount of transcriptomic data were generated. Furthermore, many candidate genes involved in the proPO system were identified and characterized through transcriptomic analysis and real-time quantitative reverse-transcriptase polymerase chain reaction (qRT-PCR) assay as well as the measurement of PO activity.
Materials and Methods
Housefly culture
Housefly (Musca domestica) strains were kindly gifted by Jinan City Center for Disease Control and Prevention in China. Larvae, pupae, and adult flies were maintained in our laboratory at 25° to 30°, 50 70% humidity, and 12 hr light/12 hr dark cycle. The larvae were reared on an artificial medium of bran and milk power until pupation, and adult flies were fed water with sugar. The challenged samples were collected by lightly piercing the postabdomen of healthy third instar larvae, pupae, and adult flies with 1-mL disposable syringe needles previously immersed into a mixed culture of E. coli and S. aureus (1.5 × 108 CFU/mL). The samples were maintained on fresh medium until RNA extraction.
RNA isolation and Illumina sequencing
Total RNA samples were extracted from housefly larvae, pupae, and adults that were either normal or bacteria-challenged (at 12, 24, and 48 hr) using TRIzol reagent (Invitrogen, USA) following the manufacturer’s instructions. After DNase I treatment, RNA integrity was examined using 2100 Bioanalyzer (Agilent Technologies, USA). Then, the qualified RNA samples were mixed equally and used to isolate poly(A) mRNA using magnetic beads with Oligo(dT). The mRNAs were fragmented (200 nt to 700 nt) with RNA fragmentation reagents. The first-strand cDNAs and then the second-strand cDNAs were synthesized using a random hexamer primer with the short mRNA fragments as templates according to the cDNA kit (BD Biosciences Clontech) manufacturer’s instructions. The cDNA fragments were purified and resolved with an EB buffer and then end-repaired with an adapter primer following the manufacturer’s protocol (Illumina). The suitable cDNA fragments were selected as templates by agarose gel electrophoresis to create the final cDNA library by PCR amplification. After quantification and qualification, the cDNA library was sequenced using Illumina HiSeq 2000 in the Beijing Genomics Institute (Shenzhen, China).
Transcriptome de novo assembly
Transcriptome de novo assembly was performed using the assembling program Trinity (Grabherr et al. 2011). Before assembly, the raw reads first were filtered to obtain high-quality clean reads by removing dirty reads that contain adapter sequences, or unknown nucleotides (N) larger than 5%, or low-quality bases (base quality ≤10) more than 20%. Then, Trinity combined the clean reads to form contigs with a certain length of overlap and further connected the contigs to obtain unigenes that cannot be extended on either end. The unigenes were divided into two classes through gene family clustering. The first class contained clusters with the prefix CL and the cluster ID, in which several unigenes demonstrated more than 70% similarity. The second class contained singletons with the prefix unigene. The contigs and the unigenes were longer than 200 and 300 nt, respectively.
Annotation of unigenes
On the basis of sequence similarity with known genes, each assembled unigene underwent protein functional annotation, pathway annotation, Cluster of Orthologous Groups (COG) functional annotation, and Gene Ontology (GO) functional annotation. Briefly, the unigenes were first aligned by BLASTx search against the protein databases of NCBI nonredundant sequence database (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and COG. None of the BLASTx hits were aligned to NCBI nonredundant nucleotide database (Nt) by BLASTn. All alignments were performed using a cutoff E-value ≤10−5 to determine the homology of sequences with known genes. The best alignment results were used to decide the sequence direction and the protein coding region prediction (CDSs) of unigenes. When the different databases yielded contradictory results, a priori order of Nr, Swiss-Prot, KEGG, and COG was followed. If a unigene was not aligned to any of the mentioned databases, then ESTScan (Iseli et al. 1999) was used to decide the sequence direction and to predict the coding regions. After the prediction of unigene CDSs, proteins with the highest ranks in BLAST results were analyzed to decide the coding region sequences of the unigenes. With the help of the KEGG database, the complicated biological behavior of the genes were further studied through pathway annotation. All unigenes were aligned to the COG database to predict and classify their possible functions. On the basis of Nr annotation, GO annotations of the unigenes were obtained using Blast2GO (Conesa et al. 2005). WEGO (Ye et al. 2006) was used for the GO functional classification of all unigenes to understand the distribution of gene functions at the macro level.
Identification of key genes related to the proPO system
Available key genes such as PGRPs, PAPs, Serpins, SPHs, and proPOs in other insect species were used as references to examine this M. domestica transcriptome. The reference species included Drosophila melanogaster, M. sexta, B. mori, T. molitor, Holotrichia diomphalia, and Rhipicephalus microplus. The potential candidate unigenes in the M. domestica transcriptome were confirmed by sequence analysis, protein domain prediction, and phylogenetic analysis. The sequence analyses of the hypothetical genes were performed using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and DNAMAN 8.0. Meanwhile, the protein domains of putative genes were determined by SMART (http://www.smart.embl-heidelberg.de/). In addition, complete or partial amino acid sequences, signal peptides, and open reading frames from the available nucleotides of the unigenes were deduced using ExPASy (http://www.expasy.org). The sequence alignments and the phylogenetic analyses of putative genes were implemented using GeneDoc (Nicholas et al. 1997) and MEGA 5.0 (Tamura et al. 2011), respectively. The cDNA segments of defined genes were cloned using PCR primers designed according to the representative unigenes and subsequently sequenced by Sangon Company (Shanghai, China).
Real-time RT-PCR analysis
Real-time RT-PCR (qRT-PCR) was performed using iQSYBR Green Supermix (Bio-Rad) and CFX96 Real-Time System (Bio-Rad) to investigate the expression profiles of selected genes that might be related to the proPO system in M. domestica. Total RNAs were isolated from third instar larvae that were normal or bacteria-challenged by E. coli or S. aureus (at 4, 6, 12, and 24 hr), and then were synthesized into cDNAs using the aforementioned methods. The total volume of qRT-PCR mixture was 10 µL each including 5 µL of Supermix, 2 µL of each primer (1 µM), and 1 µL of diluted cDNA. The qRT-PCR program included an initial 95° for 5 min, 40 cycles of 95° for 10 sec, and 60° for 1 min, and then a melt period from 65° to 95°. Every sample was tested in triplicate and the qRT-PCR data were normalized twice. The first normalization was completed using actin internal reference to get a ∆Ct value (Cttarget − Ctactin), and then the second was done by the comparison between ∆Ct value of the experimental sample and that of the control sample (∆∆Ct = ∆Cttarget − ∆Ctcontrol). Results were expressed as the mean ± SD from three independent repeats using the 2−ΔΔCt data with Graph-Pad Prism (Swift 1997). The statistical significance (***P < 0.001, **P < 0.01, *P < 0.05) was detected by t-test. The primers are listed in Supporting Information, Table S1.
Measurement of phenoloxidase activity
The third instar larvae were divided into two clusters, one by sample and a separate one by control. The control larvae were all normal, whereas sample larvae were divided into four groups again, including group 1 in which the larvae were lightly pierced in the postabdomen with 1-mL syringe needles previously immersed into 1×PBS, group 2 with larvae injected with 5 µL anti-mdproPO serum (previously made in our laboratory) for 30 min, group 3 with larvae challenged with 1-mL syringe needles previously immersed into the mixture of S. aureus and E. coli (1.5 × 108 CFU/mL) for 30 min, and group 4 with larvae that first were injected with 5 µL anti-mdproPO serum for 30 min and then challenged by the mixture of S. aureus and E. coli for 30 min. The hemolymph of larvae (N = 30) in each group was collected in 200 µL anticoagulation buffer (NaCl 110.5 mmol/L, KCl 2 mmol/L, NaHCO3 2.4 mmol/L, NaH2PO4 0.083 mmol/L, EDTA 20 mmol/L; pH 7.0) on ice to reduce spontaneous activation of the proPO cascade in vitro, respectively. Phenoloxidase activity was assayed with L-DOPA (BBI) as a substrate using a previously published 96-well microplates method by our laboratory (Guo et al. 2014). Briefly, a mixture of 150 µL containing 30 µL treated hemolymph, 170 mmol/L CaCl2 (30 µL), and 1×PBS (KH2PO4 1.76 mmol/L, Na2HPO4 10.14 mmol/L, KCl 2.7 mmol/L, NaCl 140 mmol/L; pH 7.0) of 90 µL was preincubated at 30° for 5 min, after which 30 µL of the substrate solution (20 mmol/L L-DOPA) was added, and the mixture was continually incubated at 30° for 10 min. The increase in absorbance at 490 nm was measured using a microplate reader (Thermo Labsystems). The significant variation between control and tested samples was calculated by t-test.
Data availability
The M. domestica Illumina reads sequences produced in this study have been submitted to NCBI under project number PRJNA279218, BioSample number SRS885965, and SRA study accession SRP056620, whereas the Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GDAV01000000.
Results and Discussion
Illumina RNA-seq and de novo assembly
To identify the key genes related to the proPO system in M. domestica, the combined transcriptome was obtained by Illumina RNA-seq deep sequencing using 15 kinds of samples from housefly larvae, pupae, and adults that were either normal or bacteria-challenged (at 12, 24, and 48 hr) by the mixture of E. coli and S. aureus. A total of 63,490,654 raw reads were produced from a single run. After filtration, 54,821,138 high-quality clean reads remained and were de novo assembled by Trinity into 223,936 contigs. The contigs were further assembled and clustered into 89,842 unigenes with a mean length of 532 nt, which consisted of 30,105 distinct clusters and 59,737 distinct singletons (Table 1). The clean reads and contigs, respectively, were more than 219-fold and nearly 17-fold compared with their counterparts obtained by pyrosequencing housefly larval RNA samples only (Liu et al. 2012), whereas contigs and unigenes were all 1.9-fold more than those from another transcriptome constructed by nine kinds of housefly samples, including normal eggs, larvae, pupae, and adults, as well as bacteria-challenged larvae (at 6, 24, 48 hr) (Tang et al. 2014). Similarly, the contigs increased to 15-fold in comparison with that of the third transcriptome prepared only by insecticide-resistant adult flies (Li et al. 2013). Therefore, this M. domestica transcriptome possessed the most comprehensive data compared with previous transcriptomes (Figure 1, Table 2).
Comparison between different M. domestica transcriptomes. (A-D) comparisons of samples, clean reads, contigs, and unigenes of three M. domestica transcriptomes marked for 1, 2, and 3 on the abscissa, in which 1 stands for the housefly transcriptome published in 2012 (Liu et al. 2012), 2 stands for the transcriptome published in 2014 (Tang et al. 2014), and 3 stands for the transcriptome in this study.
Of the 89,842 unigenes, 67,349 (75%) had reliable CDS. Of them, 60,877 (67.8%) CDSs were predicted by BLASTx, and 6472 (7.2%) CDSs that had no hit BLAST were further predicted using ESTScan (Table 3).
In current research, most contigs ranged from 200 nt to 3000 nt, and the lengths of unigenes ranged mostly from 300 nt to 3000 nt after the short unigenes (<300 nt) are removed. The numbers of contigs and unigenes decreased as their sequence sizes increased (Figure 2). To illustrate, the amount of unigenes longer than 500 nt was 36,190 (40.28%), whereas that longer than 1000 nt declined to 12,408 (13.8%). Similarly, 36,005 CDSs ranged from 201 nt to 500 nt, whereas only 247 CDSs were longer than 3 kb. It is worth mentioning that 391 contigs (>3 kb) are approximately 49-times that in published transcriptome (Liu et al. 2012).
Length distribution of contigs, unigenes, and BLAST CDSs in M. domestica transcriptomic library.
The assembly quality of the unigenes in this transcriptome database was demonstrated by sequencing nine selected unigenes (Table 4), which showed that the coincidence rates between unigenes and sequencing segments were more than 98%, suggesting that the transcriptomic data were highly reliable.
Functional annotation of unigenes
All 89,842 unigene sequences were aligned by BLASTx to the protein databases of Nr, Swiss-Prot, KEGG, COG, and GO and to the nucleotide database of Nt using an E-value cut-off of 10−5. A total of 63,237 unigenes (70.4%) were annotated with known genes against Nr (60,353; 67.2%), Nt (32,818; 36.5%), Swiss-Prot (46,084; 51.3%), KEGG (41,578; 46.3%), COG (17,549; 19.5%), and GO databases (44,346; 49.4%) (Table 5). The remaining unigenes (26,605, 29.6%) that could not be annotated to the existing databases might be potential sources of novel genes. Compared to previous reports in which the annotated unigenes, respectively, were 8166 (24.19%) against Nr (Liu et al. 2012) and 27,021 (57.39%) in total against Nr, Swiss-Prot, KEGG, and COG (Tang et al. 2014), this study provides the most annotated unigenes of 63,237 (70.4%).
As shown in Figure 3, the 17,549 unigenes were classified into 25 COG functional categories, which was the same as two previous reports of D. antiqua (Zhang et al. 2014) and M. domestica (Tang et al. 2014). Among 25 COG categories, the largest one was general function prediction with the biggest gene numbers and percentage (6627; 37.76%), followed by carbohydrate transport and metabolism (3644; 20.76%) and transcription (3584; 20.42%). The smallest cluster was nuclear structure (17; 0.097%). The cluster of defense mechanisms (366; 2.09%) listed fourth from the bottom, which had considerable genes in comparison with that of D. antiqua (109; 1.7%) (Zhang et al. 2014) and M. domestica (279; 1.2%) (Tang et al. 2014).
COG annotations of putative proteins. All putative proteins were classified functionally into 25 categories marked as A-W and Y-Z based on their COG annotations. Of them, the R group was the largest category and accounted for 37.76%, followed by groups G (20.76%) and K (20.42%). In contrast, those marked as Y (0.097%), W (0.99%), and A (1.35%) belonged to the smallest groups.
Of the 60,353 Nr hits, GO terms were assigned to 44,346 unigenes (49.4% of total) for functional categorization, which were divided into 59 functional groups, and were allocated into three main categories, including biological process, cellular component, and molecular function (Figure 4). The top five belonged to the groups of cellular process (32,548; 73.40%), single-organism process (28,495; 64.26%), cell and cell part (26,014; 58.66%), binding (24,663; 55.61%), and metabolic process (24,372; 54.96%). Seven groups including nucleoid, virion, virion part, channel regulator activity, metallochaperone activity, morphogen activity, and nutrient reservoir activity had more than 10 but less than 100 genes. No more than 10 genes were clustered to the groups of protein tag, cell killing, and receptor regulator activity.
GO classification of predicted genes. Three GO categories of genes including biological process, cellular component, and molecular function were presented. The numbers of genes in a given functional group are shown to the right of the y-axis and the percentages are marked to the left of the y-axis.
By searching the KEGG database with the 89,842 unigenes, a total of 41,578 unigenes (46.3%) were mapped to 258 KEGG pathways (Table S2). The most representative pathways were metabolic pathways (5704 unigenes; 13.72%), followed by the cancer (5704; 3.8%), focal adhesion (1469; 3.5%), regulation of actin cytoskeleton (1289; 3.1%), and RNA transport pathways (1221; 2.9%). Notably, a subset of unigenes was assigned to some immune-related pathways, including coagulation cascades and signaling transduction pathways.
In comparison to previous housefly transcriptome published in 2014 (Tang et al. 2014) in which only 53.39% (27,021) unigenes were annotated and less functional annotation information were provided, including 25 COG categories, 48 GO groups, and 239 KEGG pathways, this transcriptome possesses a more abundant transcript information.
Identification of key genes related to the proPO system
On the basis of functional annotations of 89,842 unigenes and similarity search to known genes, a batch of unigenes that might be full lengths or parts of several putative key genes involved in the M. domestica proPO system was harvested. After artificial sequence splicing through the removal of very short unigenes, i.e., those without important structural domains or significant hits, a total of 66 unigenes remained and underwent further sequence analyses and phylogenetic analyses. As a result, these unigenes were clustered into different groups with known genes of PGRPs, proPOs, PAPs, and Serpins, suggesting M. domestica might have these analogs, called as mdPGRPs, mdproPOs, mdPAPs, and mdSerpins (Table 6).
MdPGRP genes:
Previous studies have demonstrated that PGRPs can sense and bind peptidoglycan (PGN) of invading microbial pathogens and subsequently initiate the proPO cascade and melanization in insects (Kurata 2014; Yoshida et al. 1996). So, it is very important to identify mdPGRP genes for exploring the activating mechanism of the proPO system in M. domestica.
Since PGRPs were first purified from the silkworm (B. mori) hemolymph (Yoshida et al. 1996), up to 19 PGRPs have been identified and categorized into two groups of short type (PGRP-S) and long type (PGRP-L) in insects. The short PGRPs have a signal peptide, whereas long PGRPs may lack an export signal but have some transmembrane domains. In addition, PGRP genes are conservative, with at least one PGRP domain similar to N-acetylmuramyl-alanine amidases from insects to mammals (Kurata 2014).
As expected, 21 putative mdPGRP unigenes all have one PGRP domain by SMART analysis in the present study. With the exception of the PGRP domain, the vast majority of unigenes also contain a predicted overlapped amidase domain at their C-terminus. Phylogenetic analyses showed that 21 mdPGRPs were clustered into eight groups with 14 DmPGRPs of D. melanogaster, in which each group contains at least one specific DmPGRP gene, suggesting M. domestica may possess eight different mdPGRP genes (Figure 5). Interestingly, the eight putative mdPGRP genes were also sorted into short type (mdPGRP-SA, -SB, -SC, and -SD) and long type (mdPGRP-LA, -LB, -LC, and -LE), as detected in D. melanogaster (Kurata 2014). Compared with previous studies including transcriptomes (Liu et al. 2012; Tang et al. 2014) and whole genome (Scott et al. 2014), in which the PGRP genes separately were one, 16, and nine in M. domestica, it is acceptable for M. domestica to have eight putative mdPGRP genes in this study.
Phylogenetic analysis of mdPGRP unigenes. The amino acid sequences from 21 mdPGRP unigenes of M. domestica and 14 PGRP genes of D. melanogaster including Dm-PGRP SAb (AHN59598), Dm-PGRP SB1 (AAF49420), Dm-PGRP SB2b (AGB94663), Dm-PGRP SC2 (AAF59051), Dm-PGRP SD (AAF50530), Dm-PGRP LAf (AAF50303), Dm-PGRP LBd (AFH06370), Dm-PGRP LCa (AAF50302), Dm-PGRP LCg (ACZ94669), Dm-PGRP LDc (AAO41277), Dm-PGRP LDd (ACL83248), Dm-PGRP LEa (AAF48519), Dm-PGRP LEb (AFH07408), and Dm-PGRP LF(AAF50301) were used to build the NJ phylogenetic tree by MEGA 5.0 with 1000 bootstraps. The 21 mdPGRP unigenes were clustered into eight groups, with each group containing at least one specific Dm-PGRP gene (marked diamond block).
It is notable that the mdPGRP-LE gene may not have amidase activity, because the sequence alignment indicated that both mdPGRP-LE and DmPGRP lacked the equally critical catalytic residues in their amidase domain (data not shown), suggesting mdPGRP-LE might be the initiator of the proPO activation cascade, as determined in other insects (Kurata 2014; Tindwa et al. 2013).
MdproPO genes:
Since the proPO sequences of M. sexta and B. mori were reported in the 1990s, the number of proPO genes has been more than 20 in insects (Zhu et al. 2011). Most insect species contain two proPO genes, as in M. sexta and B. mori, but in D. melanogaster the number of proPO genes gets three (Ross et al. 2003) and in A. gambiae it becomes nine (Christophides et al. 2002). According to whether the encoded proteins contain the representative domains like Hemocyanin M, Tyrosinase, Hemocyanin C, and Hemocyanin N, eight proPO unigenes were picked out from this housefly transcriptomic database and clustered into two groups with 21 known proPO genes of other insect species through phylogenetic analyses. Of them, six selected unigenes and the SbproPO1 gene of Sarcophaga bullata were assigned to the same group, and the other two unigenes and the SbproPO2 gene of S. bullata were assigned to another group (Figure 6), suggesting that M. domestica at least has two sequential diverged mdproPO genes, defined as mdproPO1 and mdproPO2. In recent years, M. domestica had been reported to have seven or eight proPO transcripts, respectively, in transcriptome (Tang et al. 2014) and whole genome (Scott et al. 2014). Therefore, this study is consistent with previous reports.
Phylogenetic analysis of mdproPO unigenes. The amino acid sequences from eight unigenes of mdproPO in M. domestica and 21 proPO genes in other species including M. domestica (Md, 1), Sarcophaga bullata (Sb, 2), Drosophila melanogaster (Dm, 3), Anopheles gambiae (Ag, 2), Anopheles stephensi (As,1), Armigeres subalbatus (Ars, 3), Aedes aegypti (Aa, 1), Manduca sexta (Ms, 2), Bombyx mori (Bm, 2), Tenebrio molitor (Tm, 1), Holotrichia diomphalia (Hd, 2), and Eriocheir sinensis (Es, 1) were used to build the NJ phylogenetic tree by MEGA 5.0 with 1000 bootstraps. Six unigenes (circle) were clustered with Sb-proPO1 (disc) into group 1, and the other two unigenes (triangles) were clustered with Sb-proPO2 (triangle block) into group 2.
MdPAP genes:
In insects, PAPs are the final key activators in the proPO system and can directly activate proPO into PO by cleaving proPO at an Arg-Phe bond at approximately residue 50 (An et al. 2013; Pang et al. 2014). At present, several PAPs have been identified from B. mori (Satoh et al. 1999), M. sexta (Jiang et al. 2003), D. melanogaster (An et al. 2013), Anopheles gambiae (An et al. 2011a), H. diomphalia (Kim et al. 2002), and T. molitor (Kan et al. 2008), which typically contain a C-terminal catalytic domain (Tryp_SPc) and one or two N-terminal clip domains, therefore called as clip-domain serine proteases (Jiang and Kanost 2000). In this M. domestica transcriptomic database, 14 unigenes were annotated as clip-domain serine proteases, which is more than double that of published transcriptome with six clip-domain serine protease genes (Tang et al. 2014). After removing two sequences with uncompleted domains, the remaining 12 unigenes were clustered into three groups together with 12 known PAP genes by phylogenetic analyses (Figure 7), suggesting M. domestica might have three PAP genes named as mdPAP1, mdPAP2, and mdPAP3, as described in both M. sexta and H. diomphalia (Jiang et al. 2003; Kim et al. 2002). It is notable that no SPH gene was retrieved from any transcriptome (Liu et al. 2012; Tang et al. 2014) as well as genome of M. domestica (Scott et al. 2014). Thus, we speculated that SPH might be unnecessary for mdPAPs to effectively activate the proPO system in M. domestica.
Phylogenetic analysis of mdPAP unigenes. The amino acid sequences from 12 mdPAP unigenes in M. domestica and 12 known genes of prophenoloxidase activating factor in other species including Drosophila melanogaster (Dm-easter AAF55170; Dm-MP1 AAF52151.3; Dm-MP2 AAF54143.1), Holotrichia diomphalia (Hd-PPAE I precursor BAA34642.1; Hd-PPAF III BAC15604.1), Tenebrio molitor (Tm-SPE BAG14262.2), Bombyx mori (Bm-BAEE NP_001036844.1; Bm-PPAE NP_001036832.1), Manduca sexta (Ms-PAP1 AAX18636.1; Ms-PAP2 AAL76085.1; Ms-PAP3 AAO74570.1), and Tachypleus tridentatus (Tt-PCE AAA30094.1) were used to build the NJ phylogenetic tree by MEGA 5.0 with 1000 bootstraps. The circle-marked five unigenes and Dm-easter and Dm-MP1 were clustered into group 1. Two unigenes and one Dm-MP2 (diamond) were divided to group 2. The other five unigenes and Tt-PCE (triangle) were clustered into group 3.
mdSerpin genes:
Serpins have been identified in nearly all species that contain a conserved serpin domain that are usually 350 to 400 amino acid residues long. Some serpins do not have an N-terminal signal sequence and function intracellularly; in contrast, the others have an N-terminal signal sequence and function extracellularly (Gulley et al. 2013). In this study, 123 unigenes were annotated as serine proteinase inhibitors or serpins. After being checked manually, 25 unigenes were retrieved with relatively complete serpin domains. To date, more than 10,000 serpin sequences have been published; the D. melanogaster genome contains 29 Dmserpin genes (Reichhart 2005), whereas the cattle tick R. (Boophilus) microplus has 18 Rmserpin genes (Tirloni et al. 2014). Combined with the function annotations and the phylogeny analyses of intraspecies (Figure S1) and interspecies (Figure 8), the 25 retained serpin unigenes were divided into 11 categories preliminarily, suggesting M. domestica might have at least 11 putative mdSerpin genes. Compared with two previous reports in which the number of Serpin genes, respectively, was 10 (Tang et al. 2014) and 11 (Scott et al. 2014), the 11 putative mdSerpin genes are acceptable in M. domestica.
The interspecific phylogenetic analyses of mdSerpin unigenes. The amino acid sequences from 22 mdSerpin unigenes in M. domestica and seven Dm-Sps (Dm-Sp1 to Dm-Sp7: FBpp008812, FBpp0079094, FBpp0079171, FBpp0079243, FBpp0080979, FBpp0289586, FBpp0110138) of Drosophila melanogaster and 18 Rm-Sps (Rm-Sp1 to Rm-Sp18: AHC98652.1, AHC98653.1, AHC98654.1, AHC98655.1, AHC98656.1, AHC98657.1, AHC98658.1, AHC98659.1, AHC98660.1, AHC98661.1, AHC98662.1, AHC98663.1, AHC98664.1, AHC98665.1, AHC98666.1, AHC98667.1, AHC98668.1, AHC98669.1) of Rhipicephalus microplus were used to build the NJ phylogenetic tree by MEGA 5.0 with 1000 bootstraps. Eleven representative unigenes (triangle) were clustered into 11 groups together with known genes of D. melanogaster and R. microplus.
Altogether, we have identified 24 putative genes through M. domestica transcriptomic analyses, including eight mdPGRP (-SA, -SB, -SC, -SD, -LA, -LB, -LC, -LE), two mdproPO (1–2), three mdPAP (1–3), and 11 mdSerpin (1–11), but short of mdSPH. We selected nine unigenes that were respectively defined as mdPGRP (LE, SC), mdproPO1, mdPAP (1, 2, 3), and mdSerpin (3, 11) and conducted an alignment to their matched sequences in the RefSeq database of the M. domestica genome (Scott et al. 2014). The results indicated that transcript coverage varied from 75% to 100%, the average was 91% of their length, and the e-values of 98% alignments were 0.0 (Table 7). Therefore, it is reasonable for us to infer that M. domestica might have such a novel proPO system composed of these candidate genes, as illustrated in Figure 9.
Simplified schemes of the activation of proPO system in Musca domestica. The proPO system might be triggered by pattern-recognition proteins such as PGRP that have bound to pathogen-associated molecular patterns like PGN. Then, a proteolytic cascade is activated in which the upstream proteinases are activated by the complex of PGRP and PGN, culminating in the activation of pro-forms of prophenoloxidase-activating enzymes (promdPAPs), which are cleaved into active mdPAPs; some mdPAPs are capable of directly cleaving mdproPOs into active mdPOs and the mdPOs catalyze phenols into quinones that are converted into melanin spontaneously. The activating process may be suppressed by several serine proteinase inhibitors (mdSerpins) that specifically block different steps of the activation cascade.
Real-time RT-PCR analysis
To verify whether the predicted genes participate in the activation of the proPO system, the expression pattern analyses were implemented on eight selected unigenes in larvae challenged by E. coli or S. aureus (at 0, 4, 6, 12, and 24 hr) using qRT-PCR methods (Figure 10). After E. coli infection, the significantly higher expression levels were observed at 4 hr after challenge for mdPAP1, mdPAP2, and mdproPO1, at 12 hr after challenge for mdPGRP-SC and mdPGRP-LE, and at 24 hr after challenge for mdPAP3, respectively. On the contrary, the mRNA levels of mdSerpin3 and mdSerpin11 decreased within 0 to 24 hr after challenge, especially at 24 hr after challenge with the most significant downregulation, as shown in Figure 10A. In comparison with E. coli challenge, the S. aureus challenge also resulted in similar expression patterns on these selected genes in larvae (Figure 10B). The similar qRT-PCR results were also observed in a previous report in which those speculated genes of PGRP, PAP, and proPO were all upregulated except Serpin, whose expression pattern was strangely upregulated in M. domestica larvae (Tang et al. 2014). In the present study, the expression profiles of the tested genes were mainly inconsistent with the activating progress of proPO system, which may reflect their functional diversifications in the activation of the proPO system in M. domestica.
Expression profiles of candidate genes. The expression profiles of selected genes were detected by qRT-PCR using the third instar larvae at different time intervals (0, 2, 4, 6, 12, 24 hr) after challenge by E. coli (A) or S. aureus (B), in which actin acted as the quantity and quality control to normalize interest gene expression level. The error bars represent the mean ± SD of three repeat amplifications. The asterisks represent significant differences from the control (unpaired t-test, ***P < 0.001, **P < 0.01, *P < 0.05).
Measurement of phenoloxidase activity
We had previously measured PO activity that could increase rapidly in housefly larvae after challenge by S. aureus and E. coli with the 96-well microplates method (Guo et al. 2014) but did not know whether it was related to mdproPO1 cleavage into mdPO1. Here, prior to immune challenge by the mixture of S. aureus and E. coli, housefly larvae were injected with mdproPO1 antibodies to block the interior mdproPO1 proteins. As an important result, the PO activity increase was reversed to the very low level in infected larvae after blocking with mdproPO1 antibodies (Figure 11). It was confirmed that mdproPO1 can play an indispensable role in the activation of proPO system in M. domestica.
Measurement of PO activity. Phenoloxidase (PO) activity of hemolymph samples were assayed with 96-well microplates method using L-DOPA as a substrate. The PO level in hemolymph from normal larvae remained constant, as marked in the control (blank column). Compared with controls, four sample groups marked 1, 2, 3, and 4 (decorative columns) showed the variable increases in PO activity in the hemolymph. The most significant increase in PO activity appeared in sample 3 from larvae challenged by S. aureus and E. coli for 30 min. The lowest increase in PO activity existed in sample 2 from larvae injected with mdproPO 1 antibodies. Importantly, the marked increase in PO activity in bacteria-infected larvae was reverse of the quietly low level after blocking with mdproPO 1 antibodies in sample 4, which was close to the PO level in sample 1 from PBS challenged larvae. The significant variation between control and tested samples was calculated by t-test (***P < 0.001, **P < 0.01, *P < 0.05).
Conclusion
We have constructed the combined transcriptome by Illumina Hiseq2000 sequencing using multiple RNA samples from normal and bacteria-challenged larvae, pupae, and adults of M. domestica. The 4.93-Gb nucleotides were obtained and orderly assembled into 54.8 million clean reads, 223,936 contigs, and 89,842 unigenes, which represented the comprehensive transcriptomic resource currently available in M. domestica.
This study is the first to identify 24 putative genes related to the housefly proPO system, including eight mdPGRP, two mdproPO, three mdPAP, and 11 mdSerpin, but no mdSPH. The sequence alignment has demonstrated that these putative genes were highly reliable to their matched sequences in M. domestica genome. The qRT-PCR results also suggested that these putative genes might participate in the activating process of the housefly proPO system. It is important that the activation of the proPO system has been testified by measuring the changes of PO activity in bacteria-infected larvae after proPO antibody blockage.
Conclusively, this work may serve as a substantial foundation to outline the framework of the proPO system and further study the activation mechanism and immune defense functions of the proPO system in M. domestica.
Acknowledgments
We thank Prof. Jin-Xing Wang, School of Life Sciences, Shandong University in China, for providing good suggestions regarding the manuscript. This work was supported by research grants from Shandong Provincial Natural Science Foundation of China (no. ZR2011CM012), Shandong Provincial Science and Technology Development Projects of China (no. 2012GGB01172), and the Ph.D. Program Foundation of Jinan University of China (no. XBS0850).
Footnotes
Supporting information is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.115.016899/-/DC1
Communicating editor: M. Boutros
- Received January 22, 2015.
- Accepted June 26, 2015.
- Copyright © 2015 Li et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.