Whole Genome Sequencing and Comparative Genomics of Two Nematicidal Bacillus Strains Reveals a Wide Range of Possible Virulence Factors

Bacillus firmus nematicidal bacterial strains are used to control plant parasitic nematode infestation of crops in agricultural production. Proteases are presumed to be the primary nematode virulence factors in nematicidal B. firmus degrading the nematode cuticle and other organs. We determined and compared the whole genome sequences of two nematicidal strains. Comparative genomics with a particular focus on possible virulence determinants revealed a wider range of possible virulence factors in a B. firmus isolate from a commercial bionematicide and a wild type Bacillus sp. isolate with nematicidal activity. The resulting 4.6 Mb B. firmus I-1582 and 5.3 Mb Bacillus sp. ZZV12-4809 genome assemblies contain respectively 18 and 19 homologs to nematode-virulent proteases, two nematode-virulent chitinase homologs in ZZV12-4809 and 28 and 36 secondary metabolite biosynthetic clusters, projected to encode antibiotics, small peptides, toxins and siderophores. The results of this study point to the genetic capability of B. firmus and related species for nematode virulence through a range of direct and indirect mechanisms.


KEYWORDS
Bacillus firmus complete genomes bioinformatics biological control nematicidal activity virulence factors Bacillus firmus (Bacillales; Bacillaceae) is a species of Gram-positive, rod-shaped bacteria capable of producing endospores under adverse conditions. B. firmus is a cosmopolitan species that can be isolated primarily from soil, but also from a variety of different environments including wastewater and marine sediments (Geng et al. 2014). B. firmus has the potential for biotechnological exploitation, since various strains have been studied for bioremediation applications, such as: enzyme production from agro-industrial waste (Husseiny et al. 2008); heavy metals detoxification (Bachate et al. 2013;Noroozi et al. 2017); textile dye discoloration in wastewater discharge (Rathod and Pathak 2018); and microbial-enhanced heavy oil recovery (Shibulal et al. 2018), among others. Importantly, B. firmus is an agriculturally useful bacteria, as it has been demonstrated to promote plant growth and alleviate abiotic saline stress (El-Esawi et al. 2018), as well as offering plants protection against plant-parasitic nematodes, including root-knot nematodes (RKNs) (Keren-Zur et al. 2000;Giannakou et al. 2004). RKNs belonging to the Meloidogyne genus (Nematoda, Tylenchida, Meloidogynidae) are endoparasites and among the most important plant pests (Sasser 1977;Trudgill and Blok 2001;Bebber et al. 2014). RKNs infestations lead to physiologically stressed, low-yielding plants (Abad et al. 2003) and 5% crop yield loss on average (Carter and Sasser 1985), significantly contributing to about 110 billion EUR per year in economic damage due to plant-parasitic nematodes in agricultural systems across the world (Danchin et al. 2013).
Various broad-spectrum chemical pesticides have been used in crop production for decades to minimize the damage caused by soil-borne plant parasitic nematodes, but many were banned or phased out due to the associated toxicities (Regulation (EC) No. 1107/2009). Biopesticides based on microbial biocontrol agents can be used to control RKNs infestation as a safer alternative to agrochemicals and many candidate bacterial and fungal species have been studied for their potential use as biological nematicides (Wilson and Jackson 2013). Different biopesticides against RKNs are also commercially available, but preparations (AgroGreen) being deployed on the market in the early 2000s (Keren-Zur et al. 2000;Giannakou et al. 2004). Despite commercial use, little is known about the exact mode of action of B. firmus against RKNs Geng et al. 2016;Valencia and Kotcon 2016). Bacterial proteases are considered to be the primary virulence factor in nematicidal B. firmus strains (Lian et al. 2007;Tian et al. 2007;Geng et al. 2016). However, Bacillus spp. are known to produce various secondary metabolites, enzymes and toxins, which have never been studied for nematicidal activity. One example is a cereulide-like emetic toxin purified from B. firmus ATCC 14575 T and ATCC 8247 strains (Taylor et al. 2005). Additionally, Bacillus spp. may inhibit RKNs or alleviate their effects on plants indirectly by strengthening plant defense mechanisms (Kloepper et al. 2004), release of repellents (Valencia and Kotcon 2016), and plant growth-promotion (El-Esawi et al. 2018). Screening for secondary metabolite production potential in nematicidal bacteria could thus be warranted since there are various secondary metabolite compounds from other organisms with substantial nematotoxic properties (Khalil 2013). With the accessibility of next-generation sequencing, bacterial genomes can be assessed through bioinformatics analysis for the genetic potential to produce nematicidal substances, including novel secondary metabolites and virulence factors useful in agriculture. This approach has been used by Zheng et al. (2016) to screen the genomic sequences of 120 Bacillus strains exhibiting nematicidal activity against bacterivorous nematode Caenorhabditis elegans for the presence of various virulence factors. The whole-genome sequence analysis of B. firmus strain DS-1 was also the first step in the determination of nematicidal serine protease Sep1, capable of inhibiting the growth and development of the nematodes C. elegans and Meloidogyne incognita (Geng et al. 2014;2016). Bioinformatics investigation of the nature of B. firmus virulence against plant parasitic nematodes like RKNs is however still lackingespecially assessment of the range of possible virulence factors that could be found in B. firmus genomes.
The aim of this study was to compare the range of possible virulence determinants in the genomes of two geographically and phylogenetically distinct nematicidal Bacillus strainsa B. firmus isolate from widely used commercial bionematicide and a local wild-type Bacillus sp. isolate with nematicidal activity.

Bacterial strains
Two Bacillus strains with nematicidal potential were used in this study. Strain B. firmus I-1582 was isolated from the commercially available biological seed treatment preparation VOTiVO FS (Bayer CropScience, Germany), which is used to protect against soil-borne plant parasitic nematodes. Strain Bacillus sp. ZZV12-4809 was isolated from the pea (Pisum sativum L.) rhizosphere, in Maribor, Slovenia. Both strains were cultivated in LB liquid and solid media at 23°. The nematicidal activity of both isolates has been demonstrated in prior in vitro and pot experiments (Susi c et al. 2019). For endospore differential staining, bacterial cultures in exponential growth phase were inoculated into liquid Difco Sporulation Medium (DSM) (Monteiro et al. 2014) for 48h at 37°, stained according to Schaeffer and Fulton (1933) and visualized under a microscope.

Comparative genomics and bioinformatics analysis
The OrthoVenn web server with default parameters (E-value 1e25 and inflation value 1.5) (Wang et al. 2015; http://www.bioinfogenome.net/ OrthoVenn/) was used to identify orthologous gene clusters in the assembled genomes of B. firmus I-1582 and Bacillus sp. ZZV12-4809, and five other B. firmus sequences available in GenBank as described above. Homologs to known transporters, virulence factors, drug targets and antibiotic resistance genes, were identified using PATRIC by homology to known sequences in the following databases: Transporter Classification Database (TCDB) (Saier et al. 2016 . Antimicrobial resistance (AMR) genes were also annotated by using the k-mer-based AMR gene detection method, which utilizes PATRIC's curated collection of representative AMR gene sequence variants (Wattam et al. 2017). Putative nematode-virulent proteases were revealed by protein BLAST search on the assembled genomes, by sequence similarity with an e-value of 10 25 and minimum 30% sequence identity over 60% of both protein lengths (Rost 1999;Galagan et al. 2003). Selected protein sequences were aligned with Clustal Omega (https://www.ebi.ac.uk/ Tools/msa/clustalo/) to calculate the percent (%) identity matrix and presented in a heatmap. The annotated genome assembly sequence files, including information for contigs and ORFs, were analyzed with antiSMASH v4.2.0 (Blin et al. 2017) and RIPPMiner with pairwise BLAST (Agrawal et al. 2017) to predict and analyze various types of secondary metabolite gene clusters.

Data availability
The complete genome sequence data, including raw sequence reads, genome assemblies and annotations of B. firmus I-1582 and Bacillus sp. ZZV12-4809 used in this study were submitted to NCBI, GenBank under the BioProject accession ID: PRJNA533096. Supplemental material available at figshare: https://doi.org/10.25387/g3.11522544.

Microbial features and identification of bacterial species
Colony and cell morphology as well as 16S rRNA gene and whole genome sequence analysis were performed to determine species identity of the two studied nematicidal bacterial strains. Strains I-1582 and ZZV12-4809 generally formed round, pale-cream colonies on LB agar medium under the described growth conditions, with 2 mm-long rodshaped cells able to form endospores (Supplementary Figure 1B,C). B. firmus I-1582 and Bacillus sp. ZZV12-4809 16S rRNA gene sequence amplification (Supplementary Figure 1A) and EzBioCloud analysis showed 100% and 99.23% similarity (Supplementary Table 1), respectively, to the 16S rRNA sequence of B. firmus strain NBRC 15306 T (GenBank accession: BCUY01000205). Tetra analysis was in accordance with 16S rRNA results, and a TCS search for related genomes returned B. firmus DS1 as the closest genome in the database to I-1582 and ZZV12-4809, with Z-scores of 0.99866 and 0.99709, respectively (Supplementary  Table 2). According to the strict 95% cut-off value to delineate species boundary (Richter and Rosselló-Móra 2009;Chun et al. 2018), strain I-1582, but not ZZV12-4809, could be considered as B. firmus. Phylogenetic analysis within the genus Bacillus to discern the phylogenetic position of this strain accompanied by the ANI values showed that ZZV12-4089 was still most closely related to the B. firmus group (Supplementary Figure 2). ANI calculations and phylogenetic analysis indicated previously unreported variability within the strains of B. firmus that points to incorrect species circumscription in some strains. Therefore, description of a novel species could be warranted (Figure 1) Supplementary Figures 3,4). The genomes of the two strains have slightly different G+C content and similar numbers of tRNA and rRNA genes. In B. firmus I-1582, 75.4% of predicted coding sequences (CDSs) were assigned to the clusters of orthologous groups (COG) categories, and in Bacillus sp. ZZV12-4809 74% of predicted CDSs were assigned to COGs, although the functional category of many CDS could not be defined (Table 2). Multiple syntenic blocks were found to be shared with B. firmus I-1582 and Bacillus sp. ZZV12-4809 assemblies. High-synteny blocks (above 90% nucleotide identity), equal to or larger than 15 kbp, were present in 11 contigs in ZZV12-4809, aligning to 15 contigs in I-1582 (Supplementary Figure 5). Gene assignment to the COG categories indicated similar fractions of CDS assigned to different categories in both strains. Some differences in COG categories were observed between I-1582 and ZZV12-4809; certain categories contain a larger fraction of genes in the first while others in the second corresponding genomes (Table 2).
Comparative genomics reveals high overall similarity and certain strain-specific gene clusters Genome-wide analysis of orthologous clusters is frequently used in comparative genomics studiescommonly in whole genome analysis across species, as orthologous genes are clusters of genes in different species, originating by vertical descent from a single gene in the last common ancestor (Wang et al. 2015). Genome-wide analysis of orthologous clusters revealed that the I-1582 and ZZV12-4809 genomes shared the vast majority of gene families / orthologous clusters (3850), while there were 1.03 and 2.33% specific clusters in I-1582 and ZZV12-4809 cluster lists, respectively. Expanding on the ANI results, the I-1582 and ZZV12-4809 proteomes were separately compared with five other B. firmus in the GenBank database ( Figure 2). The strains formed 5341 clusters, 5211 orthologous clusters (listed in at least two strains) and 3116 single-copy gene clusters, when I-1582 was compared with other strains (Figure 2A); or 5550 clusters, 5419 orthologous clusters and 3096 single-copy gene clusters when ZZV12-4809 was compared with others ( Figure 2B). Apart from the orthologous clusters shared by all the strains, some strains shared a proportionally higher number of clusters with each other. 14_TX shared almost 4 times more additional orthologous clusters with ZZV12-4809 than with I-1582, despite the fact that 14_TX and I-1582 share a higher nucleotide identity. This result could be due to the fact that there were simply more clusters formed by the ZZV12-4809 list than I-1582. Comparison of I-1582 and ZZV12-4809 lists identified 40 and 92 clusters respectively specific to I-1582 or ZZV12-4809. These strain-specific clusters could potentially be involved in important biological processes, so affiliations across GO terms were checked for differences between the strains. Genes in these clusters seemed to be involved in different molecular functions in I-1582 or ZZV12-4809 (Supplementary Table 4).

Antimicrobial resistance genes, drug targets, transporters and virulence factors
Groups of genes that could contribute toward virulent activity of studied bacterial strains were investigated. The B. firmus I-1582 and Bacillus sp. ZZV12-4809 genomes were respectively annotated with 75 and 87 genes homologous to known transporters, virulence factors, drug targets and antibiotic resistance genes ( Table 3). The two genomes shared more than 80% of these genes. Most of the shared genes were associated with resistance to triclosan, phenicol, peptide, aminoglycoside and tetracycline classes of antibiotics. However, strain ZZV12-4809 possessed 14 unique genes not present in I-1582 (Supplementary Table 5)most of which seemed to be involved in resistance to various antibiotics. Among these, there were genes vgbA and vatD involved in resistance to streptogramin antibiotics; bcrC, bacitracin resistance; vanR, glycopeptide antibiotic resistance; fosB, fosfomycin resistance; catA9, resistance to chloramphenicol; rlmAII, resistance to the macrolide antibiotics tylosin and telithromycin (Takaya et al. 2013); and genes blaA and blaD, involved in resistance to b-lactam antibiotics. Of the genes present in strain ZZV12-4809 but not in strain I-1582, there were also capB and capC, which were classified as virulence factors involved in anti-phagocytosis (Candela and Fouet 2006); Figure 1 Maximum likelihood phylogenetic tree based on the 8 available genomic assemblies of Bacillus firmus and Bacillus oceanisediminis 2691 with heatmap annotation. Nodes annotated with bootstrap support. Average nucleotide identity (ANI) values (%) are presented in a heatmap, ranging from lower (violet) to higher sequence identity (green-yellow), and clustered according to the phylogenetic tree. ANI analysis showed 100% sequence identity between B. firmus NBRC 15306 and B. firmus NCTC 10335, which is in accordance with information from the NCBI GenBank (data not shown). B. firmus I-1582 exhibited 98.14% sequence identity with strains NBRC 15306 and NCTC 10335 (highlighted in red), and 89.83% sequence identity with Bacillus sp. ZZV12-4809. The heatmap was annotated with a bar chart showing the varying sizes (Mb) of all 8 assemblies (at the top) and their respective GC (%) content (right-hand side). and aapA encoding an Amino-acid permease AapA transporter. Genes for putative gap-family peptidoglycolipid transporter and hemH for coproporphyrin ferrochelatase (EC 4.99.1.9) were found in I-1582 but not in ZZV12-4809. The latter gene product was classified as a drug target for N-Methylmesoporphyrin (Supplementary Table 5).

Multiple nematode virulence factors in I-1582 and ZZV12-4809 genomes
In order to assess the virulence potential of the two sequenced Bacillus strains, we performed a BLASTP search of the previously described virulence factors of bacterial, fungal, plant and animal origin surveyed by Zheng et al. (2016), bacterial proteases identified by Geng et al. (2016), and whole-genome biosynthetic cluster prediction. The candidate virulence factors found were classified into classes such as proteases (Supplementary loosely into groups based on the conserved family domains, as determined with InterProScan. Most of them belonged to the peptidase S8 domain family ( Figure 3). Five, including Immune inhibitor A metalloproteases (InhA), hypothetical proteins and one putative secreted protease found only in ZZV12-4809, belonged to the M6 peptidase family. Three sequences annotated as Zn-metalloproteinase aureolysin (EC 3.4.24.29), were found to have conserved peptidase M4 domain. Six protein sequences were found to contain M48 peptidase domains, although two of these, annotated as HtpX proteases, had substantially shorter amino-acid sequences and did not cluster with the rest of the group. Haemolysin III homologs clustered more readily with peptidase M48 family proteins, while haemolysin A homologs clustered better with peptidase S8 family proteins. Homologs of haemolysin A in B. firmus genomes had similar domains to TlyA from Mycobacterium tuberculosis, which is known to exhibit hemolytic exotoxin activity in vitro (Rahman et al. 2010). Haemolysin III homologs from I-1582 and ZZV12-4809 showed high sequence identity with the haemolysin III family protein from B. firmus DS1, which exhibited some nematicidal activity in vitro against C. elegans N2 (Geng et al. 2016). Most of the putative nematode-virulent proteases had direct homologous counterparts in either the I-1582 or ZZV12-4809 genome, as indicated by high identity correlation in amino-acid sequences ( Figure 3). One protease gene 'fig|1399.11.peg.4746' in the I-1582 did not have a counterpart in the ZZV12-4809 genome while 4 genes ('fig|1399.9.peg.4924', 'fig|1399.9.peg.3481', 'fig|1399.9.peg.1988' and 'fig|1399.9.peg.272') in ZZV12-4809 did not have a counterpart in the I-1582. This study revealed numerous putative nematode-virulent proteases in the B. firmus I-1582 and Bacillus sp. ZZV12-4809 genomes. However, previous work with B. firmus DS1 has demonstrated that most of the tested putative proteases did not exhibit any nematicidal effects and only recombinant protease EWG10090 (Serine protease 1; Sep1) led to significant 73.2% mortality of C. elegans N2 when n■  (Silva et al. 2017). Through a recursive BLAST search, two additional sequences were found in the I-1582 assembly and one in ZZV12-4809 but appeared to lack the catalytic chitinase domain (Supplementary Table 8).
Secondary metabolites: The secondary-metabolite gene clusters (antiSMASH) analysis revealed the presence of 28 and 36 predicted clusters in strains I-1582 and ZZV12-4809, respectively. Of these, only 9 and 12 were assigned to specific types, and five predicted clusters in each genome assembly (clusters 11, 21, 24, 25 and 28 in strain I-1582, and clusters 10, 11, 16, 17 and 35 in strain ZZV12-4809) were found on the contig borders and were likely to be incomplete (Supplementary Table 9). One gene cluster in strain I-1582 and two gene clusters in the Figure 3 Clustering of putative nematode-virulent proteases found in Bacillus firmus I-1582 (light-gray) and Bacillus sp. ZZV12-4809 (dark-gray) genomes. All-against-all comparison of amino-acid sequences is based on percent (%) identity values calculated with Clustal Omega and presented in a heatmap, ranging from lower (violet) to higher sequence identity (green-yellow). A dendrogram was created using hierarchical clustering by Euclidean distance. The identities of analyzed proteins are given as genomic annotations together with PATRIC IDs (left-hand margin). Colored highlights (bottom margin) represent clustering of sequences according to the conserved peptidase family domains: S8 (blue), M4 (magenta), M6 (cyan), M48 (orange), or other (no color). Asterisks ( Ã ) denote high-similarity protein sequences to the nematicidal protease Sep1 from B. firmus DS1.
ZZV12-4809 genome were predicted to be involved in the synthesis of the siderophore petrobactin which is considered a virulence factor of Bacillus anthracis (Wilson et al. 2010) and is the primary siderophore produced by this bacterium under iron starvation (Koppisch et al. 2005). Due to the biological importance of iron (Beneduzi et al. 2012), siderophore production by rhizobacteria can indirectly inhibit the growth of phytopathogenic soil microorganisms by limiting the bioavailability of iron, or promote plant growth through plant-microbe interactions (Singh et al. 2011). In I-1582, one lanthipeptide was predicted with 43.5% sequence identity (e = 5e-04) with paenicidin A. Two lanthipeptides were predicted in ZZV12-4809, the first showed very weak sequence identity with plantazolicin (44%, e = 2.9) and the second showed 38.9% identity (e = 6e-10) with cerecidin A7. Bacillus spp. are known to produce class I and II lanthipeptides with antibacterial activitylantibiotics (Barbosa et al. 2015). Clusters 10 and 24 in the ZZV12-4809 genome were found to contain genes associated with bacteriocin-terpene (cluster 10) or bacteriocin (cluster 24) synthesis, while in I-1582, one cluster associated with terpene synthesis was predicted. Bacteriocins are antimicrobial peptides produced by bacteria, which have the ability to kill or inhibit closely related bacterial strains without the negative effects on bacteriocin-producing bacterium (Yang et al. 2014). In ZZV12-4809, cluster 19 was predicted to contain a hybrid type I polyketide synthase-nonribosomal peptide synthetase (PKS-NRPS) system, with 35% of genes similar to paenilamicin biosynthetic cluster. Paenilamicin has previously been identified as an antibiotic compound produced by the honey bee pathogen Paenibacillus larvae (Garcia-Gonzalez et al. 2014). Additionally, polysaccharide-type cluster 5 in I-1582 was found to contain two genes similar to aveBIII and aveBII from the avermectin oleandrose (GenBank: AB032523.1) biosynthetic cluster (Supplementary Table 9). Avermectins are considered to be potent anthelmintics and insecticides (Wohlert et al. 2001), one example being the widely commercialized insecticide / acaricide / nematicide / anthelmintic abamectin from Streptomyces avermitilis (Khalil 2013). The results of this analysis indicated an array of possible secondary metabolites potentially produced by the two studied Bacillus strains. This information is useful for further experimental evaluation of these secondary metabolites for nematicidal activity, especially since the exact mechanism(s) of action by B. firmus against nematodes are not fully understood. Although the cuticle-degrading proteases are thought the primary virulence factor in Bacillus (Lian et al. 2007;Geng et al. 2016), it has been observed previously that various B. firmus strains elicited high nematode mortality rates, while having relatively few homologs or gene copies to known nematode-virulence factors in their genomes compared to other bacteria ). This could be due to high expression rates, novel interaction by known virulence factors, or the existence of as yet uncharacterized nematotoxic compounds.

Conclusions
Through whole genome sequencing, comparative genomics and bioinformatics analysis it was possible to determine differences between the B. firmus I-1582 and Bacillus sp. ZZV12-4809 genomes and to assess their genetic capacity for nematode virulence. Although the B. firmus group appears to be relatively homogenous according to 16S rDNA, ANI analysis of whole genome assemblies available in GenBank highlighted previously unreported variability within the B. firmus group that probably mirrors the diverse habitats from which these strains were originally isolated. The differences in ANI% were sufficient to suggest incorrect species circumscription of some strains in GenBank. Various homologs to known nematode-virulent proteases from different organisms were found, specifically 18 and 19 different homologous sequences were found respectively in the B. firmus I-1582 and Bacillus sp. ZZV12-4809 strains. Additionally, two putative nematode-virulent chitinases were found in ZZV12-4809, containing the GH18-chitinase domain necessary to hydrolyze chitin and degrade chitinous structures in nematode eggs, thus contributing to their mortality. Additionally, 28 and 36 secondary metabolite clusters that could express an array of compounds belonging to siderophores, toxins, bacteriocins, lanthipeptides and antibiotics were predicted respectively in the genomes of B. firmus I-1582 and Bacillus sp. ZZV12-4809. The activity and nematotoxic potential of these putative secondary metabolites has yet to be experimentally characterized, but the resulting body of information points to a range of possible direct and indirect mechanisms of B. firmus and related species that could contribute to nematode virulence and could thus be used for targeted studies on virulence mechanisms, gene expression, and proteomic and metabolic studies of B. firmus, widely used as a biocontrol agent in crop production.

ACKNOWLEDGMENTS
We thank Tanja Zidari c for her contribution during strain isolation and Marko Verce for his help with the program code written in Python.