Cotton Late Embryogenesis Abundant (LEA2) Genes Promote Root Growth and Confer Drought Stress Tolerance in Transgenic Arabidopsis thaliana

Late embryogenesis abundant (LEA) proteins play key roles in plant drought tolerance. In this study, 157, 85 and 89 candidate LEA2 proteins were identified in G. hirsutum, G. arboreum and G. raimondii, respectively. The LEA2 genes were classified into six groups, designated groups 1 to 6. Phylogenetic tree analysis revealed orthologous gene pairs within the cotton genomes. The cotton-specific LEA2 motifs identified were E, R and D, in addition to the Y, K and S motifs. The genes were distributed on all chromosomes. LEA2s were highly enriched in non-polar, aliphatic amino acid residues, with leucine the most abundant at 9.1%. The miRNAs ghr-miR827a/b/c/d and ghr-miR164 targeted many genes known to be drought stress responsive. Various stress-responsive regulatory elements were detected: the ABA-responsive element (ABRE), the drought-responsive element (DRE/CRT), MYBS and the low-temperature-responsive element (LTRE). Most genes were highly expressed in leaves and roots, the primary organs affected by water deficit. Expression levels were much higher in G. tomentosum than in G. hirsutum, and the tolerant genotype had a higher capacity to induce more of the LEA2 genes. Overexpression of the transformed gene Cot_AD24498 showed that LEA2 genes promote root growth and, in turn, confer drought stress tolerance. We therefore infer that, among the LEA2 genes, Cot_AD24498, CotAD_20020, CotAD_21924 and CotAD_59405 could be candidate genes with profound functions under drought stress in upland cotton. The transformed Arabidopsis plants showed higher tolerance to drought stress than the wild type. Under drought stress, the leaves of the transformed lines showed a significant increase in the antioxidants catalase (CAT), peroxidase (POD) and superoxide dismutase (SOD), increased root length, and a significant reduction in the concentrations of the oxidants hydrogen peroxide (H2O2) and malondialdehyde (MDA). This study provides a comprehensive analysis of LEA2 proteins in cotton and thus forms a primary foundation for breeders to utilize these genes in developing drought-tolerant genotypes.

LEA2 proteins are members of the larger late embryogenesis abundant (LEA) protein family (Hundertmark and Hincha 2008). As the name suggests, these proteins are found in large quantities in seeds at the late stages of embryo development (Dure et al. 1983). Although LEA proteins are closely associated with seeds, a number of them have been detected in other plant tissues, such as vegetative tissues (de Nazaré Monteiro Costa et al. 2011). LEA proteins are not restricted to plants; they have also been found in animals (Denekamp et al. 2010) and in bacteria (Espelund et al. 1992). The LEA protein families share a broadly common structural architecture: high hydrophilicity, a low proportion of cysteine (Cys) and tryptophan (Trp) residues, and a high content of arginine (Arg), lysine (Lys), glutamate (Glu), alanine (Ala), threonine (Thr) and glycine (Gly). Because of these common features, LEA proteins are mainly referred to as hydrophilins, with a hydrophilicity index of more than 1 and a glycine (Gly) content of more than 6% (Battaglia et al. 2008).
The late embryogenesis abundant (LEA) proteins have been positively associated with several abiotic stresses, and have been found to confer tolerance in plants such as Brassica napus (Dalal et al. 2009), rice (He et al. 2012) and Fagus sylvatica (Jiménez et al. 2008). For instance, overexpression of the Arabidopsis LEA gene AtLEA3 has been found to enhance tolerance to drought and salinity stresses. Overexpression of a rice LEA gene, OsLEA3-1, was found to confer drought tolerance (Xiao et al. 2007). Similarly, the LEA gene HVA1 from barley was found to confer dehydration tolerance in transgenic rice (Babu et al. 2004). In addition, SiLEA14, a novel gene, was found to be highly expressed in the roots of foxtail millet under drought conditions (Wang et al. 2014). However, the precise roles of LEA proteins are still not well understood. A number of proposals have been made to explain their possible roles in plants during water deficit, such as enzyme protection (Hand et al. 2011), molecular shielding (Furuki et al. 2011), hydration buffering (Hundertmark et al. 2012) and membrane interactions (Olvera-Carrillo et al. 2011). To date, a number of studies have examined the distribution and characterization of LEA proteins in various plants, for instance Arabidopsis (Hundertmark and Hincha 2008), Brassica napus (Dalal et al. 2009) and watermelon (Celik Altunoglu et al. 2017), among others. Despite the significance of the LEA genes, little has been done to investigate their putative role in cotton in relation to drought stress tolerance.
Cotton (Gossypium hirsutum) is an economically important fiber and oil crop cultivated in many tropical and subtropical areas of the world, where it is constantly exposed to a range of abiotic stresses, including drought, extreme temperature and high salinity (Mahajan et al. 2005). The completion and publication of the draft genome sequences of upland cotton G. hirsutum (Li et al. 2015b), Gossypium arboreum (Li et al. 2015c) and Gossypium raimondii have provided a valuable tool for elucidating the transcription factors (TFs) in cotton genomes. Little information is available about the LEA2 subfamily in upland cotton. Therefore, in this study we identified and characterized the LEA2 genes in three cotton genomes and transformed a novel LEA2 gene, Cot_AD24498, into Arabidopsis thaliana. We then investigated the expression levels of the transformed gene in both the transgenic lines and the wild type (WT) under drought stress.

MATERIALS AND METHODS
Identification, Sequence Analysis, Phylogenetic Tree Analysis and Subcellular Location Prediction of the LEA2 Proteins in Cotton
The LEA2 protein sequences of the tetraploid (AD genome) G. hirsutum were downloaded from the Cotton Research Institute website (http://mascotton.njau.edu.cn). The LEA2 protein sequences of G. arboreum (A genome) were downloaded from the Beijing Genome Institute database (https://www.bgi.com/), and those of G. raimondii (D genome) were obtained from Phytozome (http://www.phytozome.net/). The conserved domain of the LEA2 protein (PF03168) was downloaded from Pfam protein families (http://pfam.xfam.org). The hidden Markov model (HMM) profile of the LEA2 protein was used as the query for an HMMER search (http://hmmer.janelia.org/) (Finn et al. 2011) against the G. hirsutum, G. raimondii and G. arboreum protein sequences. The amino acid sequences were analyzed for the presence of the LEA2 protein domain with the ScanProsite tool (http://prosite.expasy.org/scanprosite/) and the SMART program (http://smart.embl-heidelberg.de/). The LEA2 proteins of the three cotton genomes, together with the LEA2 proteins from Arabidopsis (http://www.arabidopsis.org/) and rice (http://rice.plantbiology.msu.edu/index.shtml), were used to investigate the evolutionary history and patterning, in relation to orthology or paralogy, among the proteins encoded by the LEA2 genes. Multiple sequence alignment of all the LEA2 proteins was done with Clustal Omega, and a phylogenetic tree was constructed in MEGA 7.0 using default parameters, as described by Higgins et al. (1996). The physicochemical characteristics of all the obtained LEA2 proteins were determined with an online ExPASy Server tool (http://www.web.xpasy.org/compute_pi/). In addition, the subcellular location of all the upland cotton LEA2 proteins was predicted with Wolfpsort (https://www.wolfpsort.hgc.jp/) (Horton et al. 2007). The subcellular prediction results were further validated with two other online tools, the TargetP 1.1 server (Emanuelsson et al. 2007) and Protein Prowler Subcellular Localization Predictor version 1.2 (http://www.bioinf.scmb.uq.edu.au/pprowler_webapp_1-2/) (Bodén and Hawkins 2005).
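The HMM-based candidate screen described above can be illustrated with a minimal sketch. It assumes the HMMER search was run with tabular output (the `--tblout` format, in which the first column is the target name and the fifth is the full-sequence E-value). The E-value cutoff, the function name and the gene identifiers below are hypothetical and are not taken from this study.

```python
def parse_hmmsearch_tblout(lines, max_evalue=1e-5):
    """Return {target_name: best full-sequence E-value} for hits passing the cutoff.

    max_evalue is an illustrative assumption, not the threshold used in the study.
    """
    hits = {}
    for line in lines:
        # Skip blank lines and the '#' comment/header lines that HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]         # column 1: target sequence name
        evalue = float(fields[4])  # column 5: full-sequence E-value
        # Keep the lowest E-value seen for each target that passes the cutoff.
        if evalue <= max_evalue and evalue < hits.get(target, float("inf")):
            hits[target] = evalue
    return hits


# Made-up rows in --tblout column order (IDs and scores are illustrative only).
example_rows = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "gene_A  -  LEA_2  PF03168.12  1.2e-20  70.1  0.1",
    "gene_B  -  LEA_2  PF03168.12  3.0e-02  12.4  0.0",
    "gene_C  -  LEA_2  PF03168.12  4.5e-09  33.8  0.2",
]
candidates = parse_hmmsearch_tblout(example_rows)
print(sorted(candidates))
```

Candidates retained by such a filter would still need the domain confirmation step (ScanProsite and SMART) described in the text.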

Expression analysis of LEA2 genes and determination of the gene to be transformed
qRT-PCR analysis was used to determine the expression changes of the LEA2 genes in response to drought stress in the two parental lines. The upland elite cultivar, G. hirsutum, is known to be drought sensitive, while the wild tetraploid cotton, G. tomentosum, is drought tolerant (Zheng et al. 2016). The two cotton genotypes were subjected to drought stress for 14 days. Samples for RNA extraction were obtained from the leaves, stems and roots at 0, 7 and 14 days of stress exposure. All samples were taken in three biological replicates from both control and treated seedlings. To select the LEA2 genes for qRT-PCR validation, we relied on RNA-sequencing data profiled under drought stress. The RNA-sequencing data were downloaded from the Cotton Research Institute website (http://mascotton.njau.edu.cn/html/Data). RNA was reverse transcribed to first-strand cDNA using the TransCript-All-in-One First-Strand cDNA Synthesis SuperMix for qPCR (TransGen, Beijing, China). Fluorescent quantitative primers were designed for the selected genes (24 upregulated and 24 downregulated genes) using Primer Premier 5 (Supplemental Table S1). The Actin gene served as the reference. The synthesized cDNA was pre-incubated at 95 °C for 15 sec, followed by 40 cycles of denaturation at 95 °C for 5 sec and extension at 60 °C for 34 sec. The fluorescence quantitative assay was used to analyze the expression levels of the LEA2 genes in the root, leaf and stem tissues of the cotton plants, and the expression changes in G. hirsutum and G. tomentosum under drought stress. The assay was run with three replicates and the results were analyzed with the double delta Ct (ΔΔCt) method.
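The double delta Ct calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency, so that one cycle corresponds to a twofold change, and that replicate Ct values are averaged before subtraction. The function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def fold_change(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values. The reference gene
    (Actin in the text) normalizes for the amount of input cDNA.
    """
    delta_treated = mean(target_treated) - mean(ref_treated)  # dCt, stressed sample
    delta_control = mean(target_control) - mean(ref_control)  # dCt, control sample
    ddct = delta_treated - delta_control
    return 2 ** (-ddct)


# Illustrative Ct values only: the target crosses threshold 3 cycles earlier
# under stress while the reference gene is unchanged.
fc = fold_change(
    target_treated=[22.0, 22.0, 22.0],
    ref_treated=[18.0, 18.0, 18.0],
    target_control=[25.0, 25.0, 25.0],
    ref_control=[18.0, 18.0, 18.0],
)
print(fc)  # 8.0, i.e. an eightfold induction
```

A value above 1 indicates upregulation relative to the control, and a value below 1 indicates downregulation.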
Transformation and Screening of Novel gene Cot_AD24498 (LEA2) in the Model Plant Arabidopsis thaliana (Ecotype Columbia-0) Lines
The gene was transformed into the model plant A. thaliana ecotype Columbia-0 (Col-0). The upland cotton, G. hirsutum, accession number CRI-12 (G09091801-2), was used to confirm the presence of the Cot_AD24498 gene in various tissues. The pWM101-35S:Cot_AD24498 (LEA2) construct in Agrobacterium tumefaciens GV3101 was confirmed with gene-specific primers, the Cot_AD24498 forward primer (5′-CGGATCCATGTCGGTAAAAGAGTGCGGC-3′) and the Cot_AD24498 reverse primer (5′-GGTCGACTTACACGCTAACACTGCATCT-3′), synthesized by Invitrogen, Beijing, China. The Arabidopsis wild-type (WT) plants were transformed by the floral dip method (Clough and Bent 1998). The infiltration medium was mainly composed of 4.3 g/l, sucrose 50 g/l (5%), 2-(4-morpholino) ethane sulfonic acid (MES) 0.5 g/l, Silwet-77 200 µl/l (0.02%) and 6-benzylaminopurine (6-BA) 0.01 mg/l, at a pH of 5.7. Transformed lines of A. thaliana were selected by germinating seeds on half-strength (0.5) MS medium (PhytoTechnology Laboratories, Lenexa, USA) containing 50 mg/l hygromycin B (Roche Diagnostics GmbH, Mannheim, Germany). The seeds were first kept at 4 °C for three days to optimize germination, after which the seedlings were transferred to an Arabidopsis growth room set at 16 hr light and 8 hr dark. After 7 days on selection medium, at the three-true-leaf stage, the seedlings were transplanted into small plastic containers filled with vermiculite and humus in equal ratios. The T0 seedlings were grown to set seed, giving the T1 generation. The T1 seeds were germinated on antibiotic-selective medium, and the one-copy lines were identified by a 3:1 segregation ratio of the antibiotic-selectable marker. Seeds (T2) from the lines segregating 3:1 were again germinated on antibiotic-selective medium, and only the lines in which 100% of the seedlings survived selection were used to develop the T3 generation. The T3 homozygous progeny were bred from the T2 population after real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR). Three of the eight successfully transformed overexpression lines (L2, L3 and L4) were selected using the Cot_AD24498 (LEA2) forward primer (5′-CGAACATCCATCCCTCCAAC-3′) and the Cot_AD24498 (LEA2) reverse primer (5′-ATCATCAAGAAAACCGACCC-3′), with total complementary DNA (cDNA) as template. The phenotypic investigations were carried out in the T3 homozygous generation.
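One common way to check whether a line segregates 3:1 for the selectable marker is a chi-square goodness-of-fit test. The text does not state which test, if any, was used, so the sketch below is an assumption for illustration only. The function names and seedling counts are hypothetical. The critical value 3.841 is the standard chi-square threshold for one degree of freedom at the 0.05 level.

```python
CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1_P05


# Illustrative counts only: 78 hygromycin-resistant and 22 sensitive seedlings.
print(round(chi_square_3_to_1(78, 22), 2))  # 0.48
print(fits_single_locus(78, 22))            # True
```

A line that passes this check is consistent with a single insertion locus, which is what the one-copy selection step aims to identify.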

qRT-PCR Analysis of the Expression of Drought-Responsive Genes in Transgenic Arabidopsis
We assessed the action of the transformed gene in the transgenic lines and the wild type of the model plant A. thaliana by carrying out expression analysis of two drought-responsive genes: the ABRE-binding factor 4 (ABF4) gene, with forward sequence 5′-AACAACTTAGGAGGTGGTGGTCAT-3′ and reverse sequence 5′-TGTAGCAGCTGGCGCAGAAGTCAT-3′, and the responsive to desiccation 29A (RD29A) gene, with forward sequence 5′-TGAAAGGAGGAGGAGGAATGGTTGG-3′ and reverse sequence 5′-ACAAAACACACATAAACATCCAAAGT-3′. Total RNA was isolated from four-week-old transgenic Arabidopsis seedlings and the wild type (Columbia ecotype) grown under normal conditions (CK) and under 15% PEG6000 treatment for 4 days. RNA extraction and real-time RT-PCR (qRT-PCR) analysis were performed as described in the section "Expression analysis of LEA2 genes and determination of the gene to be transformed", with cotton Actin2 (forward sequence 5′-ATCCTCCGTCTTGACCTTG-3′ and reverse sequence 5′-TGTCCGTCAGGCAACTCAT-3′) as the reference gene.
Quantification of oxidant and antioxidants in transgenic lines and the wild type
When plants are exposed to any form of stress, drastic changes occur at both the molecular and cellular levels in order to tolerate the stress factors (Gill et al. 2016). Reactive oxygen species (ROS) are oxidants produced continuously by respiring cells, and plants have an elaborate mechanism to keep their level within nontoxic limits. When stresses such as drought set in, the ROS equilibrium shifts, leading to excessive production. In this work, we evaluated the levels of various oxidants and antioxidants in the transgenic lines (L1, L2 and L3) compared with the wild type under drought stress. Catalase (CAT), superoxide dismutase (SOD), peroxidase (POD), malondialdehyde (MDA) and hydrogen peroxide (H2O2) levels were quantified according to the method described by Bartosz (2005). Seeds of the transgenic lines and the wild type were grown on 0.5 MS for eight days, then transferred to small conical containers filled with a 1:1 mixture of vermiculite and sand and grown for 21 days. After 21 days, water was totally withdrawn from the drought-treated plants for a period of 8 days, while the control plants were watered normally. Leaf samples were harvested for antioxidant and oxidant determination after the 8 days of stress exposure. The samples were obtained in triplicate, each representing a biological replicate.

Availability of Data Statement
The authors affirm that all the data supporting the conclusions of this research work are represented fully within the manuscript and its supplementary files. Supplemental material available at Figshare: https://doi.org/10.25387/g3.6626849.

RESULTS AND DISCUSSION
LEA2 protein encoding genes in the cotton genome and other plants
To identify the LEA2 proteins in the three cotton genomes, we used the hidden Markov model (HMM) profile of the Pfam LEA2 domain, PF03168, as the query to search the three cotton genome sequence databases. Based on the Pfam domain search, we obtained 200 LEA2 genes in G. hirsutum (AD genome), 101 LEA2 genes in G. raimondii (D genome) and 110 LEA2 genes in G. arboreum (A genome). To confirm the genes obtained for the three cotton genomes, we carried out a manual search through SMART (http://smart.embl.de/smart/) and the PFAM database (http://pfam.xfam.org) to verify the presence of the LEA2 domain. After removing the redundant sequences and those with no functional domain or lacking the LEA2 domain, we obtained 157, 85 and 89 LEA2 proteins in G. hirsutum, G. arboreum and G. raimondii, respectively. The confirmed LEA2 proteins in the three cotton genomes were further analyzed for their functional domain attributes with the conserved domain database (CDD) tool hosted at NCBI. The results showed that the LEA2 proteins were members of the c112118 superfamily, with E values ranging from 0 to 0.008 (Supplementary Table S2), and that all contained a transmembrane domain (Supplementary Table S3). The association of the LEA2s with a transmembrane domain could explain why the LEA proteins are found in high concentrations in seeds at late stages of seed development, possibly to aid in maintaining the stability of the cell membrane in the dehydrated state. Similar results have been reported for some drought and salt tolerance-enhancing genes; for example, the Salicornia brachiata SNARE-like superfamily protein (SbSLSP) has been reported to be localized in the plasma membrane (Singh et al. 2016).
LEA2 proteins could play an integral role in maintaining a nonlethal level of reactive oxygen species (ROS homeostasis) in order to minimize oxidative damage to cellular membranes and macromolecules. In addition, LEA2s could play roles similar to those of the aquaporins, the water channel proteins responsible for regulating water movement through channels such as plasmodesmata and xylem vessels (Buckley 2015). Aquaporins (AQPs) have been associated with salt and drought stress tolerance in plants, and they share a similar functional domain with LEAs, being basically membrane proteins (Li et al. 2015a).
The numbers of LEA2 genes found in G. arboreum, G. raimondii and G. hirsutum were relatively higher than the numbers recorded in other plants. The entire repertoire of LEA proteins across the 8 LEA families outlined in Hundertmark and Hincha (2008) has been found to be 34 in rice (Wang et al. 2007), 30 in Chinese plum (Du et al. 2013), 27 in tomato (Cao and Li 2014), 53 in poplar (Lan et al. 2013) and 29 in potato (Charfeddine et al. 2015), which is far below the individual numbers of LEA2 in the three cotton genomes. The abundance of LEA2 genes in cotton could be due to their unique characteristic of being more hydrophobic than LEA2 proteins from other species, and/or they could have evolved much later than other transcription factors. The genome size of plants and animals is constant, and a high abundance of a particular gene family gives an indication of its integral role in enhancing the survival of the plant. Under ever-changing environmental conditions, plants are constantly faced with harsh conditions and are disadvantaged by their sessile nature. The survival of plants under these extreme conditions therefore depends on increasing the number of stress tolerance genes or on integrating more complex gene interactions to initiate adaptive response mechanisms aimed at increasing tolerance (Avramova 2015).
Phylogenetic analyses of LEA2 proteins in G. hirsutum, G. arboreum and G. raimondii
Phylogenetic tree analysis provides valuable knowledge on the lines of evolutionary descent of different genes or proteins from a common ancestor. Since its inception, it has remained a powerful tool for structuring classifications and biological diversity, and for providing insight into events that occurred during gene evolution (Gregory 2008). In this study, a total of 157, 85 and 89 LEA2 proteins were identified from G. hirsutum, G. arboreum and G. raimondii, respectively (Table 1). All the LEA2 proteins were aligned by the neighbor joining (NJ) method in ClustalW. The LEA2 proteins from upland cotton, G. arboreum, G. raimondii, A. thaliana, T. cacao and G. max were analyzed. A. thaliana, T. cacao and G. max were included in the analysis of the cotton LEA2s because Theobroma cacao shares ancestral origins with cotton, while A. thaliana and G. max have undergone whole genome duplication similar to that of cotton. The resulting phylogenetic tree showed that the cotton LEA genes tend to cluster together. Based on the clustering pattern, the LEA2 genes were subdivided into 6 groups: group 1 with three subgroups, group 2, group 3 with two subgroups, group 4, group 5 and group 6 with 5 subgroups. Groups 1, 2, 4 and 5 consisted entirely of LEA2 proteins from the three cotton genomes.
The LEA2s seem to have evolved later than the other LEA genes. In an analysis of the LEA genes in sweet orange, the LEA2 family had the most members among all 8 LEA families (Muniz Pedrosa et al. 2015), and this observation has been replicated in a number of plants. More than half of the phylogenetic tree was covered by the cotton LEA2 proteins, with no LEA2s from the other plants used in the analysis. Although Theobroma cacao is evolutionarily related to cotton, only a few of its LEA proteins clustered with those of cotton, while the majority of the LEA2 proteins from Theobroma cacao clustered together.
The late embryogenesis abundant (LEA2) proteins from A. thaliana were found to cluster with the cotton LEA2s in groups 3 and 6 (3-2 and 6-1), while the Glycine max LEA2 proteins were predominantly found in group 6-1 (Figure 1). No orthologous gene pairs were detected between the cotton LEA2 genes and those of any of the other plants used. All the orthologous gene pairs occurred between G. hirsutum and G. arboreum, G. hirsutum and G. raimondii, and G. arboreum and G. raimondii. Interestingly, even Theobroma cacao, which is evolutionarily related to Gossypium species, had its LEA2 proteins clustered together.
The abundance of LEA2s in plants can be explained either by their being the last members of the LEA genes to evolve, by duplication, or by both. Upland cotton is a tetraploid, having emerged through whole genome duplication (WGD) between the two diploid cottons of the A and D genomes. A high number of LEA2 genes has also been observed in Arabidopsis (Hundertmark and Hincha 2008). Therefore, we could infer that LEA2 proteins might have evolved after species divergence, and that the presence of orthologous genes in the cotton genomes could be due to the whole genome duplication event coupled with chromosome rearrangement. It is generally assumed that orthologous genes have the same biological functions in different species (Tatusov 1997), and that duplication makes room for paralogous gene pairs to evolve new functions (Ohno 1970). The LEA2 genes could form functionally oriented orthologous groups, consisting of orthologous pairs that play the same biological role in the three cotton genomes.
Physicochemical analysis, subcellular localization and amino acid composition of the LEA2 genes in upland cotton
In the analysis of the physicochemical properties of the LEA2 genes in upland cotton, the encoded proteins had varied molecular formulae, though with a similar elemental composition of carbon (C), hydrogen (H), oxygen (O), nitrogen (N) and sulfur (S) in varying proportions. Molecular weights ranged from 11.5384 to 73.5831 kDa, pI values from 4.63 to 10.35, aliphatic index from 19.78 to 65.4, instability index from 6.91 to 63.52, protein lengths from 100 to 661 amino acids, and grand average of hydropathy (GRAVY) values from 0.574 to 1.04. The GRAVY values showed that almost all the LEA2s are hydrophobic proteins. The hydrophobic nature of proteins is integral to their biological function, as it allows them to fold spontaneously into complex three-dimensional structures that are significant for biological activity (Gosline et al. 2002). Hydrophobicity drives the removal of nonpolar amino acids from the solvent and their burial in the core of the protein. This attribute is common among the aquaporins (AQPs), the water channel proteins, which are highly hydrophobic and known to have a functional role in water and salt stress tolerance in plants (Sreedharan et al. 2013). In the subcellular localization prediction, 10 different sites were detected, with the majority of the LEA2 proteins, 73 genes, predicted to be localized within the chloroplast. In further analysis with TargetP and Pprowler, more than 70% of the genes were found to be associated with the secretory pathway and the chloroplast (Table 2 and Supplementary Table S4). The high number of these genes in the chloroplast is consistent with a significant role in drought stress, since the chloroplast plays a central role in the plant response to stress (Gläßer et al. 2014).
The connections between different stress responses and organellar signaling pathways, such as those involving reactive oxygen species, emanate from the chloroplast (Kmiecik et al. 2016). Chloroplasts, being semi-autonomous organelles, provide a complex communication channel that allows effective coordination of gene expression, since most plastid-localized proteins are nuclear-encoded, thus ensuring effective functioning of overall cellular metabolism (Pfannschmidt et al. 2009). In addition to photosynthesis, numerous vital cellular processes, such as the biosynthesis of aromatic amino acids, fatty acids and carotenoids and the sulfate assimilation pathways, are harbored within the chloroplast, and these processes are known to be key factors in the plant response to stress. The chloroplast acts as a sensor of abiotic stress and thus initiates different cell functions in response to the stress factor, enhancing the adaptability of the plant to environmental stress (Mittler 2006). Substantial numbers of LEA2 genes were also predicted to be localized within the cytoplasm, nucleus and mitochondrion, with 24, 20 and 16 genes, respectively, which provided further evidence of the importance of these genes in enhancing drought tolerance in cotton. The following cell structures contained low numbers of LEA2 genes: endoplasmic reticulum (E.R) with 3, extracellular structures with 5, Golgi body with 6, plasma membrane with 4 and vacuole with 3 genes. The subcellular localization results for the LEA2 genes agree with previous findings in which the highest proportions of LEA2 genes were localized within the cytoplasm and chloroplast, accounting for 35.7% and 30.9% of the total LEA2 genes in sweet orange, while others were found to target the endoplasmic reticulum (E.R) and mitochondrion (Muniz Pedrosa et al. 2015).
Similarly, the abiotic stress-related plasma membrane protein 3 (PMP3) genes, which encode small hydrophobic polypeptides with high sequence similarity and have been functionally characterized as responsive to salt, drought, cold and abscisic acid, have been found to be localized in the nucleus, cytoplasm and cell membrane (Fu et al. 2012). The cellular compartmentalization of stress-related genes is fundamental to their functional role (Osman et al. 2009). The LEA2 proteins present in the chloroplast could be responsible for maintaining osmotic balance and suppressing the production of reactive oxygen species (ROS) in the guard cells, while those present in the membrane could be responsible for protecting membrane integrity (Guo et al. 2009). In addition, the LEA2 proteins embedded in channeling or transporter organelles, such as the endoplasmic reticulum, are likely to aid in ion sequestration (Porcel et al. 2005). Based on various findings, the LEA protein families are known to have a universal structure, with varying proportions of the various amino acids (Hong-bo et al. 2005). To examine the amino acid basis of the hydrophobic property of the LEA2 proteins, we analyzed their composition and found that the LEA2s are rich in nonpolar aliphatic amino acid residues, with the highest proportion noted for leucine (9.2%), followed by valine (8.2%), isoleucine (6.3%), alanine (5.9%) and, the least, proline (5.7%). The high proportions of nonpolar residues indicated that the LEA2 proteins are mainly embedded within the membrane; nonpolar amino acids are found in the center of water-soluble proteins, while polar amino acids are found at the surface (Petukhov et al. 1998).
The second most abundant were the polar, non-charged residues: serine (8.9%), threonine (6.4%), cysteine (1.9%), methionine (2.2%), asparagine (5.0%) and glutamine (3.4%). High proportions of polar residues have been found to be predominant among stress-related proteins such as the heat shock proteins (HSPs) (Wang et al. 2004). The presence of the polar residues therefore indicated that the LEA2 proteins could be responsible for coating cellular macromolecules with a cohesive water layer, in turn protecting the membrane and membrane-bound multiprotein complexes from unfolding and aggregation under drought stress.
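The amino acid proportions and GRAVY values discussed in this section can be reproduced in principle with a short calculation. The sketch below assumes the standard Kyte-Doolittle hydropathy scale, on which GRAVY is conventionally defined as the mean hydropathy over all residues. It is an illustration, not the ExPASy implementation used in the study. The function names are hypothetical, and the example sequence is motif 2 as reported in this paper.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition_percent(seq):
    """Percentage of each residue type present in the sequence."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Raises KeyError for non-standard letters (for example ambiguity codes),
    which would need to be handled before calling this function.
    """
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


motif2 = "WLVFRPKKPKFSLQSVTVYAL"  # motif 2 from the MEME analysis in this paper
comp = composition_percent(motif2)
print(f"Leu: {comp['L']:.1f}%  GRAVY: {gravy(motif2):.3f}")
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value an overall hydrophilic one.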

Genomic organization and motif detection of LEA2 proteins in cotton
The exon-intron structure of all the 157 LEA2 genes was analyzed using the gene structure displayer (http://gsds.cbi.pku.edu.cn/). A large percentage of the LEA2 genes and their exons were highly conserved within each group (Supplementary Figure S1). Most of the LEA2 genes were intronless, with 114 genes, about 73% of the LEA2s, lacking introns. The existence of introns in a genome is argued to place an enormous burden on the host (Wahl et al. 2009). The burden arises because introns require a spliceosome, which is among the largest molecular complexes in the cell, comprising 5 small nuclear RNAs and more than 150 proteins (Wahl et al. 2009). Intron transcription is costly in terms of time and energy (Lane and Martin 2010). Because of the various stresses to which plants are exposed, the energy demand for survival is relatively high, and gene actions within the plant have to function under a conserved energy demand threshold (Timperio et al. 2008). A plant under stress needs to survive the effects of excessive production of reactive oxygen species (ROS) and malondialdehyde (MDA) and of low peroxidase (POD) activity; therefore, most of the genes responsible for stress tolerance either lack introns or have a significantly reduced number of introns in their gene structure (Jeffares et al. 2008). The transcription of intron-laden genes requires a lot of time and energy, which is hypothesized to have a deleterious effect on gene expression (Calderwood et al. 2003). Conserved motifs in the 157 LEA2 proteins were identified with the online tool MEME (http://meme-suite.org/) (Supplementary Figure S1). The motif lengths identified by MEME in the LEA2 proteins were between 14 and 112 amino acids; similar conserved motif lengths, between 11 and 164 amino acids, were obtained for the cotton MYB proteins (He et al. 2016).
The similarity in motif lengths to those of the MYBs provided evidence supporting a possible role of the LEA2s in the response to water stress, which includes the regulation of stomatal movement, the control of suberin and cuticular wax synthesis and the regulation of flower development (He et al. 2016). Most of the LEA2 proteins had distinctive motifs, which are valuable for their identification. The common motifs identified for the cotton LEA proteins were motif 1 (FFVLFSVFSLILWGASRPQKPKITMKSIKFENFKIQAGSDFSGVPTDMITMNSTVKMTYRNTATFFGVHVTSTPLDLSYSQJTIASG), motif 2 (WLVFRPKKPKFSLQSVTVYAL), motif 3 (NFQVTVTARNPNKRIGIYYD), motif 5 (TVKNPNFGSFKYDNSTVSVNYRGKVVGEA) and motif 14 (RRRSCCCCCCLWTLJ) (Supplementary Figure S2).
The number of conserved motifs in each LEA2 protein varied between 1 and 7. The majority of close members in the phylogenetic tree exhibited common motif compositions, which suggests functional similarity within the same subgroup. The alignment of the LEA2 proteins showed various segments such as the Y-segment, K-segment and S-segment (Supplementary Figure S3), which have been previously described in dehydrins (Hanin et al. 2011). Other unique segments identified were the E, R and D segments. The K-segment has been found to form an amphipathic α-helix (Monera et al. 1995). The K-segment assumes an α-helical structure identical to the class A2 amphipathic α-helices mainly found in apolipoproteins, which facilitate the transport of water-insoluble lipids in plasma, and in α-synucleins (Rorat 2006). A change in protein conformation in turn leads to functional change (Dyson and Wright 2005). Drought stress alters the ambient microenvironment of proteins, leading to conformational and functional changes (Mahdieh et al. 2008). The amphipathic α-helices have the ability to interact with the dehydrated surfaces of various other proteins and biomembranes (Cornell and Taneva 2006). The binding of dehydrins to the dehydrated surface of other proteins enhances the formation of amphipathic α-helices, which protects those proteins from further loss of water. The presence of the K-segment in LEA2 points to a significant role for these proteins in plants during drought stress. It has been suggested that the protective role of the LEA proteins is due to their ability to form α-helices, which enables them to interact with other proteins and/or biomembranes (Koag 2003). Kovacs et al. (2008) reported the protective activities of two dehydrin proteins isolated from A. thaliana, early response to dehydration 10 (ERD10) and early response to dehydration 14 (ERD14), against thermal inactivation of alcohol dehydrogenase and thermal aggregation of citrate synthase.
Chromosomal location and duplication events of cotton LEA2 genes
A gene's location on a chromosome plays a significant role in shaping how an organism's traits vary and evolve (Lazazzera and Hughes 2015). Chromosomes hold thousands of genes, with some situated in the middle of their linear structure and others at either end (Bickmore and Van Steensel 2013). Therefore, to understand the distribution and mapping positions of the LEA2 genes, the position of each LEA2 gene was mapped on the A, D and AD cotton chromosomes by carrying out a homology search against the full-length G. arboreum (A genome), G. raimondii (D genome) and G. hirsutum (AD genome) assemblies. The LEA2 genes were mapped on all 26 chromosomes in G. hirsutum, 13 chromosomes in G. arboreum and 12 chromosomes in G. raimondii. In the diploid cotton genomes, G. arboreum and G. raimondii, the gene distribution pattern was almost identical to that of the tetraploid cotton (Supplementary Figure S4). In chromosome 9 of G. arboreum and its homologous chromosome in G. raimondii, a significant level of gene loss was observed: chr09 of G. arboreum contained only a single gene, compared to 10 genes in the homologous chr09 of G. raimondii. More interestingly, there was total gene loss in chr13 of G. raimondii. The lack of LEA2 genes on chr13 of G. raimondii can only be accounted for by gene loss or gene deletion, since LEA genes are otherwise found on every chromosome. The occurrence of LEA2 genes on every chromosome indicated that the genes are widely distributed across the entire cotton genome. However, the density of these loci was variable across the 26 chromosomes of upland cotton and the 13 chromosomes of the A and D diploid cottons. The largest number of genes were located on chromosomes At09 (chr09) and Dt09 (chr23), with 12 and 14 genes respectively, followed by chromosome Dt08 (chr24) with 10 genes, Dt06 (chr25) with 9 genes, and At07 and At12 with 12 genes each.
The chromosomes with the fewest loci carried 1 to 5 genes, with chromosomes At02, At05, At09, Dt02 (chr14) and Dt04 (chr22) having a single gene each (Supplementary Figure S5). A total of 39 genes could not be mapped to chromosomes and were thus grouped as scaffold. The distribution of the genes on the chromosomes appeared to be uneven. In general, the central sections of the chromosomes carried fewer LEA2 genes, and relatively high densities of upland cotton LEA2s were observed in the top and bottom sections of most chromosomes. A similar clustering pattern was observed for the GrMYB genes, most of which were clumped in either the upper or the lower regions of the chromosomes (He et al. 2016). Where a gene sits on a chromosome shapes how an organism's traits vary and evolve (Sexton and Cavalli 2015). It has been found that evolution is less a function of what a physical trait is and more a function of where the genes that affect that trait are located in the genome (Sexton and Cavalli 2015). The distribution of this subset of LEA genes across the whole cotton genome points to a significant role for these genes within the plant.
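The top/central/bottom description of gene density can be made explicit with a small tally. The sketch below is only an illustration: splitting each chromosome into equal thirds is an assumption (the text does not define the section boundaries), and the gene names, coordinates and chromosome length are hypothetical placeholders, not data from this study.

```python
from collections import Counter


def chromosome_section(position_bp, chromosome_length_bp):
    """Label a position by the third of the chromosome it falls in.

    Equal thirds are an assumed definition of "top", "central" and "bottom".
    """
    if not 0 <= position_bp <= chromosome_length_bp:
        raise ValueError("position lies outside the chromosome")
    fraction = position_bp / chromosome_length_bp
    if fraction < 1 / 3:
        return "top"
    if fraction < 2 / 3:
        return "central"
    return "bottom"


def section_counts(genes, chromosome_lengths):
    """Count genes per section; genes is a list of (gene_id, chromosome, start_bp)."""
    counts = Counter()
    for _gene_id, chromosome, start_bp in genes:
        counts[chromosome_section(start_bp, chromosome_lengths[chromosome])] += 1
    return counts


# Hypothetical coordinates, for illustration only.
example_lengths = {"At09": 90_000_000}
example_genes = [
    ("gene_a", "At09", 2_500_000),
    ("gene_b", "At09", 47_000_000),
    ("gene_c", "At09", 85_000_000),
    ("gene_d", "At09", 88_000_000),
]
example_counts = section_counts(example_genes, example_lengths)
print(dict(example_counts))
```

With real annotation coordinates, the same tally would show whether the terminal sections carry more LEA2 loci than the central section.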
The main cause of gene family expansion in a genome is either segmental or tandem duplication (Cannon et al. 2004). Two or more genes located on the same chromosome, one following the other, indicate a tandem duplication event, while gene duplication across different chromosomes is designated a segmental duplication event (Yu et al. 2005). In the present study, cluster formation by the LEA2 genes explained the mechanism behind their expansion in cotton. Most of the duplicated genes were between G. hirsutum and its ancestors, G. arboreum (53) and G. raimondii (11) (Table 3). The tetraploid cotton, G. hirsutum, evolved through whole genome duplication, resulting in polyploid cotton. The Ka/Ks values ranged from 0 to 2.17333, with an average of 0.4238. The majority of the gene pairs had Ka/Ks values of less than 1, which indicates that the LEA2 genes have been influenced extensively by purifying selection during their evolution.
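The two rules used in this paragraph can be written down compactly. The following is a minimal sketch, not the pipeline used in the study: the function names are illustrative, the adjacency requirement for tandem pairs is reduced to "same chromosome", and only the three Ka/Ks values quoted in the text (0, 0.4238 and 2.17333) are taken from the source.

```python
def selection_mode(ka_ks_ratio):
    """Interpret a Ka/Ks ratio using the conventional threshold of 1."""
    if ka_ks_ratio < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks_ratio < 1:
        return "purifying"
    if ka_ks_ratio > 1:
        return "positive"
    return "neutral"


def duplication_type(chromosome_a, chromosome_b):
    """Simplified rule from the text: same chromosome is tandem, otherwise segmental.

    A full analysis would also require tandem copies to be adjacent.
    """
    return "tandem" if chromosome_a == chromosome_b else "segmental"


# Minimum, mean and maximum Ka/Ks values reported in the text.
for ratio in (0.0, 0.4238, 2.17333):
    print(ratio, selection_mode(ratio))
```

The mean of 0.4238 falls well below 1, which is why the gene family as a whole is read as being under purifying selection even though the maximum value exceeds 1.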

Cis element prediction in LEA2 proteins
Transcription factors (TFs) and cis-acting regulatory elements contained in stress-responsive promoter regions function not only as molecular switches for gene expression, but also as terminal points of signal transduction in the signaling processes (Chang et al. 2008). The cis-regulatory elements are located upstream of genes and function as binding sites for transcription factors (TFs), which play essential roles in determining the tissue-specific or stress-responsive expression patterns of the genes (Yamaguchi-Shinozaki and Shinozaki 2005). To better understand the potential roles of the LEA2 genes, the 1000 bp regions upstream of the transcriptional start site were extracted and used to identify cis-regulatory elements and other important motifs. Abiotic stress-related cis-elements were found in the putative promoters of LEA2 genes in upland cotton, G. hirsutum (Figure 2 and Supplementary Table S5). For instance, MYBCORE is known to have a functional role in drought response and in the regulation of flavonoid biosynthesis (Solano et al. 1995). ABRELATERD1, an ABRE-like sequence, and ACGTATERD1 are responsive to dehydration (Simpson et al. 2003). ACGTATERD1 is associated with early response to dehydration (Simpson et al. 2003). The presence of these stress-related promoter elements strongly supports a possible role for upland cotton LEA2 proteins in enhancing drought tolerance in cotton. The high proportion of cis-regulatory elements in the promoters of LEA2 genes could explain why genes encoding LEA proteins are highly expressed under abiotic stress, as was found in the root tissues of Arabidopsis under drought stress (Dalal et al. 2009; Candat et al. 2014).
As noted above, these TFs and cis-acting regulatory elements act both as molecular switches for gene expression and as terminal points of signal transduction (Yamaguchi-Shinozaki and Shinozaki 2005).
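The promoter scan described in this section amounts to extracting a fixed window upstream of each gene and counting matches to known element consensus sequences. The sketch below illustrates that idea only; it is not the PLACE workflow itself. The consensus strings (ACGT for ACGTATERD1, ACGTG for ABRELATERD1 and CNGTTR for MYBCORE) are quoted from memory of the PLACE database and should be checked against it, and the function names and toy sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Assumed consensus sequences; verify against the PLACE entries before use.
ELEMENTS = {"ACGTATERD1": "ACGT", "ABRELATERD1": "ACGTG", "MYBCORE": "CNGTTR"}


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA string."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_region(chromosome_sequence, start_index, strand, length=1000):
    """Extract up to `length` bases upstream of a 0-based start index.

    For minus-strand genes the upstream region lies at higher coordinates
    and is returned as its reverse complement.
    """
    if strand == "+":
        return chromosome_sequence[max(0, start_index - length):start_index]
    end = min(len(chromosome_sequence), start_index + 1 + length)
    return reverse_complement(chromosome_sequence[start_index + 1:end])


def count_motif(sequence, consensus):
    """Count overlapping matches of an IUPAC consensus on the given strand."""
    pattern = "(?=" + "".join(IUPAC[base] for base in consensus) + ")"
    return len(re.findall(pattern, sequence))


# Toy promoter fragment, for illustration only.
toy_promoter = "TTACGTGACGTAACAGTTA"
for name, consensus in ELEMENTS.items():
    print(name, count_motif(toy_promoter, consensus))
```

Only the given strand is scanned here; palindromic cores such as ACGT would be counted twice if both strands were searched naively.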

Prediction of LEA genes targeted by miRNAs
Drought is a recurring climate feature in most parts of the world (Kang et al. 2009). Because plants are sessile, they have developed their own defense systems to cope with perennial and erratic adverse climatic conditions (Bartwal et al. 2013). One of the defense mechanisms plants use against drought stress is the reprogramming of gene expression by microRNAs (Ferdous et al. 2015). MicroRNAs (miRNAs) are small noncoding RNAs approximately 22 nucleotides in length. The miRNAs are mainly involved in the regulation of genes at the post-transcriptional level in a range of organisms (Grivna et al. 2006). Large groups of small RNAs have been reported as regulators of plant adaptation to abiotic stresses (Xie et al. 2015). To obtain more information on the functions of the LEA2 genes, we predicted miRNA targets among the LEA2 genes using psRNATarget, as has been applied to other functional genes in cotton (Dai and Zhao 2011). Out of the 157 upland cotton LEA2 genes, 63 genes were found to be targeted by 48 miRNAs, representing 40% of all the LEA2 genes (Supplementary Table S6). The most heavily targeted genes were each targeted by five to eight miRNAs. CotAD_00799 was targeted by ghr-miR2948-5p, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR7494 and ghr-miR7510b. CotAD_19205 was targeted by ghr-miR390a, ghr-miR390b, ghr-miR390c, ghr-miR7492a, ghr-miR7492b and ghr-miR7492c. CotAD_31936 was targeted by ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR827a, ghr-miR827b and ghr-miR827c. CotAD_32487 was targeted by ghr-miR156a, ghr-miR156b, ghr-miR156d, ghr-miR7507 and ghr-miR7509. CotAD_33143 was targeted by ghr-miR2948-5p, ghr-miR482a, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c and ghr-miR7510b. CotAD_41925 was targeted by ghr-miR396a, ghr-miR396b, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR827a, ghr-miR827b and ghr-miR827c. The rest of the genes were targeted by between 1 and 5 miRNAs.
The high number of miRNAs targeting LEA2 genes could be directly or indirectly related to their role in tolerance to abiotic stress, particularly drought. Some miRNAs targeted several genes, such as ghr-miR164 (4 genes), ghr-miR2949a-3p (4 genes), ghr-miR2950 (8 genes), ghr-miR7492a (10 genes), ghr-miR7492b (10 genes), ghr-miR7492c (10 genes), ghr-miR7504a (5 genes), ghr-miR7507 (5 genes), ghr-miR7510a (6 genes), ghr-miR7510b (10 genes), ghr-miR827b (4 genes) and ghr-miR827c (4 genes). It has been found that miRNAs might play a role in the response to drought and salinity stresses by targeting a series of stress-related genes. Plant-specific transcription factors such as the NAC gene family have varied functional roles in plant growth and development (Pereira-Santana et al. 2015), and myeloblastosis (MYB) transcription factors are highly correlated with various stress factors (Ambawat et al. 2013). The detection of some of the LEA2 genes as targets of specific miRNAs linked to mitogen-activated protein kinase (MAPK), NAC (NAM, ATAF1/2 and CUC2) and myeloblastosis (MYB) provided a stronger indication of the significant contribution of the LEA2s to enhancing drought tolerance in plants. Post-transcriptional processes mediated by micro/small RNAs have been linked to the response to water deficit. Plant miRNAs are involved in a complex array of processes, including but not limited to response to stress, nutrient limitation, development, pattern formation, flowering time, hormone regulation, and even self-regulation of the miRNA biogenesis pathway (Yamaguchi-Shinozaki and Shinozaki 2005). It is important to note that most miRNA target genes encode transcription factors, which places miRNAs at the focal point of gene regulatory networks. Moreover, the availability of a genome-wide characterization of cotton miRNA genes enabled us to predict the miRNA targets involved in drought response.
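The per-gene and per-miRNA counts reported in this section are simple tallies over gene-miRNA pairs. The sketch below rebuilds them for the six heavily targeted genes listed in the text; the dictionary is transcribed from those sentences rather than from Supplementary Table S6, so the per-miRNA counts cover only these six genes and will be lower than the full-table figures quoted above.

```python
from collections import Counter

# Gene-to-miRNA pairs transcribed from the text for the six most-targeted genes.
targets = {
    "CotAD_00799": ["ghr-miR2948-5p", "ghr-miR7492a", "ghr-miR7492b",
                    "ghr-miR7492c", "ghr-miR7494", "ghr-miR7510b"],
    "CotAD_19205": ["ghr-miR390a", "ghr-miR390b", "ghr-miR390c",
                    "ghr-miR7492a", "ghr-miR7492b", "ghr-miR7492c"],
    "CotAD_31936": ["ghr-miR7492a", "ghr-miR7492b", "ghr-miR7492c",
                    "ghr-miR827a", "ghr-miR827b", "ghr-miR827c"],
    "CotAD_32487": ["ghr-miR156a", "ghr-miR156b", "ghr-miR156d",
                    "ghr-miR7507", "ghr-miR7509"],
    "CotAD_33143": ["ghr-miR2948-5p", "ghr-miR482a", "ghr-miR7492a",
                    "ghr-miR7492b", "ghr-miR7492c", "ghr-miR7510b"],
    "CotAD_41925": ["ghr-miR396a", "ghr-miR396b", "ghr-miR7492a",
                    "ghr-miR7492b", "ghr-miR7492c", "ghr-miR827a",
                    "ghr-miR827b", "ghr-miR827c"],
}

# Number of miRNAs predicted to target each gene.
mirnas_per_gene = {gene: len(mirnas) for gene, mirnas in targets.items()}

# Number of these six genes targeted by each miRNA.
genes_per_mirna = Counter(mirna for mirnas in targets.values() for mirna in mirnas)

# Share of the 157 LEA2 genes with at least one predicted miRNA (63 genes).
percent_targeted = round(100 * 63 / 157, 1)

print(mirnas_per_gene)
print(genes_per_mirna.most_common(3))
print(percent_targeted)
```

Applied to the full prediction table, the same two tallies would reproduce the gene counts listed for each miRNA.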
Expression Patterns of LEA2 Genes in Different Tissues of Upland cotton as determined Through RNA sequence
Analysis of the RNA expression profile provides an indicator of the functional role of the genes in the plant. We therefore carried out the RNA sequence expression analysis (Figure 3). Based on their expression profiles, the genes were clustered into three broad groups. Group 1, with 29 genes, was highly up regulated under drought and salt conditions. Under salt and drought stress, CotAD_33321, CotAD_41571, CotAD_11876, CotAD_24498 and CotAD_59405 showed the highest expression levels. Similarly, CotAD_11876, CotAD_24498 and CotAD_59405 were significantly up regulated in all the tissues tested. A total of 23 genes were highly up regulated in 5 tissues, which provided strong evidence of the functional role of the LEA2 genes in enhancing stress tolerance in plants. The majority of the analyzed genes showed relatively low expression levels in the root tissues, but CotAD_11876, CotAD_59405 and CotAD_24498 exhibited significantly higher expression levels, with expression values of more than 2. Notably, the genes that were moderately up regulated in the roots exhibited significant up regulation in the calyx. The up regulation of these genes in the reproductive tissues could be an indication of a functional role in the fiber development process.
In the validation of the expression profile of the LEA2 genes under drought stress, CotAD_24498, CotAD_21924, CotAD_20020 and CotAD_59405 were highly up regulated in leaf, stem and root tissues. However, the expression levels were much higher in G. tomentosum than in G. hirsutum, suggesting that these could be the key genes.
qRT-PCR Expression profiling of the LEA2 genes in leaf, stem and roots of upland cotton
Based on the results obtained from the RNA sequence data, 48 genes were selected for qRT-PCR validation. Two cotton genotypes were used. The first, G. hirsutum, is an elite cultivar grown widely around the world; it covers more than 90% of the cotton growing regions in China but is susceptible to drought stress. The second, G. tomentosum, is a wild cotton native to the Hawaiian islands and is known for its high ability to tolerate salinity and drought stress. The two cotton plants were grown in the greenhouse and, at the three-leaf stage, were exposed to drought for a period of 14 days. The roots, stems and leaves were harvested for RNA extraction and qRT-PCR analysis. The qRT-PCR profiling of the various tissues indicated high variability in transcript abundance of LEA2 genes in upland cotton (Figure 4). In G. tomentosum and G. hirsutum, the majority of these genes showed relatively high expression in the root and leaf, but not in the stem. Leaves and roots are the main plant organs affected by drought stress (Alexandersson et al. 2005). The leaf is the site of photosynthesis, and drought stress may cause excess release of reactive oxygen species (ROS). ROS are toxic to plants, so the genes with high expression in the leaves could be involved in the removal of ROS, thus preventing damage and maintaining the normal functions of the photosynthetic cells. The high osmotic potential generated in the cytoplasm of guard cells during stomatal opening could lead to accumulation of LEA2s in leaf tissue. Increased osmotic potential within the guard cells necessitates mass flow of water into the guard cells, leading to their turgidity and thus opening of the stomatal pore; during drought stress, however, the osmotic potential is never offset, which imposes dehydration stress on the nucleus.
The increased accumulation of LEA2s within the leaf tissues could serve to maintain structural integrity and protect the membranes from dehydration stress. This finding is consistent with the proposed function of the LEA genes, which is a protective role during abiotic stress (Nylander et al. 2001). The roots are the connection point between the water reservoir and the plant. The high up regulation of LEA2 genes in the roots indicated that these genes could be involved in water balance in the roots. It further supports the primary, protective role of LEA genes in plants, since roots are the very first plant organs to be affected by drought stress.
Expression profiles of LEA2 genes Under drought treatment in G. hirsutum and G. tomentosum
Gene expression profiles provide vital information on the roles played by genes in plants (Movahedi et al. 2012). In order to determine the expression pattern of the LEA2 genes in tolerant and non-tolerant cotton genotypes, we carried out qRT-PCR validation of 48 LEA2 genes in leaf, root and stem tissues. The 48 genes were selected based on the RNA sequence expression profile; 24 genes were up regulated while the other half were down regulated. The samples for qRT-PCR were collected on the 0, 7th and 14th day of stress exposure, with day 0 (control) used as the reference point. More genes were up regulated in all the tissues of the drought tolerant genotype, G. tomentosum, than in the drought sensitive genotype, G. hirsutum (Figure 5). This result indicates that the drought resistant genotype has the potential to mobilize more drought related genes when exposed to drought stress than the less tolerant genotype, hence the higher expression levels. Similar results were obtained for the expression of cold tolerance genes in Arabidopsis accessions with varying tolerance levels, in which more genes were up regulated in the cold tolerant than in the cold susceptible genotype (Hannah et al. 2006).

Figure 2 Average number of cis-elements in the promoter region of upland cotton, G. hirsutum, LEA2 genes. The cis-elements were analyzed in the 1 kb promoter region upstream of the translation start site using the PLACE database.
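Expression relative to the day 0 control is commonly computed from qRT-PCR cycle threshold (Ct) values with the 2^-ΔΔCt method. This section does not state which quantification method or reference gene was used for the cotton tissues, so the sketch below is an assumption offered for illustration; it also assumes roughly 100% amplification efficiency, and the Ct values are hypothetical.

```python
def relative_expression(ct_target_stress, ct_reference_stress,
                        ct_target_control, ct_reference_control):
    """Fold change of a target gene versus control by the 2^-ddCt method.

    Each Ct is normalized to a reference gene in the same sample, then the
    stressed sample is compared with the control (day 0) sample.
    """
    delta_ct_stress = ct_target_stress - ct_reference_stress
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_stress - delta_ct_control
    return 2 ** (-delta_delta_ct)


def regulation(fold_change):
    """Label a fold change as up or down regulated relative to control."""
    if fold_change > 1:
        return "up regulated"
    if fold_change < 1:
        return "down regulated"
    return "unchanged"


# Hypothetical Ct values: target gene at day 7 versus day 0.
fold = relative_expression(ct_target_stress=22.0, ct_reference_stress=18.0,
                           ct_target_control=24.0, ct_reference_control=18.0)
print(fold, regulation(fold))
```

A lower normalized Ct under stress than at day 0 gives a fold change above 1, which is what "up regulated" means in the comparisons between genotypes.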
The up regulation of LEA2 genes under drought stress could explain their protective role in plant tissues under dehydration stress. For instance, HVA1, a LEA gene from barley (Hordeum vulgare L.), was found to confer drought stress tolerance in transgenic rice (Babu et al. 2004). Interestingly, some phylogenetic LEA2 gene pairs (orthologous genes) were found to have differential expression patterns in the two cotton genotypes (Figure 6). For instance, CotAD_71431 and CotAD_51205 exhibited varied expression patterns under drought and salt stress conditions, as was evident in the RNA expression analysis. This result suggests that even though these genes cluster together, they could have developed different biological functions over time. Orthologous genes share a common evolutionary origin and a high percentage of sequence similarity (Nehrt et al. 2011). Given the expression pattern of LEA2 genes in different tissues, it would be interesting to functionally characterize these genes in upland cotton, G. hirsutum. The majority of the LEA2 genes showed higher expression levels in leaf and root tissues, which indicated functional conservation within the gene subfamily. The variation in expression between G. hirsutum and G. tomentosum could be due to broad changes in environmental conditions; G. tomentosum exhibits divergence signals that are associated with directionally selected traits and are functionally related to stress responses. These results suggest that stress adaptation in G. tomentosum might have involved the evolution of protein-coding sequences, and thus these genes could be introgressed into elite upland cotton in order to boost its performance in the face of declining fresh water and precipitation.
qRT-PCR Analysis of the Transformed Gene in Upland Cotton Tissues
Based on the expression analysis of the LEA2 genes in the various tissues of G. tomentosum (drought tolerant) and G. hirsutum (drought susceptible), we identified a single gene with significant expression in the various tissues and transformed it into the model plant, A. thaliana (ecotype Columbia-0). The gene CotAD_24498 was analyzed in various tissues of upland cotton, G. hirsutum, in order to determine its relative abundance within the plant. We found that the gene was most abundantly expressed in the reproductive tissues, specifically in the petal and stamen (Figure 7A). In addition, we subjected cotton seedlings at the three true leaf stage to drought stress (15% PEG6000); samples for RNA extraction and qRT-PCR analysis were obtained from leaf, root and stem at 0 h, 3 h, 6 h, 12 h and 24 h post stress treatment. In all three tissues, 6 h marked the peak up regulation of the gene, after which a gradual decline was observed with increasing time of stress exposure. The gene exhibited significant up regulation in the root compared to the leaf and stem tissues (Figure 7B). We successfully generated nine lines overexpressing CotAD_24498 (Figure 7C). Of the nine lines, three showed the highest level of overexpression and were used to investigate the potential of the gene in the transgenic lines under drought stress conditions (Figure 7D).

Overexpression of CotAD_24498 in plants promotes root growth and confers tolerance to drought stress
Increased primary root growth and overall plant fresh biomass are indicators of tolerance to the various abiotic stresses to which plants are exposed (Verslues et al. 2006; Jisha et al. 2013).
We sought to investigate the response of the transgenic lines and the wild type to drought stress in terms of primary root elongation and fresh biomass accumulation. The transgenic lines showed enhanced performance, with relatively increased primary root growth and higher fresh biomass than the wild type under drought stress. The drought stress was imposed by exposing the plants to different concentrations of mannitol (0 mM, 100 mM, 200 mM and 300 mM) for a period of six (6) days. Under osmotic stress, the greatest root length and fresh biomass accumulation were observed at 100 mM mannitol (Figure 8B). The transgenic lines had significantly higher primary root length and fresh biomass accumulation (Figure 8C), an indication that, in contrast to the wild type, their photosynthetic processes were not impaired by the drought stress.

Transcripts Investigation of Drought Stress-Responsive Genes
The root appears to be the most relevant organ for breeding drought stress tolerance (Henry 2013). Underlying the ABA-mediated stress responses is the transcriptional regulation of stress-responsive gene expression (Giraudat et al. 1994). Numerous genes have been reported to be up-regulated under stress conditions in vegetative tissues. These include the class of genes known as LEA genes, which are expressed abundantly in developing seed under normal conditions, osmolyte biosynthetic genes, and genes of general cellular metabolism. We checked the expression of two known abiotic stress responsive genes in the transgenic lines (L2, L3 and L4) and the wild type when the plants were exposed to drought. The stress responsive genes were highly up-regulated in the transgenic lines compared to the wild type (Figure 9). This result agrees with the result obtained when the various LEA2 genes were analyzed through qRT-PCR on the tissues of the two cotton genotypes, in which more genes were up regulated in the various tissues of the more tolerant genotype than in the less tolerant one. Constitutive expression of RD29A and ABF4 demonstrated enhanced drought tolerance in the transgenic Arabidopsis plants.

Oxidants and antioxidant determination in the transgenic lines
In order to understand the role of the transformed LEA2 gene in the transgenic lines in relation to drought stress, we measured various oxidants and antioxidants in the leaves of the transgenic lines and the wild type. The levels of oxidants were significantly reduced in the transgenic lines compared to the wild type (Figure 10A-B). When plants are exposed to drought, the level of ROS increases, which results in oxidative stress. MDA concentration provides a measure of the damage caused to membrane lipids by oxidative stress (Jain et al. 2001). The significant reduction in MDA and H2O2 in the leaf tissues of the transgenic lines showed that the transformed gene had a regulatory role in controlling various biological pathways geared toward detoxification of reactive oxygen species in the cells. In addition, we quantified the levels of the antioxidants SOD, POD and CAT. All three antioxidants were significantly increased in the transgenic lines (L1, L2 and L3) compared to the wild type (Figure 10C-D). The increased levels of the antioxidants showed that the transgenic lines had a higher ability to tolerate drought stress than the wild type. The results obtained in this research agree with previous findings, in which drought stressed wheat plants were found to accumulate higher levels of oxidants (Luna et al. 2005). More tolerant plant genotypes are able to induce more of the antioxidants, such as CAT, POD and SOD, in order to scavenge the excess ROS and other deleterious molecules released by the cells under stress (Bian and Jiang 2009).

Conclusions
In this study, the identification, phylogenetic relationships, miRNA targets, cis promoter analysis, GO functional annotation and exon/intron structures of LEA2 gene family members were evaluated in upland cotton, Gossypium hirsutum, and the tissue expression patterns in two tetraploid cotton species, G. hirsutum (drought sensitive) and G. tomentosum (drought tolerant), were determined under drought stress. The abundance of LEA2 genes and the unique gene structure reported in this work provide a solid foundation for future research to understand the evolution of the LEA2 gene family and the potential functional role of the 157 LEA2 genes in plants under drought stress. Since the discovery of LEA genes, little work has been reported on LEA genes as a whole in upland cotton. The transformation and expression analysis of the transformed LEA2 gene indicated that the LEA2 genes have a profound role in enhancing drought stress tolerance. The transgenic lines L2, L3 and L4 exhibited superior performance compared to the wild type. Their roots were significantly longer than those of the wild type under drought stress; similarly, the levels of oxidants in the leaves were significantly reduced while the antioxidant levels were higher in the leaves of the transgenic lines compared to the wild type. This indicates that the transgenic plants had a higher capacity to regulate oxidative stress than the wild type (WT). The genes could be promoting growth of the root cells under limited water conditions. Primary root growth is linked to drought stress tolerance because the increased surface area of the roots improves the plant's ability to absorb whatever little moisture is available. Deep or extensive root growth is a trait known for most xerophytic plants (Brunner et al. 2015).

Expression levels of drought stress-responsive genes (ABF4 and RD29A) in transgenic lines and wild type. Arabidopsis ACTIN2 was used as the reference gene; values are means ± SD. *P < 0.05 as calculated by Student's t-test.