About the Authors:
Pengcheng Li
Contributed equally to this work with: Pengcheng Li, Joy Nyangasi Kirungu, Hejun Lu, Richard Odongo Magwanga
Roles Conceptualization, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing
Affiliations State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China, School of Life Science, Henan University/State Key Laboratory of Cotton Biology/Henan Key Laboratory of Plant Stress Biology, Kaifeng, Henan, China
Joy Nyangasi Kirungu
Contributed equally to this work with: Pengcheng Li, Joy Nyangasi Kirungu, Hejun Lu, Richard Odongo Magwanga
Roles Conceptualization, Data curation, Formal analysis, Investigation, Software, Writing – original draft, Writing – review & editing
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
ORCID logo http://orcid.org/0000-0002-9299-1176
Hejun Lu
Contributed equally to this work with: Pengcheng Li, Joy Nyangasi Kirungu, Hejun Lu, Richard Odongo Magwanga
Roles Conceptualization, Investigation
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Richard Odongo Magwanga
Contributed equally to this work with: Pengcheng Li, Joy Nyangasi Kirungu, Hejun Lu, Richard Odongo Magwanga
Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
Affiliations State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China, School of Biological and Physical Sciences (SBPS), Jaramogi Oginga Odinga University of Science and Technology (JOOUST), Bondo- Kenya
ORCID logo http://orcid.org/0000-0002-1900-5798
Pu Lu
Roles Conceptualization, Methodology, Writing – original draft
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Xiaoyan Cai
Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Resources, Supervision, Writing – original draft, Writing – review & editing
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Zhongli Zhou
Roles Funding acquisition, Project administration, Resources, Software
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Xingxing Wang
Roles Methodology, Project administration, Resources
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Yuqing Hou
Roles Investigation, Project administration, Writing – review & editing
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Yuhong Wang
Roles Investigation, Project administration, Resources
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Yanchao Xu
Roles Conceptualization, Writing – original draft
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Renhai Peng
Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing
Affiliation: Biological and Food Engineering, Anyang Institute of technology, Anyang, Henan, China
Yingfan Cai
Roles Conceptualization, Resources, Software, Supervision, Validation, Visualization
Affiliation: School of Life Science, Henan University/State Key Laboratory of Cotton Biology/Henan Key Laboratory of Plant Stress Biology, Kaifeng, Henan, China
Yun Zhou
Roles Conceptualization, Resources, Software, Supervision, Validation, Visualization
* E-mail: [email protected] (YZ); [email protected] (KW); [email protected] (FL)
Affiliation: School of Life Science, Henan University/State Key Laboratory of Cotton Biology/Henan Key Laboratory of Plant Stress Biology, Kaifeng, Henan, China
Kunbo Wang
Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration
* E-mail: [email protected] (YZ); [email protected] (KW); [email protected] (FL)
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
Fang Liu
Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Visualization, Writing – original draft
* E-mail: [email protected] (YZ); [email protected] (KW); [email protected] (FL)
Affiliation: State Key Laboratory of Cotton Biology /Institute of Cotton Research, Chinese Academy of Agricultural Science, Anyang, Henan, China
ORCID logo http://orcid.org/0000-0002-6213-9572
Introduction
Cotton (Gossypium spp.) is the most important fiber crop and one of the sources for animal feeds and edible oil. However cotton growing has significantly been threatened by various abiotic and biotic stresses, this condition has been worsened by intensive selection and inbreeding resulting into the narrow genetic base [1]. Improving cotton yield and fiber quality is important for the survival of the cotton industry [2]. Serious outbreaks of diseases and pests have resulted in great loss in fibre production and its quality.
Researchers are facing difficulties in developing new varieties of cotton to meet emerging challenges, this is because of the limited diversity in the Germplasm of the commercially cultivated, upland cotton [3]. Wild cotton species have rich reservoir of genetic material, much of which has potential and valuable agronomic traits [4]. Some of the useful alleles introgressed into elite cultivars have been achieved through interspecific hybridization for instance, improvement in fibre quality where, long fibre length upland cotton, G. hirsutum was achieved through tri-interspecific hybridization between G. thurberi, G. raimondii and G. barbadense [5]. Drought resistance in upland cotton has been achieved through the utilization of alleles from Asiatic Cottons [6]. Production of fertile hybrid germplasm with diploid Australian Gossypium species has been achieved [7].
Cotton are of two types, the diploid and the tetraploid species, the diploid have 13 chromosomes, n = 2n = 26, while the tetraploid cotton emerged due to whole genome duplication of the two diploid parental lines, resulting into 2n = 4n = 52 chromosomes [8]. The diploid cotton are subdivided into A,B, C, D,E,F,G and K genomes, on the other hand, there are 7 known species of AD genome [9]. Among the diploid cotton genomes, D genome has been found to harbor high number of significant agronomic traits, such as superior fiber qualities, and high tolerance to both biotic and abiotic stresses [10,11]. In a number studies done on the tetraploid cotton, more on quantitative trait loci (QTL) mapping, high number of QTLSs have been found to be mapped on the Dt_sub genome compared to At_sub genome [12–15]. This explains the significance on the D genome and ability to utilize the genes in D cotton species to improve the elite cotton cultivars, which have been has narrow genetic base due to intensive selection and inbreeding [16–18]. The two wild cotton species, G. thurberi and G. trilobum belong to the diploid cotton of the D genome. G. thurberi Todaro (D8) is a wild cotton species, native to Mexico in the Sonora Desert and parts of the southwestern part of the United States of America (USA). G. thurberi has good characteristics that can be introgressed into elite cultivars such as fibre fineness, fibre strength, long fibre, prolific boll bearing, resistance to Fusarium wilt, resistance to frost and cotton bollworms [19]. In addition, G. thurberi has been found to be highly resistant to silver leaf whitefly [20]. The second parental line used in this research, G. trilobum (D1) is an endemic species of West and central Mexico. It has glabrous leaves which is a key character for its identification, it has important agronomic traits such as resistance to Verticillium wilt and drought tolerance [21].
The application of genetic maps between interspecific crosses in cotton, have become vital tools in understanding the genome structure, exploring important agronomic traits and also provide the basis for finding new DNA markers for further construction of high density maps [22]. Currently there are limited numbers of genetic maps that have been constructed from interspecific crosses between the wild progenies of the D cotton genome. The use of Simple sequence repeats (SSRs) are considered to be one of the markers of choice for genome mapping, because they are PCR-based, co-dominancy, multiallelic and hyper-variable in nature [23]. SSR markers, derived from either genomic region or expressed sequence tags (EST), are considered to be essential in the construction of genetic maps. In addition, EST-SSR markers have been extensively used in unraveling the complexities of eukaryotic organisms genomes being that they are directly tagged to the functional genes [24]
In this study we developed an F2generation between two wild cotton species in the D genome; G. thurberi and G. trilobum. We applied the use of mono-markers, SWU simple sequence repeat (SSR) in genotyping 188 individuals of the F2generation. The developed genotypes were applied in the construction of the genetic map; the map enabled us to unearth some of the vital transcriptome factors with profound effect on fiber development. The linkage map and the genes mined will provide a basis in genetic studies such as Marker-assisted selection (MAS) and gene transformation.
Materials and methods
Parental materials
G. thurberi as female parent was crossed with G. trilobum as male parent to obtain F1 generation. The F1 generation was then self-pollinated to get the F2individuals. A total of 274 F2were obtained through F1 self-crossing. From the F2progenies, 188 individuals were randomly selected for genotypic analysis with the polymorphic markers. The two parental materials and the F2 progenies were developed at the National Wild Cotton Nursery in Sanya, Hainan Island, China.
DNA extraction, quantification and electrophoresis
The leaves from the parents, F1 individual and F2progenies were collected and stored in the fridge at -80°C. DNA extraction was done following the CTAB method [25]. DNA quantification and purification was then done to determine the concentration and level of RNA contamination using the Nanodrop techniques, Spectrophotometer was used for quantification and quality checking depending on A260/A280 [26]. Concentration of genomic DNA was estimated by comparing the size and intensity of each sample band with those of sizing standard, DNA mass ladder. We then diluted the sample according to each sample concentration until it was within the working concentration range; the DNA working concentration was based on 10–100 μg/μl. The polymerase chain reaction (PCR) amplification on the reagents was conducted using TAKARA Bio Inc TP 600 thermal cycler. Electrophoresis was performed on the PCR product following the method described by [25] with minor modifications. The amplified PCR products were separated on 8% denaturing polyacrylamide gel and visualized by silver nitrate staining [27].
Application of SSR markers genotyping the F2progenies derived from the two diploid parental lines
We employed the use of expressed sequence tag-simple sequence repeat (EST-SSR) mono-markers developed by South West University, China thus the acronym SWU. The SWU markers were developed from G. raimondii genome. A total of 12,650 markers were screened for polymorphism, out of this we obtained 996 polymorphic loci which were used to genotype 188 F2 individuals, the Details of the SWU markers, forward and reverse sequence are summarized in (S1 Table). The male plant G. trilobum, the female plant G. thurberi and the heterozygous F2 progenies were scored as A, B, and H respectively, The missing data was designated as‘-’.Multi-allelic markers were named separately by primer name followed by the letters a, b, c, and d as a suffix.
Genetic map construction
We employed the use of Join Map 4.0 with a recombination frequency of 0.40 and a LOD score of 2.5 for the formation of linkage groups [28]. Linkage groups were assigned to chromosomes depending on blast searches on the markers since these markers are newly developed. The linkage groups were then drawn using Mapchart 2.3 Software [29], A Chi-square (χ2) test was performed to determine whether the markers significantly deviated from Mendelian segregation ratios The markers showing segregation distortion were indicated by asterisks (*P<0.05, **P<0.01, ***P<0.00, ****P<0.001, *****P<0.0005, ******P<0.0001, *******P<0.00005. The markers that deviated significantly from the normal Mendelian ratio of 3:1 for dominant markers and 1:2:1 for codominant markers were termed to be segregated distorted markers and were used to determine segregation distortion in the linkage groups [30].
Gene mining, protein characterization and GO functional annotation
The physical positions of the flanking markers were employed in mining the genes. The SSR marker sequences were used as the query by blasting in to the reference genome, G. raimondii genome assembly, being the markers were developed from G. raimondii. By employing the physical position and use of cotton genome database, all the genes were obtained per each chromosome. The method adopted was similar to previous method employed by Magwanga et al [11] in obtaining the conserved genes between two tetraploid cotton, G, hirsutum and G. tomentosum. Further analysis were carried out on the various genes mined in order to determine the characteristics of the proteins encoding the mined genes and their putative roles in cotton through GO annotation, which was carried out through BLAST2GO [31]. Furthermore, the isoelectric points (pI), grand hydropathy values (GRAVY), charge and molecular masses of the proteins encoding the mined genes were estimated by ExPASy Server tool (http://web.expasy.org/compute_pi/).
RNA expression analysis and RT-qPCR validation of the highly upregulated genes
We obtained the RNA sequence data for the genes mined within the SDR regions. The RNA-seq data was obtained from the Cotton Functional Genomics Database (https://cottonfgd.org/). The RNA sequenced data were for the reference genome, G. raimondii profiled at different stages of fiber development. The Raw RNA seq data were transformed into log 2, and used in the construction of heatmap. Furthermore, we selected 50 highly upregulated genes, and carried out RT-qPCR analysis in order to validate the possible role of these mined genes in fiber development using their gene specific primers (S2 Table). The two parental lines flowers were tagged and samples harvested at 0, 5, 25 and 30 DPA for real time quantitative polymerase chain reaction (RT-qPCR). The RT-qPCR analysis was carried out as outlined by Magwanga et al [1], cotton GrActin with forward sequence “ATCCTCCGTCTAGACCTTG” and reverse sequence “TGTCCATCAGGCAACTCAT” was used as the reference gene
Collinearity analysis
A BLASTN Search with E ≤ 1 × 10−5, identity ≥ 80%, and matched length ≥ 200 bp was applied (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The SSR sequences were used as queries, the genome assemblies of (AD)1 [32] genome and (AD)2 [33] were used in collinearity analysis. Markers with the best hits were chosen and the Circos program (http://circos.ca) was applied to draw the circos maps.
Results
Parental polymorphism
The SWU primers used were 12,560 in total and were used for screening for interspecific polymorphism between the two parental lines, G. trilobum, G. thurberi and their F1 generation. A total of 994 markers were obtained as polymorphic which accounted for only 7.91% of all the markers screened. A total of 132 (13.3%) markers were scored as dominant markers while 862 (86.7%) markers were scored as codominant. In our study we noted that the rate of polymorphism in the eSSRs markers used was lower this could be due to DNA sequences conserved at transcribed regions [34]. Low levels of polymorphism have been reported in other plants for instance in peanut (6.8%) polymorphism was detected among the eSSRs, [35], maize (1.4%), rice (4.7%), sorghum (3.6%), wheat (3.2%) [36], in Gossypium species lower polymorphic rate of the eSSRs have been reported [37–39]. However the eSSRs remain to be the markers of choice due to their ability to detect incomplete dominance inheritance, cost less and having a good genomic coverage despite the lower polymorphism observed in some plant [40].
Genetic map construction and determination of the segregation distortion regions (SDRs)
All the polymorphic markers used in the genotyping of the F2progenies were successfully scored and utilized in the construction of the linkage map by the use of JoinMap. A total of 849 out 994 polymorphic markers were linked and distributed across the entire 13 chromosomes of the D genome (Fig 1.). The details for the SSR markers and the alleles scores used for the construction of the genetic map are shown in (S3 Table). The distribution of markers on the linkages was symmetric and there was no clustering of loci, 145 loci were not linked due to high distortions. The genetic map size generated was 1,012.458 cM with an average marker distance of 1.193 cM. The chromosome with the highest marker loci density was Chr09 with 93 (11%) markers followed by Chr05 with 89 (10.5%) markers while Chr02 had the least number of markers loci with only 21 markers. The largest gap between adjacent loci was observed on Chr01 covering 15.699 cM while Chr04, Chr08, Chr10 and Chr11 had the smallest gap of 0.001 cM (Table 1). The longest chromosome was Chr12 spanning a distance of 103.563 cM while the shortest chromosome was Chr02 with a map distance of 28.665cM. A total of 714 loci (84.216%) accorded with the Mendelian ratio while 135 (15.783%) deviated from Mendelian ratio, chromosomes with the highest number of distorted loci were Chr07 and Chr11 with 23 distorted loci each while Chr12 had the least number of distorted loci with only 2 distorted loci (Table 1). Some regions on linkage groups had large clustered segregation distortion loci (SDLs); these regions were referred as segregation regions (SDR’S). A total of 8 SDR regions were noted on Chr01, Chr02, Chr06, Chr09, Chr10 and Chr11 each had a single SDR, designated as SDR1, SDR2, SDR6, SDR9, SDR10, and SDR11 respectively while Chr07 had two SDR’S namely SDR7-1, SDR7-2. Large clusters of segregated distorted loci on these regions were observed on Chr02, Chr06, Chr07 and Chr11. The largest SDR’s were skewed towards the heterozygous allele (Table 2).
[Figure omitted. See PDF.]
Fig 1. Genetic maps for the 13 chromosomes of the F2interspecific individuals derived between G. thurberi and G. trilobum.
The markers in blue are distorted while markers in red and underlined indicates the distorted regions per chromosomes.
https://doi.org/10.1371/journal.pone.0207271.g001
[Figure omitted. See PDF.]
Table 1. Characteristics of the genetic map.
https://doi.org/10.1371/journal.pone.0207271.t001
[Figure omitted. See PDF.]
Table 2. Analysis of the protein structure and domain features of the genes located within the SDRs.
https://doi.org/10.1371/journal.pone.0207271.t002
Gene mining, protein characterization and Gene Ontology (GO) functional annotations of the mined genes
We conducted a blast search at regions up and down stream of 20 Kb of each SSR location using the total 846 SSR markers sequences that were extracted from D5 genome. 2,496 genes were identified. The genes were mapped in all the chromosomes the chromosome with the highest number of genes was Chr09 with 316 genes followed by Chr11with 257 genes while Chr02 had the least number of genes with only 48 genes. The genes were characterized for their physiochemical properties. The grand average hydropathy (GRAVY) values ranged between -2.335 and 1.654, their molecular weight ranged between 5.935 and 437.729 kDa, their charge ranged from -170.5 to 66 while the Isoelectric Point (pI) ranged from 3.435 to 12.839, there were 2049 genes that were hydrophilic while only 446 were hydrophobic as shown by their GRAVY values (S4 Table).
From the GO blast analysis, all the three GO terms were detected, in which the highest being the cellular component with 9 functions, while the least was the molecular component with 6 functions (Fig 2). In cellular component (CC), the following genes were found to harbor critical functions, Gorai.009G374600, Gorai.012G141400, Gorai.007G347200, Gorai.001G121600, Gorai.011G137100, Gorai.010G012200, Gorai.011G141200, Gorai.007G353300 and Gorai.007G221800. The CC functions detected were; nucleus (GO: 0005634), chloroplast (GO: 0009507), membrane (GO: 0016020), integral to membrane (GO: 0016021), integral to Golgi membrane (GO: 0030173) and SAGA-type complex (GO: 0070461).
[Figure omitted. See PDF.]
Fig 2. GO annotation results from the mined genes, the highest being cellular component with 9 functions.
https://doi.org/10.1371/journal.pone.0207271.g002
The integrity of cell membrane and cell membranous is significant for normal functioning of the plant, when plants are exposed to any form of stress, the excessive production of reactive oxygen species (ROS), do degrades the cell membrane thus affecting the normal osmotic balance within the cell, which eventually lead to cell death [41,42]. In molecular functions (MF), 60 genes were found to be involved, with 27 different molecular functions, such as transferase activity, transferring phosphorus-containing groups (GO: 0016772), calcium ion binding (GO: 0005509), protein binding (GO: 0005515), transferase activity, transferring acyl groups other than amino-acyl groups among others (GO: 0016747), among others. In the determination of a gene with higher contributory role in cotton fiber development, a gene in ligon lintless-1 gene (Li1 gene) was found to harbor various molecular functions such as transferring acyl groups other than amino-acyl groups among others (GO: 0016747), which has been found to play an important role in cotton fiber elongation [43].
Finally, the GO functions detected to be involved in biological processes were 15, which included functions such as oxidation-reduction process (GO: 0055114), translational elongation (GO: 0006414), folic acid-containing compound metabolic process (GO: 0006760), response to light stimulus (GO: 0009416), regulation of transcription, DNA-dependent (GO: 0006355), microtubule-based movement (GO: 0007018) among others. Oxidation-reduction process, is important in plants responses to stress conditions [44], thus, the detection of this biological function among the genes obtained within the SDRs perhaps indicates that, these genes could be having a stress responsiveness functions in enhancing plants survival under abiotic stress conditions. Detailed information on the GO functions and the genes involved are summarized in (S5 Table). In the identification and characterization of the late embryogenesis abundant proteins (LEA) in cotton, Magwanga et al [1], found that integral to membrane (GO: 0016021), was detected for over 95% of the LEA genes, and this he postulated to have a functional role in maintaining the cell membrane integrity. Moreover, in the analysis of the genes which could been introgressed into the backcross population, BC2F2 developed from G. tomentosum a drought and salt resistant donor parent and G. hirsutum a high yielding tetraploid cotton but more susceptible to various forms of abiotic stress [18], revealed several GO functions, some which have been detected for the genes obtained within the SDRs in this study, an indication that these genes could be playing an important role in the plant.
Analysis of the structure of the genes within the SDR
We undertook to analyze the structures of the genes found within the segregation distortion regions as obtained for chr1, chr2, chr6, chr7, chr9, chr10 and chr11 with 9, 35, 16, 33, 10, 12 and 30 genes, respectively. Out of all the genes within the SDRs, 22 were intronless, of significant were Gorai.011G142800 (Tetrapyrrole-binding protein, chloroplastic), Gorai.002G235800 (Protein YLS9), Gorai.011G141100 (Acetolactate synthase 3, chloroplastic), Gorai.011G158300 (Receptor like protein kinase S.2), Gorai.007G347800 (Cytochrome P450 89A2), Gorai.011G142500 (Serine/threonine-protein kinase-like protein ACR4), Gorai.001G121600 (Beta-1,6-galactosyltransferase GALT29A) and Gorai.006G023300 (Putative pentatricopeptide repeat-containing protein At3g28640) (Table 2). In all the genes within the SDRs, 23 genes were classified as genes of unknown domain, accounting for 16% of all the genes mined within the various SDRs across the seven (7) chromosomes, being the remaining genes were from different domains. The dominant domain among the remaining 123 genes was the P450; Cytochrome P450 (PF00067) with six (6) genes which were Gorai.001G088900 (Cytochrome P450 81D1), Gorai.001G089000 (Cytochrome P450 81E8), Gorai.002G241100 (Cytochrome P450 94C1), Gorai.006G081800 (Cytochrome P450 CYP736A12), Gorai.007G347700 (Cytochrome P450 89A2) and Gorai.007G347800 (Cytochrome P450 89A2). Cytochromes P450 (CYPs)are proteins of the superfamily containing heme as a cofactor), therefore, they are hemoproteins [45]. The cytochromes (CYPs) use a variety of small and large molecules as substrates) in enzymatic reactions. They are the terminal oxidase enzymes in electron transfer chains, broadly categorized as P450-containing systems The term "P450" is derived from the spectrophotometric peak at the wavelength of the absorption maximum of the enzyme (450 nm) when it is in the reduced state and combined with carbon (II) oxide. In plants, CYPs are involved in numerous biosynthetic reactions, which leads to plant hormones production, secondary metabolites synthesis, fatty acid conjugation, lignification of various plant tissues, and production of various defensive compounds [46]. Plant cytochrome P450 genes make up 1% of the plant genes as per the annotations of plant genome. The number and diversity of these genes is believed to trigger numerous bioactive compounds [47]. The detection of these genes within the SDR could explain the significance of these regions in the evolution of new functional transcriptome within the plants.
Collinearity analysis between the genetic map and the physical map reference genome, G. raimondii (D5)
We performed Collinearity analysis between the constructed genetic map of G. thurberi and G. trilobum with a reference to G. raimondii physical map. All the SSR markers full sequences were used to do blast search against the physical genomic map of G. raimondii; and the matching sites were extracted from blast result for collinearity analysis. After removal of redundant markers 846 SSR markers located in genetic linkage map produced 869 loci, which translated to 95.9% of the mapped markers showed consistency between two maps (Fig 3); however there were 36 markers that were in non-conformity to the physical map of the reference genome (Table 3). There were six (6) inversions noted on Chr02, Chr03, Chr07, and Chr13, and also four translocations in Chr03, Chr08, and Chr09 (Table 4), the map developed was of high resolution. Comparison of genetic and physical map is important in confirming the order of genetic markers, using the information from sequence-based physical maps and also to support the genetic-marker order [48]. The collinearity analysis conducted between our genetic map and the physical map with the reference genome being G. raimondii indicated good collinearity between the chromosomes in the genetic map and the physical map; it also confirms the accuracy of the genetic map.
[Figure omitted. See PDF.]
Fig 3. Collinearity between genetic and physical map of D5.
The different colors represent the various syntenic block regions in the chromosomes.
https://doi.org/10.1371/journal.pone.0207271.g003
[Figure omitted. See PDF.]
Table 3. Marker inconformity between the genetic and physical map.
https://doi.org/10.1371/journal.pone.0207271.t003
[Figure omitted. See PDF.]
Table 4. Chromosomes showing inversion and translocation between the genetic and physical map of D5.
https://doi.org/10.1371/journal.pone.0207271.t004
Collinearity analysis of genetic map to that of the physical map (Dt) for G. hirsutum (GhDt) and G. barbadense (GbDt)
A blast search was conducted by using 846 SSR markers from the genetic map and was screened on the Dt-sub genome of G. hirsutum and G. barbadense, out of the total markers 745 and 337 markers were aligned to the assembly genome of G. hirsutum (GhDt) and G. barbadense (GbDt) respectively. 16% of marker in GhDt and 15% of markers in GbDt showed consistency between two maps; however 85% of markers in both sub genomes were in non-conformity between genetic and physical map (Fig 4A, Fig 4B, S6 Table and S7 Table). From the results obtained on the two collinearity analysis, showed that there was a closer relationship between the genetic map and physical map of GhDt than GbDt, this is clearly shown by higher number of markers that were in conformity with GhDt rather than GbDt, this lead to formation of better syntenic blocks between the genetic and physical map of GhDt. Good syntenic block formation was however observed between chromosome 3 in both the two sub genomes. This could possibly mean that more genes have been introgressed from the two wild cotton species into G. hirsutum rather than to G. barbadense, from previous studies on genes introgression within G. hirsutum, a higher percentage of the introgression observed (43.7%) was accounted for by wild accessions, as compared to improved accessions (18.4%) whereas within G. barbadense, 33.1% of the introgression was accounted for by wild accessions, and only 27.1% of the introgression was accounted for by improved accessions, thus the wild accessions accounted for more introgression in G. hirsutum than in G. barbadense [49]. The results obtained are in agreement to the earlier reports, in which fibre quality traits such as, high fibre length have been found to be introgressed into G. hirsutum from its wild progenitors of the D genome, G. thurberi [5]. Moreover, G. hirsutum and G. barbadense are said to have a common ancestry however interference by human activities and abiotic factors have made them to evolve different agronomic traits [50]. However, their genomic sequences have made it possible to study the divergence and comparative analysis of the two species.
[Figure omitted. See PDF.]
Fig 4. Collinearity analysis.
(A) Analysis between genetic map and physical map of GhDt while (B). Analysis between genetic map and physical map of GbDt.
https://doi.org/10.1371/journal.pone.0207271.g004
RNA sequence data analysis and RT-qPCR validation of genes in the SDR regions
The RNA sequence data for the genes at the SDR were obtained using their homeolog genes of the upland cotton; G. hirsutum sequenced at different development stages cotton fiber 0, 5, 10, 20 and 25 day post anthesis (DPA) from cotton genome data base and analyzed. Based on the expression profile, the genes were categorized into four groups. Group 1 members were highly up regulated, though some were found to be only highly up regulated in some stages of fiber development. Group 2, exhibited differential expression, with some being up regulated while other were either down regulated or not expressed. Group 3, majority of the genes were down regulated, with only very few genes were either partially up regulated or not expressed. Lastly groups 4 were not expressed in all the various stages of fiber development (Fig 5). Furthermore, 30 highly upregulated genes later validated through RT-qPCR, at similar fiber development stages. Majority of the genes were highly inducted in G. thurberi as compared to G. tribolum an indication that G. thurberi had a higher potential of producing superior fibers compared to G. trilobum (Fig 6A). Similar findings have been previously reported in which G. thurberi has been found to have superior [19]. The expression pattern for the fifty (50) analysed genes through RT-qPCR analysis revealed that these genes could be playing an important role in various stages of cotton fiber development, for instance Gorai.007G347600 (Tubulin-folding cofactor A), Gorai.012G141600 (Dihydroneopterin aldolase 1), Gorai.006G024500 (60S ribosomal export protein NMD3), Gorai.002G229900 (Protein LST8 homolog) and Gorai.002G235200 (Ribosome biogenesis protein NSA2 homolog) were highly upregulated at different stages of fiber development. Studies have been conducted to investigate the role of tubulin in cotton fiber development, the tubulin genes were found to be upregulated at specific stages of fiber development, the transcript α-tubulin genes GhTua2/3 and GhTua4 were found to be highly upregulated from 10 to 20 DPA, while GhTua1 and GhTua5 transcripts were highly upregulated from 0 DPA up to 14 DPA then registered a significant drop at 16 DPA with the onset of secondary wall synthesis [51–53]. For the results obtained in the RNA sequence and RT-qPCR analysis, we observed that the highly up regulated genes were mainly enzymes that performed catalytic activities. Similar results have been observed in maize where gene cluster was observed on five adjacent genes (Bx1–Bx5) that encode enzymes for successive steps in the biosynthesis of the cyclic hydroxamic acid 2,4-dihydroxy-1,4-benzoxazin-3-one [54]. We further analysed the gene structures of the 50 highly up regulated genes as per the RNA sequencing results, all the genes were disrupted by introns except six (6) genes, which were Gorai.010G010100 (Probable LRR receptor-like serine/threonine-protein kinase At2g16250), Gorai.011G141100 (Acetolactate synthase 3, chloroplastic), Gorai.002G231400 (uncharacterized gene), Gorai.011G158300 (Receptor like protein kinase S.2), Gorai.007G350900 (Histone-lysine N-methyltransferase, H3 lysine-9 specific SUVH1) and Gorai.006G024500 (60S ribosomal export protein NMD3) (Fig 6B). Several stress responsive genes have been found to be heavily laden with introns, such as the LEA genes [55], cyclin dependent kinase (CDK) genes [56], G protein coupled receptors (GPCRs) [57] among others.
[Figure omitted. See PDF.]
Fig 5. Heat map for the RNA expression seq. for the mined genes at the SDR regions in relation to fiber development.
The heat map was visualized using Mev.exe program (Showed by log 2 values). (i) Red-up regulated, green-down regulated and black- no expression. DPA: day post anthesis.
https://doi.org/10.1371/journal.pone.0207271.g005
[Figure omitted. See PDF.]
Fig 6. RT-qPCR validation of the selected genes.
(A): Heat map for the 50 highly upregulated genes as per the RNA seq. The heat map was visualized using Mev.exe program (Showed by log 2 values). (i) Red-up regulated, green-down regulated and black- no expression. DPA: day post anthesis (B): Gene structure analysis of the 50 selected genes. Abbreviation: DPA: day post anthesis; Gth: Gossypium thurberi; Gtr: Gossypium trilobum.
https://doi.org/10.1371/journal.pone.0207271.g006
Discussion
Construction of genetic maps has become increasingly important in the understanding of marker- assisted selection (MAS) in plants and its efficiency in gene mapping. The use of SSR markers in the construction of genetic maps have a great significance since these markers are abundant and have high levels of transferability, low levels of intra-locus, relative abundance and good genome coverage [58]. In the development of our genetic map we employed the use of EST-SSR (eSSRs) SWU mono markers, ESR-SSR markers have been used in the construction of several genetic maps in cotton.[59–61] The ESSRs are based on expressed sequences and are conserved across cotton species and other closely related plant species.
Wild cotton has been known to have both advantageous and disadvantageous traits, therefore the development of genetic maps of interspecific and intraspecific crosses aids in the introgression of advantageous alleles into the already cultivated cultivars. We developed a genetic map, from an interspecific cross between two wild cotton species in diploid cotton in the D genome. So far, fewer genetic maps developed from the diploid D genome have been reported. The constructed genetic map had a total length of 1,012.458 cM with an average distance between the loci of 1.193 cM, a total of 849 markers loci were used in the map development. The map developed had a much higher genome coverage compared to earlier developed maps from the diploid genomes, for instance, a map between G. arboreum and G. herbaceum had a total length of 1,109 cM with an average marker distance of 7.92 cM between loci [59]. Although the map size was relatively smaller compared to dense genetic maps developed, the average distance between two marker loci (1.193 cM) was smaller compared to previously developed maps, this indicates that the map is suitable for analysis of quantitative trait loci (QTLs) and gene mining [62]. The genetic map between the two sister species G. thurberi and G. trilobum is the second to have been developed from wild species in the D genome, the first map was developed from an interspecific cross between G. davidsonii and G. klotzschianum. However several maps have been developed using diploid cotton from A genomes.
Segregation distortion (SD) is the deviation observed on genotypic frequencies from expected Mendelian ratios [63]. This phenomenon has been reported in other plants such as Maize [64], Barley [65] and Potatoes [66]. From our genetic map we noted segregation distortion loci (SDLs) in all the chromosomes; however they were unevenly distributed within the 13 chromosomes. Chr07 and Chr11 had the highest number of SDLs with 23 distorted loci each. Similar results were recorded on the genetic maps in Gossypium spp [61] Chr02, Chr07 and Chr11 [67], Chr07 (>50% of the loci were distorted). Some SDL were clustered in specific regions on the chromosome these regions were designated as the segregation distortion regions (SDRs). The largest SDRs were observed on Chr02, Chr06, Chr07-2 and Chr1. The largest SDR’s were skewed towards the heterozygous, similar results were also recorded from previous studies by Liang et al [38]. The skewness towards heterozygosity could be due to the genetic loci expressing themselves at different times leading to gametophytic and zygotic selection. Results from earlier constructed genetic map in cotton showed that larger SDRs were located on Chr02 [39], Chr02 had 3 SDRs (>50% of loci were distorted); [68], Chr02, Chr16 and Chr18 [69]. Most interestingly we also noted that in most of these genetic maps Chr02 had fewer number of marker loci. SDRs exhibiting similar patterns of distortion at the same chromosomal regions in several species-related populations could lead to the identification of common genetic factors causing these phenomena, [64]. From these results we concluded that Chr02 could be carrying vital genes that could be segregating around these SDRs and therefore making the flanking markers to segregate. Hence there is need to mine genes within these regions. The genes would help to ravel the issues of SDRs through the identification of important traits and genome wide association studies, for example Bovill et al [70] identified gene for crown rot resistance in wheat around the SDR, similar Sr36 gene locus was detected in the SDR on chromosome 2B [71].
From the physiochemical properties of the genes mined we noted that the gravy values range were both positive and negative values, indicating that the proteins encoding the mined genes were both hydrophilic and hydrophobic in nature [72]. We noted that the identified genes were more hydrophilic in nature rather than hydrophobic, many genes contained proteins and enzymes related to metabolism and disease/defense; these proteins are mainly activated when in solution forms hence the occurrence of more hydrophilic genes than hydrophobic genes. We analyzed the genes located on the SDR to determine if they had role in segregation distortion of the flanking markers. We noted that within the SDR there were common gene domains; Cytochromes P450 (CYPs) appeared in almost all the SDRs with six members, the Myb-like DNA binding domain with four members and the Multi-protein bridging factor 1 domain with three members, these genes could probably have caused segregation of flanking markers. It could also be due to same gametophyte factors or unknown genes in a population segregating, thus exhibiting segregation distortion in the same chromosomal regions [73]. We further observed that the chromosomes with largest distortions had the highest number of genes; interestingly we noted that Chr02 had the least markers (21), with the shortest map size of 28.665 cM but had highest number of genes which were 35 genes. This implied that the genes located in these regions could have been segregating due to zygotic or gametophytic factors or other underlying factors hence there is need to do more research on these genes locate on the SDRs in chromosome 2. The expression analysis of the genes at the SDR region showed that some of the genes were involved in fibre development. The expression of genes in the SDR regions revealed that most of the genes that were up regulated were involved in enzyme catalytic activities; examples of these genes include Serine decarboxylase, Tetraketide alpha-pyrone reductase 2, Digalactosyldiacylglycerol synthase-1-chloroplastic, and Mediator of RNA polymerase II transcription subunit 27 among others.
The Cytochromes P450 (CYPs) were the dominant domain among the genes mined within the segregation regions (SDRs). This superfamily is among the largest group of enzymes in plants, they play vital role in a range of metabolic pathways. [74]. The name cytochrome is gotten from the spectral absorbance maximum, produced when carbon (II) oxide binds to the enzyme in its reduced state produced at 450 nm [75]. They have been found to function in all the eukaryotes. The two species G. thurberi and G. trilobum are also known to be resistance to soil-borne fungal pathogens, Fusarium wilt and Verticillium wilt respectively, enzymes play major function in various fungal metabolisms such as biochemical reactions, adaptation to hostile environment and detoxification of chemicals [76], and this could explain their higher number. In addition, G. thurberi also possess other beneficial agronomic traits, such as resistant to cotton bollworm and silver leaf whitefly, thus the reflection of the higher number of CYP, they play important role in both insects and plants, and they participate in a range of spectrum of plant toxins metabolized by insects and the defense compounds manufactured by plants. [77].
SDR has become a common feature in plants, and it is believed that the SDRs have significant effect on mapping and breeding applications. High level of distortions has been found in a number of plants, for instance in Medicago sativa L. 24% and 34% of markers have been found to be distorted in the mapping of the F1 generation, while in the F2 generation, very high level of distortion of 68% per linkage [78,79], similarly SD has also been observed in rice chromosome 9 in doubled haploid, recombinant inbred [64]. Segregation distortion is believed to be caused by a group of genetic elements near the centromere of chromosomes, and now been seen as a potentially powerful evolutionary force, has been observed in monocotyledons plants such as maize [80]. The detection of these genes provides further evidence of the significance role of the SDRs in plants.
Conclusions
The use of genetic maps between wild cotton species has significance in identification of vital alleles with profound agronomic benefits that could be introgressed into elite cotton cultivar. Cotton farming is facing challenge emanating from environmental stresses such as cold, drought and salinity. The application of the two species would help molecular breeders in introgression of the identified vital genes into already cultivated cotton that were mined within the SSR regions. The two wild species in the D genome; G. thurberi as female parent and G. trilobum as the male parent were used in the construction of a fine genetic map, this map will provide a basic tool for researchers to conduct evaluation of QTLs and identification of novel genes along the SSR regions. The genetic map had a length of 1,012.458 cM with an average length between the loci of 1.193 cM. A total of 849 loci were successfully mapped in all the 13 chromosomes. Chromosome regions with obvious segregation distortion were identified in this map, approximately 16% of mapped markers showed distorted segregation in the F2progenies. 2,495 genes were mined within the SSR region and characterized on their physiochemical properties. We further analyzed the genes within the SDR region with an aim of identifying genes that could be segregating within the SDR, we noted that the common gene domain (Cytochromes P450 (CYPs) appeared in almost all the SDRs and it contained six members. Further analysis of this gene domain will enable understanding of the role they play in SDRs by future molecular breeders. The constructed linkage map will allow future breeders to identify the markers that linked to the trait of interest and use them in marker-assisted breeding program and genome wide studies.
Ethical approval and consent to participate
No ethical nor consent to participate in this research was sought.
Supporting information
[Figure omitted. See PDF.]
S1 Table. SWU marker sequences details.
https://doi.org/10.1371/journal.pone.0207271.s001
(XLSX)
S2 Table. Details for primers used for RT-qPCR analysis.
https://doi.org/10.1371/journal.pone.0207271.s002
(DOCX)
S3 Table. SWUmarkers and their allele scores.
The male plant G. trilobum, the female plant G. thurberi and the heterozygous F2 progenies were scored as A, B, and H respectively. The missing data was designated as‘-’.Multi-allelic markers were named separately by primer name followed by the letters a, b, c, and d as a suffix.
https://doi.org/10.1371/journal.pone.0207271.s003
(XLSX)
S4 Table. Physiochemical properties of the proteins encoding the mined genes as obtained from the genetic map developed between the two wild cotton species of the D genome.
https://doi.org/10.1371/journal.pone.0207271.s004
(XLSX)
S5 Table. Gene Ontology analysis of the genes obtained within the SDR regions.
https://doi.org/10.1371/journal.pone.0207271.s005
(DOCX)
S6 Table. Markers inconformity between genetic map and the physical map G. hirsutum (GhDt).
https://doi.org/10.1371/journal.pone.0207271.s006
(DOCX)
S7 Table. Markers inconformity between genetic map and the physical map of G. barbadense (GbDt).
https://doi.org/10.1371/journal.pone.0207271.s007
(DOCX)
Acknowledgments
We are indebted to the entire research team, students and teachers, without your valuable input, this outcome would have not been possible.
Citation: Li P, Kirungu JN, Lu H, Magwanga RO, Lu P, Cai X, et al. (2018) SSR-Linkage map of interspecific populations derived from Gossypium trilobum and Gossypium thurberi and determination of genes harbored within the segregating distortion regions. PLoS ONE 13(11): e0207271. https://doi.org/10.1371/journal.pone.0207271
1. Magwanga RO, Lu P, Kirungu JN, Lu H, Wang X, Cai X, et al. Characterization of the late embryogenesis abundant (LEA) proteins family and their role in drought stress tolerance in upland cotton. BMC Genet. 2018;19.
2. Guo W. Gene cloning and molecular breeding to improve fiber qualities in cotton. Chinese Sci Bull. 2003;48: 709.
3. Abdurakhmonov IY, Kohel RJ, Yu JZ, Pepper AE, Abdullaev AA, Kushanov FN, et al. Molecular diversity and association mapping of fiber quality traits in exotic G. hirsutum L. germplasm. Genomics. 2008;92: 478–487. pmid:18801424
4. Wegier A, Piñeyro-Nelson A, Alarcón J, Gálvez-Mariscal A, Álvarez-Buylla ER, Piñero D. Recent long-distance transgene flow into wild populations conforms to historical patterns of gene flow in cotton (Gossypium hirsutum) at its centre of origin. Mol Ecol. 2011;20: 4182–4194. pmid:21899621
5. Basbag S, Gencer O. Investigation of some yield and fibre quality characteristics of interspecific hybrid (Gossypium hirsutum L. ?? G. barbadense L.) cotton varieties. Hereditas. 2007;144: 33–42. pmid:17567438
6. Singh C, Kumar V, Prasad I, Patil VR, Rajkumar BK. Response of upland cotton (G.Hirsutum L.) genotypes to drought stress using drought tolerance indices. J Crop Sci Biotechnol. 2016;19: 53–59.
7. Brubaker CL, Brown AHD, Stewart JMD, Kilby MJ, Grace JP. Production of fertile hybrid germplasm with diploid Australian Gossypium species for cotton improvement. Euphytica. 1999;108: 199–213.
8. Wendel JF, Grover CE. Taxonomy and Evolution of the Cotton Genus, Gossypium. 2015; 1–20.
9. Kunbo W F WJ, Jinping HUA. Designations for individual genomes and chromosomes in Gossypium. Journal of Cotton Research; 2018; 3–7.
10. Kirungu JN, Deng Y, Cai X, Magwanga RO, Zhou Z, Wang X, et al. Simple sequence repeat (SSR) genetic linkage map of D genome diploid cotton derived from an interspecific cross between Gossypium davidsonii and Gossypium klotzschianum. Int J Mol Sci. 2018;19. pmid:29324636
11. Magwanga R, Lu P, Kirungu J, Diouf L, Dong Q, Hu Y, et al. GBS Mapping and Analysis of Genes Conserved between Gossypium tomentosum and Gossypium hirsutum Cotton Cultivars that Respond to Drought Stress at the Seedling Stage of the BC2F2 Generation. Int J Mol Sci. Multidisciplinary Digital Publishing Institute; 2018;19: 1614. pmid:29848989
12. Qin H, Guo W, Zhang YM, Zhang T. QTL mapping of yield and fiber traits based on a four-way cross population in Gossypium hirsutum L. Theor Appl Genet. 2008;117: 883–894. pmid:18604518
13. Sun FD, Zhang JH, Wang SF, Gong WK, Shi YZ, Liu AY, et al. QTL mapping for fiber quality traits across multiple generations and environments in upland cotton. Mol Breed. 2012;30: 569–582.
14. Wang H, Huang C, Guo H, Li X, Zhao W, Dai B. QTL Mapping for Fiber and Yield Traits in Upland Cotton under Multiple Environments. 2015; 1–14. pmid:26110526
15. Tan Z, Zhang Z, Sun X, Li Q, Sun Y, Yang P, et al. Genetic Map Construction and Fiber Quality QTL Mapping Using the CottonSNP80K Array in Upland Cotton. Front Plant Sci. 2018;9. pmid:29535744
16. Bowman DT, May OL, Calhoun DS. Genetic base of upland cotton cultivars released between 1970 and 1990. Crop Sci. 1996;36: 577–581.
17. Linos AA, Bebeli PJ, Kaltsikes PJ. Cultivar identification in upland cotton using RAPD markers. Aust J Agric Res. 2002;53: 637–642.
18. Magwanga RO, Lu P, Kirungu JN, Diouf L, Dong Q, Hu Y, et al. GBS mapping and analysis of genes conserved between gossypium tomentosum and gossypium hirsutum cotton cultivars that respond to drought stress at the seedling stage of the BC2F2generation. Int J Mol Sci. 2018;19. pmid:29848989
19. Zhao F, Fang W, Xie D, Zhao Y, Tang Z, Li W, et al. Proteomic identification of differentially expressed proteins in Gossypium thurberi inoculated with cotton Verticillium dahliae. Plant Sci. 2012;185–186: 176–184. pmid:22325879
20. Walker GP, Natwick ET. Resistance to silverleaf whitefly, Bemisia argentifolii (Hem., Aleyrodidae), in Gossypium thurberi, a wild cotton species. J Appl Entomol. 2006;130: 429–436.
21. Miyazaki J, Stiller WN, Wilson LJ. Identification of host plant resistance to silverleaf whitefly in cotton: Implications for breeding. F Crop Res. 2013;154: 145–152.
22. Yu JZ, Kohel RJ, Fang DD, Cho J, Van Deynze a., Ulloa M, et al. A High-Density Simple Sequence Repeat and Single Nucleotide Polymorphism Genetic Map of the Tetraploid Cotton Genome. G3: Genes|Genomes|Genetics. 2012;2: 43–58. pmid:22384381
23. Guo Y, Saha S, Yu JZ, Jenkins JN, Kohel RJ, Scheffler BE, et al. BAC-derived SSR markers chromosome locations in cotton. Euphytica. 2008;161: 361–370.
24. Qureshi SN, Saha S, Kantety R V, Jenkins JN. EST-SSR: A New Class of Genetic Markers in Cotton. J Cotton Sci. 2004;8: 112–123.
25. Zhang J, Stewart J, Mac. Economical and rapid method for extracting cotton genomic DNA. J Cott Sci. 2000;4: 193–201.
26. Boopathi NM, Thiyagu K, Urbi B, Santhoshkumar M, Gopikrishnan A, Aravind S, et al. Marker-assisted breeding as next-generation strategy for genetic improvement of productivity and quality: Can it be realized in cotton? Int J Plant Genomics. 2011;2011. pmid:21577317
27. Creste S, Tulmann Neto A, Figueira A. Detection of Single Sequence Repeat Polymorphisms in Denaturing Polyacrylamide Sequencing Gels by Silver Staining. Plant Mol Biol Report. 2001;19: 299–306.
28. Stam P. Construction of integrated genetic linkage maps by means of a new computer package: Join Map. Plant J. 1993;3: 739–744.
29. Voorrips RE. MapChart: software for the graphical presentation of linkage maps and QTLs. J Hered. 2002;93: 77–78. pmid:12011185
30. Li W, Lin Z, Zhang X. A Novel Segregation Distortion in Intraspecific Population of Asian Cotton (Gossypium arboretum L.) Detected by Molecular Markers. J Genet Genomics. 2007;34: 634–640. pmid:17643949
31. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21: 3674–3676. pmid:16081474
32. Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol. 2015;33: 531–537. pmid:25893781
33. Liu X, Zhao B, Zheng HJ, Hu Y, Lu G, Yang CQ, et al. Gossypium barbadense genome sequence provides insight into the evolution of extra-long staple fiber and specialized metabolites. Sci Rep. 2015;5. pmid:26420475
34. Chabane K, Varshney RK, Graner A, Valkoun J. Generation and exploitation of EST-derived SSR markers for assaying molecular diversity in durum wheat populations. Genet Resour Crop Evol. 2008;55: 869–881.
35. Liang X, Chen X, Hong Y, Liu H, Zhou G, Li S, et al. Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species. BMC Plant Biol. 2009;9. pmid:19309524
36. Kantety R V., La Rota M, Matthews DE, Sorrells ME. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol Biol. 2002;48: 501–510. pmid:11999831
37. Frelichowski JE, Palmer MB, Main D, Tomkins JP, Cantrell RG, Stelly DM, et al. Cotton genome mapping with new microsatellites from Acala “Maxxa” BAC-ends. Mol Genet Genomics. 2006;275: 479–491. pmid:16501995
38. Liang Z, Yuanda L, Caiping C, Xiangchao T, Xiangdong C, Wei Z, et al. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information. BMC Genomics. 2012;13: 539. pmid:23046547
39. Yu Y, Yuan D, Liang S, Li X, Wang X, Lin Z, et al. Genome structure of cotton revealed by a genome-wide SSR genetic map constructed from a BC1 population between gossypium hirsutum and G. barbadense. BMC Genomics. 2011;12: 15. pmid:21214949
40. Hu J, Wang L, Li J. Comparison of genomic SSR and EST-SSR markers for estimating genetic diversity in cucumber. Biol Plant. 2011;55: 577–580.
41. Langebartels C, Wohlgemuth H, Kschieschan S, Grün S, Sandermann H. Oxidative burst and cell death in ozone-exposed plants. Plant Physiology and Biochemistry. 2002. pp. 567–575.
42. Vanderauwera S, Hoeberichts FA, Breusegem F Van. Hydrogen Peroxide-Responsive Genes in Stress Acclimation and Cell Death. Reactive Oxygen Species in Plant Signaling: Springer. 2009. pp. 149–164. https://doi.org/10.1007/978-3-642-00390-5_9
43. Liu K, Sun J, Yao L, Yuan Y. Transcriptome analysis reveals critical genes and key pathways for early cotton fiber elongation in Ligon lintless-1 mutant. Genomics. 2012;100: 42–50. pmid:22576057
44. Saha B, Borovskii G, Panda SK. Alternative oxidase and plant stress tolerance. Plant signaling & behavior. 2016. p. e1256530. pmid:27830987
45. Gonzalez FJ, Gelboin H V. Human cytochromes p450: Evolution and cDNA-directed expression. Environmental Health Perspectives. 1992. pp. 81–85. pmid:1486867
46. Schuler MA, Werck-Reichhart D. Functional genomics of P450s. Annu Rev Plant Biol. 2003;54: 629–667. pmid:14503006
47. Mizutani M, Sato F. Unusual P450 reactions in plant secondary metabolism. Arch Biochem Biophys. 2011;507: 194–203. pmid:20920462
48. Córdoba JM, Chavarro C, Schlueter JA, Jackson SA, Blair MW. Integration of physical and genetic maps of common bean through BAC-derived microsatellite markers. BMC Genomics. 2010;11. pmid:20637113
49. Hinze LL, Gazave E, Gore MA, Fang DD, Scheffler BE, Yu JZ, et al. Genetic diversity of the two commercial tetraploid cotton species in the gossypium diversity reference set. J Hered. 2016;107: 274–286. pmid:26774060
50. Chee P, Draye X, Jiang CX, Decanini L, Delmonte TA, Bredhauer R, et al. Molecular dissection of interspecific variation between Gossypium hirsutum and Gossypium barbadense (cotton) by a backcross-self approach: I. Fiber elongation. Theor Appl Genet. 2005;111: 757–763. pmid:15983756
51. Dixon DC, Seagull RW, Triplett BA. Changes in the Accumulation of [alpha]- and [beta]-Tubulin Isotypes during Cotton Fiber Development. Plant Physiol. 1994;105: 1347–1353. pmid:12232289
52. Whittaker DJ, Triplett B a. Gene-specific changes in alpha-tubulin transcript accumulation in developing cotton fibers. Plant Physiol. 1999;121: 181–8. pmid:10482673
53. Li Y, Sun J, Li C, Zhu Y, Xia G. Specific expression of a beta-tubulin gene (GhTub1) in developing cotton fibers. Sci China C Life Sci. 2003;46: 235–242. ET—2008/09/03 LA—eng pmid:18763138
54. Frey M, Chomet P, Glawischnig E, Stettner C, Grün S, Winklmair A, et al. Analysis of a chemical plant defense mechanism in grasses. Science. 1997;277: 696–9. pmid:9235894
55. Magwanga RO, Lu P, Kirungu JN, Dong Q, Hu Y, Zhou Z, et al. Cotton Late Embryogenesis Abundant (LEA2) Genes Promote Root Growth and Confers Drought Stress Tolerance in Transgenic Arabidopsis thaliana. G3 (Bethesda). 2018; g3.200423.2018. pmid:29934376
56. Magwanga R, Lu P, Kirungu J, Cai X, Zhou Z, Wang X, et al. Whole Genome Analysis of Cyclin Dependent Kinase (CDK) Gene Family in Cotton and Functional Evaluation of the Role of CDKF4 Gene in Drought and Salt Stress Tolerance in Plants. Int J Mol Sci. Multidisciplinary Digital Publishing Institute; 2018;19: 2625. pmid:30189594
57. Lu P, Magwanga RO, Lu H, Kirungu JN, Wei Y, Dong Q, et al. A novel G-protein-coupled receptors gene from upland cotton enhances salt stress tolerance in transgenic Arabidopsis. Genes (Basel). 2018;9. pmid:29649144
58. Xu J, Liu L, Xu Y, Chen C, Rong T, Ali F, et al. Development and characterization of simple sequence repeat markers providing genome-wide coverage and high resolution in maize. DNA Res. 2013;20: 497–509. pmid:23804557
59. Badigannavar A, Myers GO. Construction of genetic linkage map and QTL analysis for fiber traits in diploid cotton (Gossypium arboreum x Gossypium herbaceum). J Cotton Sci. 2015;19: 15–26.
60. Han Z, Wang C, Song X, Guo W, Gou J, Li C, et al. Characteristics, development and mapping of Gossypium hirsutum derived EST-SSRs in allotetraploid cotton. Theor Appl Genet. 2006;112: 430–439. pmid:16341684
61. Ma XX, Zhou BL, Lü YH, Guo WZ, Zhang TZ. Simple sequence repeat genetic linkage maps of A-genome diploid cotton (Gossypium arboreum). J Integr Plant Biol. 2008;50: 491–502. pmid:18713384
62. Collard BCY, Jahufer MZZ, Brouwer JB, Pang ECK. An introduction to markers, quantitative trait loci (QTL) mapping and marker-assisted selection for crop improvement: The basic concepts. Euphytica. 2005. pp. 169–196.
63. Zhan H, Xu S. Generalized linear mixed model for segregation distortion analysis. BMC Genet. 2011;12.
64. Xu Y, Zhu L, Xiao J, Huang N, McCouch SR. Chromosomal regions associated with segregation distortion of molecular markers in F2, backcross, doubled haploid, and recombinant inbred populations in rice (Oryza sativa L.). Mol Gen Genet. 1997;253: 535–545. pmid:9065686
65. Liu X, You J, Guo L, Liu X, He Y, Yuan J, et al. Genetic Analysis of Segregation Distortion of SSR Markers in F2 Population of Barley. J Agric Sci. 2011;3: 172–177.
66. Manrique-Carpintero NC, Coombs JJ, Veilleux RE, Buell CR, Douches DS. Comparative Analysis of Regions with Distorted Segregation in Three Diploid Populations of Potato. G3: Genes|Genomes|Genetics. 2016;6: 2617–2628. pmid:27342736
67. Shi Y, Li W, Li A, Ge R, Zhang B, Li J, et al. Constructing a high-density linkage map for Gossypium hirsutum×Gossypium barbadense and identifying QTLs for lint percentage. J Integr Plant Biol. 2015;57: 450–467. pmid:25263268
68. Li X, Jin X, Wang H, Zhang X, Lin Z. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map. DNA Res. 2016;23: 283–293. pmid:27084896
69. Khan MK, Chen H, Zhou Z, Ilyas MK, Wang X, Cai X, et al. Genome Wide SSR High Density Genetic Map Construction from an Interspecific Cross of Gossypium hirsutum x Gossypium tomentosum. Front Plant Sci. 2016;7: 436. pmid:27148280
70. Bovill WD, Ma W, Ritter K, Collard BCY, Davis M, Wildermuth GB, et al. Identification of novel QTL for resistance to crown rot in the doubled haploid wheat population “W21MMT70” x “Mendos.” Plant Breed. 2006;125: 538–543.
71. Tsilo TJ, Jin Y, Anderson JA. Diagnostic microsatellite markers for the detection of stem rust resistance gene Sr36 in diverse genetic backgrounds of wheat. Crop Sci. 2008;48: 253–261.
72. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157: 105–132. pmid:7108955
73. Flagfeldt DB, Siewers V, Huang L, Nielsen J. Characterization of chromosomal integration sites for heterologous gene expression in Saccharomyces cerevisiae. Yeast. 2009;26: 545–551. pmid:19681174
74. Riviere JL, Cabanne F. Animal and plant cytochrome P-450 systems. Biochimie. 1987;69: 743–752. pmid:3120808
75. Capdevila JH, Falck JR, Dishman E, Karara A. Cytochrome P-450 arachidonate oxygenase. Methods Enzymol. 1990;187: 385–394. pmid:2233355
76. Van Hans M. DB, Van Robert F. M. GVan Cees A. M. J. J. DH, Punt PJ. Cytochrome P450 enzyme systems in fungi. Fungal Genet Biol. 1998;23: 1–17. pmid:9501474
77. Ninsin KD, Tanaka T. Synergism and stability of acetamiprid resistance in a laboratory colony of Plutella xylostella. Pest Manag Sci. 2005;61: 723–727. pmid:15776405
78. Osborn TC, Brouwer DJ, Kidwell KK, Tavoletti S, Bingham ET. Molecular Marker Applications to Genetics and Breeding of Alfalfa. Mol Cell Technol Forage Improv. 1998;cssaspecia: 25–31.
79. Li X, Wang X, Wei Y, Brummer EC. Prevalence of segregation distortion in diploid alfalfa and its implications for genetics and breeding applications. Theor Appl Genet. 2011;123: 667–679. pmid:21625992
80. Xu X, Li L, Dong X, Jin W, Melchinger AE, Chen S. Gametophytic and zygotic selection leads to segregation distortion through in vivo induction of a maternal haploid in maize. J Exp Bot. 2013;64: 1083–1096. pmid:23349137
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2018 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Wild cotton species have significant agronomic traits that can be introgressed into elite cultivated varieties. The use of a genetic map is important in exploring, identification and mining genes which carry significant traits. In this study, 188 F2mapping individuals were developed from Gossypium thurberi (female) and Gossypium trilobum (male), and were genotyped by using simple sequence repeat (SSR) markers. A total of 12,560 simple sequence repeat (SSR) markers, developed by Southwest University, thus coded SWU were screened out of which only 994 were found to be polymorphic, and 849 markers were linked in all the 13 chromosomes. The map had a length of 1,012.458 cM with an average marker distance of 1.193 cM. Segregation distortion regions (SDRs) were observed on Chr01, Chr02, Chr06, Chr07 Chr09, Chr10 and Chr11 with a large proportion of the SDR regions segregating towards the heterozygous allele. There was good syntenic block formation that revealed good collinearity between the genetic and physical map of G. raimondii, compared to the Dt_sub genome of the G. hirsutum and G. barbadense. A total of 2,496 genes were mined within the SSR related regions. The proteins encoding the mined genes within the SDR had varied physiochemical properties; their molecular weights ranged from 6.586 to 252.737 kDa, charge range of -39.5 to 52, grand hydropathy value (GRAVY) of -1.177 to 0.936 and isoelectric (pI) value of 4.087 to 12.206. The low GRAVY values detected showed that the proteins encoding these genes were hydrophilic in nature, a property common among the stress responsive genes. The RNA sequence analysis revealed more of the genes were highly upregulated in various stages of fiber development for instance; Gorai.002G241300 was highly up regulated at 5, 10, 20 and 25 day post anthesis (DPA). Validation through RT-qPCR further revealed that these genes mined within the SDR regions might be playing a significant role under fiber development stages, therefore we infer that Gorai.007G347600 (TFCA), Gorai.012G141600 (FOLB1), Gorai.006G024500 (NMD3), Gorai.002G229900 (LST8) and Gorai.002G235200 (NSA2) are significantly important in fiber development and in turn the quality, and further researches needed to be done to elucidate their exact roles in the fiber development process. The construction of the genetic map between the two wild species paves away for the mapping of quantitative trait loci (QTLs) since the average distance between the markers is small, and mining of genes on the SSR regions will provide an insight in identifying key genes that can be introgressed into the cultivated cotton cultivars.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer