Abstract
Grain size and weight are two of the most important determinants of crop yield. Key genes associated with grain size and weight have been identified in major crops. However, studies on the genetic basis of grain size and weight in wild Sorghum are limited. In this study, we analysed the variation of grain size related genes using variant analysis of 15 accessions across one cultivated and six tertiary gene pool species representing the five subgenera of Sorghum. A wide variation in grain size related parameters was observed. The highest grain weight, width, and thickness were observed for the accession S. bicolor (L.) Moench 314,746, while the highest grain length was observed for the accession S. macrospermum E.D. Garber 302,367. The wild sorghum species exhibited high morphological diversity. The six candidate genes related to grain size, Sobic.001G335800 (qGW7/GL7), Sobic.001G341700 (GS3), Sobic.002G257900 (GW8), Sobic.003G035400 (GW5/qSW5), Sobic.004G107300 (GW2), and Sobic.009G053600 (GS5), showed polymorphism in the coding sequence regions, including variants generating premature stop codons. These variants might contribute to the observed variation in grain size and weight. The tertiary wild sorghum species may be a useful source of genes for understanding and engineering grain size in sorghum and other cereals.
Introduction
Domestication of plants is a process based on many natural and non-natural factors. The ‘domestication syndrome’ is a concept which explains the key traits selected in major crops by humans. For instance, reduced grain shattering (i.e. retention of seeds on the parent plant), synchronized flowering and grain maturation, increased grain size and number, compact plant architecture, reduction of grain dormancy, and increased apical dominance are all part of the domestication syndrome (Harlan 1992). In sorghum, domestication was initiated based on allelic changes in two loci in response to the selection pressures imposed by harvesting techniques: in the change from a shattering and open panicled phenotype to non-shattering and compact panicled phenotype (Mann et al. 1983; House 1985). This was likely followed by selection for phenotypes with traits such as increased grain size and total number of branches within the inflorescence, and a reduction in rachis internode length. As a result, cultivated crop lines carry a higher yield compared to their wild relatives.
Grain size and weight are two of the important traits selected during domestication determined by the rate of cell division, the size of the cells and the duration of the grain filling period which are under both genetic and environmental control (Nicolas et al. 1984, 1985). Those are also key determinants of yield (Lee et al. 2002; Tao et al. 2017) as well as important quality attributes (Lee et al. 2002). Grain size and weight are complex quantitative traits controlled by multiple genes. Many important QTLs associated with grain size have been identified in Arabidopsis, rice, and maize (Li et al. 2011; Song et al. 2007; Wang et al. 2015a). For instance, Grain Size 3 (GS3) (Takano-Kai et al. 2009), Grain Size 5 (GS5) (Li et al. 2011), Grain Width 8 (GW8) (Wang et al. 2012), Grain Width and Weight 5 (GW5) (Liu et al. 2017), Grain Width 2 (GW2) (Song et al. 2007) and Grain Length 7 (GL7) (Wang et al. 2015b) regulate grain size by controlling cell division in rice.
In domestication, larger grains were selected over smaller grains because larger grains were easier to sow, harvest and process (Tao et al. 2017), offer increased yield as well as facilitate rapid seedling growth (Manga and Yadav 1995). This selection has led to reduction in genetic diversity in the cultivated accessions of cereal crops (Doebley et al. 2006). Studies have identified selection signatures in grain size related genes in cereals such as GS3 (Botella 2012), and GS5 (Li et al. 2011) in rice. However, limited genomic studies have been undertaken on grain size regulating genes in sorghum (Tao et al. 2017).
Genes related to grain size in rice have been well documented (Song et al. 2007; Wang et al. 2008, 2015b). The GS3 gene encodes a protein which controls grain length in rice (Fan et al. 2006). Studies by Takano-Kai et al. (2009) using genomic approaches have shown that a mutation in the GS3 gene was associated with enhanced grain length in O. sativa through its control of grain elongation. GW2 is a gene in rice which controls grain size by encoding a RING-type protein with E3 ubiquitin ligase activity which acts in the ubiquitin–proteasome pathway. Larger grains are a result of loss of function of the GW2 gene (Song et al. 2007). The qGW7/GL7 gene encodes a protein homologous to longifolia 1 in Arabidopsis and regulates longitudinal cell elongation. Mutations of GL7 resulted in an increase in grain length in rice (Wang et al. 2015b). GW8, also known as OsSPL16, is associated with grain size by encoding a squamosa promoter-binding protein–like 16. Loss of function of this gene is related to more slender grain varieties such as Basmati (Wang et al. 2008). GW5 is a gene in rice encoding a nuclear protein which controls the grain width and weight of rice and which also acts in the ubiquitin–proteasome pathway. A deletion in the GW5 gene is associated with increased grain width in rice (Weng et al. 2008). GS5 in rice encodes a putative serine carboxypeptidase which regulates grain weight, filling and width and consequently increases grain size (Li et al. 2011). Interestingly, in rice, many of these genes (GS3, GW2 and GW5) negatively regulate grain size, as the wild accessions had smaller grains while mutations in these genes in cultivated species resulted in larger grains (Li et al. 2011; Zou et al. 2020).
The genomic resources of crop wild relatives are well documented in rice, wheat, sugarcane, and maize (Stalker 1980; Plucknett and Smith 2014; Brozynska et al. 2016) but less so in sorghum (Cowan et al. 2022; Ananda et al. 2020; Mace et al. 2013). The indigenous Australian sorghums are ecologically widely adaptable (Cowan et al. 2020; Myrans et al. 2021; Myrans et al. 2020). This high diversity is a result of having separate origins from domesticated sorghums, outcrossing of cultivars with highly variable wild races and cross pollination between races (Doggett 1988). The diversity among the wild species of sorghum is higher than the diversity among the cultivated species suggesting that the diversity has been reduced during the domestication process. However, gene flow is suggested to be asymmetric (Mutegi et al. 2012) since the rate of gene flow from crop-to-wild is higher than vice versa although the rare phenomenon of bidirectional gene flow can be observed in sorghum which is not common among other major crops (Mace et al. 2013).
In Australia, sorghum is mainly cultivated for animal feed (Venkateswaran et al. 2019). There are 17 species of sorghum native to Australia across four subgenera: Chaetosorghum, Heterosorghum, Parasorghum, and Stiposorghum (Lazarides et al. 1991; Ananda et al. 2020). Most are found in the semiarid tropical regions of northern Australia with only one species (S. leiocladum (Hack.) C.E. Hubb.) extending to cool temperate regions (Myrans et al. 2020). The monotypic subgenus Chaetosorghum contains the endemic species S. macrospermum E.D. Garber, which is an annual that has 40 chromosomes (2n = 40). It is isolated in distribution to the Northern Territory of Australia, and has a small, sessile spikelet with an ovoid to ellipsoid caryopsis and a reduced pedicellate spikelet. Sorghum laxiflorum F.M. Bailey belongs to the subgenus Heterosorghum and is widely distributed throughout Australia, the Philippines, and Papua New Guinea. It is an annual, with 40 chromosomes (2n = 40), with a large, sessile spikelet, obovoid to ellipsoid caryopsis and reduced pedicellate spikelet. The subgenus Parasorghum contains the species S. grande Lazarides, S. leiocladum, S. matarankense E.D. Garber & Snyder, S. nitidum Pers., S. purpureosericeum (Hochst. ex A. Rich.) Schweinf & Asch., S. versicolor Andersson, S. timorense Buse ex de Vriese and S. trichocladum Kuntze that are distributed across Australia, Africa, Asia, and Mexico. They are mainly perennials with varying chromosome numbers (2n = 10, 20, 30 and 40), with a minute sessile, and a developed pedicellate spikelet, with the five species S. grande, S. leiocladum, S. matarankense, S. nitidum and S. timorense native or endemic to Australia. The Stiposorghum subgenus contains ten species, S. amplum Lazarides, S. angustum S.T. Blake, S. brachypodum Lazarides, S. bulbosum Lazarides, S. ecarinatum Lazarides, S. exstans Lazarides, S. interjectum Lazarides, S. intrans F. Muell. ex Benth., S. plumosum P. Beauv and S. stipoideum (Ewart & Jean White) C.A. Gardner & C.E. Hubb., and all are endemic to Australia. These species are mainly perennials with varying chromosome numbers (2n = 10, 20, 30 and 40) with a small sessile and a well-developed pedicellate spikelet (Lazarides et al. 1991).
Variations in the grain morphology between the domesticated and wild sorghum species were studied by Shapter et al. (2008). The typical cultivated sorghum grains are spherical in shape (Tao et al. 2017). The size of the grain is determined by the cell size, cell number and number of starch granules (Nicolas et al. 1984; Yang et al. 2009). No consistent measurements for the grain size characteristics in sorghum are available in the literature with individual grain weight being used as an indicator for grain size. The weight of the grain is determined by the rate and duration of grain filling (Tao et al. 2017; Nicolas et al. 1984). Therefore, understanding the genetic basis of the grain size in sorghum will provide useful genetic information about the domestication of sorghum and use of this trait in crop improvement.
Materials and methods
Plant material and DNA sequencing
A total of 15 accessions from seven species representing the five sorghum subgenera were used in this experiment (Table 1). Plants were grown at the Australian Grains Genebank, Horsham, Vic, Australia (36° 43′ 21.93764″ S and 142° 10′ 29.50331″ E) following the protocol described in Ananda et al. (2021). Total genomic DNA was extracted from pulverized leaf tissue samples of the 15 sorghum accessions using the Cetyltrimethyl ammonium bromide (CTAB) method optimized for sorghum (Furtado, 2014) and DNA samples were sequenced on an Illumina HiSeq 2000 platform at the Ramaciotti Centre, University of New South Wales, Australia. The data yield obtained post trimming was 20X-36X of the genome size (Ananda et al. 2021).
Statistical analysis
Grain weight (g), grain width (mm), grain length (mm) and grain thickness (mm) were measured for 10 grains per accession. Linear dimensions were measured with a ruler under a 10X magnification ring light and recorded to two decimal places. For S. leiocladum, S. matarankense and S. laxiflorum, 10 grains were weighed together as they were too small to register individual weights on the balance. For all other species, the 10 grains were weighed individually. Morphological measurements of the grains of the 15 accessions were analysed using One-way ANOVA in Minitab (Minitab, LLC, 2021. https://www.minitab.com) at the significance level α ≤ 0.05. Multiple means were compared using the Tukey pairwise comparison test in Minitab.
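The analysis itself was run in Minitab. As an illustration only, the F statistic behind a one-way ANOVA can be sketched in plain Python as below. The function name and the grain width values are hypothetical and are not data from this study; obtaining a p-value or Tukey groupings would additionally require the F and studentized range distributions from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of accessions (groups)
    n = len(all_values)      # total number of grains measured

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual grains around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical grain widths (mm) for three accessions, for illustration only
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_b, df_w = one_way_anova_f(example_groups)
```

A large F relative to the critical value for (df_between, df_within) at the chosen α indicates that at least one accession mean differs from the others.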
Variant analysis
A comparative variant analysis of the selected grain size regulating genes (Table 2) was conducted using the basic variant analysis tool in the CLC Genomics Workbench (CLC-GWB 11.0, http://www.clcbio.com). Raw sequencing reads were imported to CLC-GWB together with the annotated nuclear genome sequence of S. bicolor from NCBI (accession NC012870.2) as reference (Paterson et al. 2009). Raw reads were subjected to Quality Control (QC) analysis and trimmed to meet a quality score limit of 0.01 (with most calls at Phred score > 30). Variant analysis was undertaken sequentially as follows: trimmed reads were mapped against the reference genome of S. bicolor, followed by structural variant analysis using a p-value of 0.0001 as the threshold, and finally the reads were subjected to local realignment using the Indel track as the guidance variant track.
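To relate the trimming limit to the Phred scores quoted above, the standard Phred definition can be sketched as follows. This assumes that the 0.01 limit is expressed as a base-call error probability; the function names are illustrative and are not part of CLC-GWB.

```python
import math


def phred_to_error_probability(q):
    """Standard Phred definition: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Inverse of the Phred definition: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


# Under the stated assumption, a limit of 0.01 corresponds to Phred 20,
# and a Phred score of 30 corresponds to an error probability of 0.001.
limit_as_phred = error_probability_to_phred(0.01)
q30_error = phred_to_error_probability(30)
```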
Variant analysis was conducted using the locally realigned mapping file. Homozygous and heterozygous variants were identified by filtering on variant frequency: variants with a frequency equal to 100% were counted as homozygous, and variants with a frequency in the range of > 25%–< 75% were counted as heterozygous. The numbers of synonymous and nonsynonymous amino acid changes were also determined.
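The frequency-based filter described above can be sketched as a small Python function. The thresholds are those stated in the text; the function name, the "unclassified" label for variants outside both ranges, and the example frequencies are assumptions made for illustration.

```python
from collections import Counter


def classify_zygosity(frequency_percent):
    """Classify a variant by its frequency (in percent) among mapped reads."""
    if frequency_percent == 100:
        return "homozygous"
    if 25 < frequency_percent < 75:
        return "heterozygous"
    # Variants outside both ranges are not counted in either class
    return "unclassified"


# Hypothetical variant frequencies, for illustration only
example_frequencies = [100.0, 48.5, 30.0, 90.0, 25.0, 100.0]
counts = Counter(classify_zygosity(f) for f in example_frequencies)
```

Note that the boundaries are exclusive, so a variant at exactly 25% or 75% falls outside the heterozygous class.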
Phylogenetic analysis
Trimmed reads were mapped against the annotated S. bicolor reference genome. The consensus sequences were extracted for each species and converted into coding DNA sequence (CDS) and genome tracks. From the CDS and genome tracks, annotations for the selected grain size related genes (Table 2) were selected for each accession. These genes (exons only) were concatenated to give a final annotated sequence per accession. The concatenated sequences of all the accessions were aligned using the MAFFT alignment tool in Geneious 11.1.5 software (www.geneious.com) with default parameters. A neighbour joining tree was constructed with 1000 bootstrap replicates in Geneious software.
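The concatenation step, and the kind of pairwise distance that a neighbour joining method takes as input, can be sketched as follows. This is a minimal illustration: the function names and sequences are hypothetical, and the simple p-distance used here is an assumption and may differ from the genetic distance model applied in Geneious.

```python
def concatenate_genes(gene_exons, gene_order):
    """Join the exons of each gene, then join the genes in a fixed order."""
    return "".join("".join(gene_exons[gene]) for gene in gene_order)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in pairs)
    return differences / len(pairs)


# Hypothetical exon sequences for one accession, for illustration only
exons = {"geneA": ["ATG", "GCC"], "geneB": ["TTA"]}
concatenated = concatenate_genes(exons, ["geneA", "geneB"])

# Hypothetical aligned sequences from two accessions
distance = p_distance("ATGGCC-TA", "ATGACCTTA")
```

Using the same gene order for every accession keeps homologous sites in the same columns after alignment.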
Results
Morphological characteristics of the grains
Figure 1 shows the morphological variation of the sorghum grains representing the five subgenera, Eusorghum (S. bicolor), Chaetosorghum (S. macrospermum), Heterosorghum (S. laxiflorum), Parasorghum (S. matarankense, S. leiocladum, S. purpureosericeum) and Stiposorghum (S. brachypodum). The cultivated species, S. bicolor, had distinctly larger grains compared to the grains of the wild sorghum species. The two cultivated accessions used in this study had a spherical shaped grain with a creamy white colour. All the wild sorghum species had smaller and narrower grains with brown to dark brown colour. Among the wild species, S. macrospermum had the largest grain followed by S. purpureosericeum and S. brachypodum, while S. leiocladum had the smallest grain (Fig. 1).
Statistical analysis
Table 3 shows the morphological characteristics of the grains using grain weight, width, length, and thickness as defining parameters and analysed using One-way ANOVA test. According to the results, all four parameters were significantly different between the accessions (significance level of α ≤ 0.05) (Table 3).
The cultivated species S. bicolor had significantly higher values compared to the grains of the wild species for each of the characters. However, of the two S. bicolor accessions, accession 314,746 had significantly higher grain weight, width, length, and thickness compared to accession 112,151. Among the wild species, all the grain size parameters of the two S. macrospermum accessions, 302,367 and 326,072, were distinct from the majority of the other wild species. Grain weight was highest in S. bicolor 314,746 while it was lowest in the accessions of S. matarankense 326,065 and 326,066. In the pairwise comparison, the species S. purpureosericeum and S. brachypodum were not significantly different in grain weight from S. laxiflorum and S. leiocladum. Similarly, the highest grain width was observed in S. bicolor 314,746 whereas the lowest was observed in S. matarankense. In the pairwise comparison, the species S. purpureosericeum and S. brachypodum were grouped together, while S. laxiflorum, S. matarankense and S. leiocladum were grouped together. Grain length was highest in S. macrospermum 302,367 while S. leiocladum 326,062 had the lowest. Interestingly, grain length was not significantly different between S. bicolor 314,746 and S. brachypodum 326,073, and the same was true among S. laxiflorum, S. leiocladum, and S. matarankense. Likewise, the highest grain thickness was observed in S. bicolor 314,746, whereas the lowest was observed in S. matarankense 326,066. Moreover, the species S. laxiflorum, S. leiocladum, and S. matarankense had grain thickness values which were not significantly different (Table S1, Fig. 2).
Variant analysis of the coding regions of selected grain size related genes in the different Sorghum species
Based on the reference genome of S. bicolor BTX623, variant analysis within the coding sequence regions of the selected grain size related genes from different sorghum species was carried out using the basic variant analysis tool. The highest number of total variants, including single nucleotide polymorphisms (SNPs), insertions and deletions (Indels) and multi-nucleotide variants (MNVs), was found in S. purpureosericeum 326,075 while the lowest number of total variants was found in S. bicolor 112,151 (Table 4). The total number of SNPs was also highest in S. purpureosericeum 326,075 and lowest in S. bicolor 112,151. In the accessions of S. bicolor 314,746 and 112,151, S. macrospermum 302,367 and 326,072, S. laxiflorum 326,060 and 326,074 and S. leiocladum 326,061, the number of homozygous SNPs was higher than that of the heterozygous SNPs, whereas the opposite was observed for the remaining species (Table 4).
According to the basic variant analysis of the CDS regions of the selected genes, all the wild species had a similar number of variants per gene. In the sorghum reference genome, some of these selected genes have several transcript variants resulting in different protein sequences. For instance, Sobic.001G335800 has three transcript variants giving rise to three protein products (XP_021307644.1, XP_021307643.1, and XP_002467688.1), Sobic.002G257900 has two transcript variants giving rise to two proteins (XP_002460490.1 and XP_021308005.1), and Sobic.004G107300 has two transcript variants (XP_002453598.2 and XP_021315956.1). As expected, no variants were observed in the CDS regions for any of the selected genes of the two S. bicolor accessions compared to the reference S. bicolor, since they are all the BTx623 genotype (Table 5).
Sobic.001G335800 gene (qGW7/GL7)
Within the wild sorghums, the highest total number of variants within the Sobic.001G335800 gene was observed in all three transcript variants in the two S. macrospermum accessions followed by S. matarankense. The lowest number of variants was observed in the two accessions of S. leiocladum. Similar results were observed for the total number of SNPs found in the region. Compared to the number of homozygous SNPs, the number of heterozygous SNPs was higher in the species of S. macrospermum and S. laxiflorum while lower in the rest of the species. In all wild sorghum species, the number of nonsynonymous amino acid changes was higher than the number of synonymous changes (Table 5).
Sobic.001G341700 gene (GS3)
The highest total number of variants within Sobic.001G341700 was found in the two S. macrospermum accessions while the lowest was found in S. matarankense 326,066. Similar results were observed for the total number of SNPs. Compared to the number of homozygous SNPs, the number of heterozygous SNPs was higher in the species of S. macrospermum, S. laxiflorum and S. brachypodum 302,670. The number of nonsynonymous amino acid changes was higher than the synonymous amino acid changes only in the species S. macrospermum, S. laxiflorum, S. leiocladum 326,062, S. matarankense 326,065, and S. brachypodum 302,670 (Table 5).
Sobic.002G257900 gene (GW8)
In the CDS regions of the two transcript variants of Sobic.002G257900, the highest total number of variants was observed in the three S. purpureosericeum accessions followed by S. laxiflorum, while the lowest was observed in S. macrospermum. A parallel situation was observed for the total number of SNPs found in the region. Compared to homozygous SNPs, the number of heterozygous SNPs was higher in the species of S. macrospermum, S. laxiflorum, S. leiocladum 326,061, and S. purpureosericeum. The number of nonsynonymous amino acid changes was higher than synonymous amino acid changes in S. macrospermum 302,367, S. laxiflorum, S. purpureosericeum 326,068 and 326,075 (Table 5).
Sobic.003G035400 gene (GW5/qSW5)
The highest total number of variants in the CDS region of the two transcript variants of the Sobic.003G035400 gene was present in S. brachypodum 302,670 followed by S. laxiflorum 326,060, whereas S. matarankense had the lowest number. Similar results were observed for the total number of SNPs. Compared to the number of homozygous SNPs, the number of heterozygous SNPs was higher in the species of S. macrospermum, S. laxiflorum, S. leiocladum, and S. brachypodum. In all species, the number of nonsynonymous amino acid changes was higher than the number of synonymous amino acid changes (Table 5).
Sobic.004G107300 gene (GW2)
For the CDS regions of the two transcript variants of the Sobic.004G107300 gene, the accessions S. matarankense 326,066 and S. leiocladum carried the highest number and lowest number of total variants, respectively. A parallel situation applied for the total number of SNPs. Except for the species S. macrospermum and S. laxiflorum, the number of heterozygous SNPs was higher than the number of homozygous SNPs in all the species. For all the species, a higher number of synonymous compared to nonsynonymous amino acid changes was observed (Table 5).
Sobic.009G053600 gene (GS5)
The highest total number of variants was observed in S. leiocladum and S. macrospermum while the lowest was observed in S. matarankense 326,065. Similar results were observed for the total number of SNPs found in the region. In comparison to the number of homozygous SNPs, the number of heterozygous SNPs was higher in the species S. macrospermum, S. laxiflorum, S. leiocladum and S. purpureosericeum. The number of nonsynonymous amino acid changes was lower than the number of synonymous amino acid changes in all species (Table 5).
The variant analysis of the selected grain size related genes in some accessions identified SNP variants which resulted in premature stop codons. In both accessions of S. macrospermum and S. laxiflorum, SNP variants resulting in premature stop codons were observed in all the three transcript variants of Sobic.001G335800 (qGW7/GL7) gene, as the result of changes of a glycine at the positions 958, 1057 and 1057 into a stop codon. In the two accessions of S. macrospermum, an additional change was observed in the Sobic.001G341700 (GS3) gene resulting in the change of a glycine at position 241 into a stop codon. In Sobic.009G053600 (GS5), stop codons were observed in the species; S. macrospermum 326,072 (glycine28 > *), S. laxiflorum 326,060 (glycine28 > *) and 326,074 (glycine28 > *), S. leiocladum 326,061 (tyrosine483 > *) and 326,062 (glycine637 > *, tyrosine1236 > *), S. purpureosericeum 326,068 (tyrosine483 > *), 326,071 (tyrosine483 > *) and 326,075 (tyrosine483 > *). A unique amino acid change was observed in Sobic.001G341700 (GS3) in the S. brachypodum 302,670 accession, changing an arginine at the position 262 into a stop codon. A stop codon was observed in the Sobic.004G107300 (GW2) gene in all the three accessions of S. purpureosericeum at the position 466 (Table 6).
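How a single nucleotide change in a codon is classified as synonymous, nonsynonymous or stop-generating can be sketched with the standard genetic code. The function name and example codons below are hypothetical and are not the actual codons from Table 6; the sketch also reports stop-generating changes as a separate class, whereas the text above counts them among the nonsynonymous changes.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, position, alt_base):
    """Classify a SNP at a 0-based position within a reference codon."""
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"  # premature stop codon
    return "nonsynonymous"


# Hypothetical example: a tyrosine codon (TAC) changed to a stop codon (TAG)
tyrosine_to_stop = classify_snp("TAC", 2, "G")
```

Applied along a coding sequence, the first "stop_gained" change marks the position at which the predicted protein would be truncated.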
The consensus sequences of the coding regions of the selected grain size related genes were concatenated and then aligned to the reference S. bicolor genome derived sequences to construct a neighbour-joining tree. The topology of the tree was supported by high bootstrap values for all clades. In the neighbour-joining tree, two distinct clades were observed with Eusorghum, Chaetosorghum and Heterosorghum in one clade while Stiposorghum and Parasorghum clustered in a separate clade. All the accessions within subgenera were clustered together in the same clade (Fig. 3) which resembled the phylogenetic tree in Ananda et al. (2021).
Discussion
Grain size and weight are two of the key yield components in cereals. Grain size in sorghum varies across the genus (Dillon et al. 2007) but information on genes controlling grain size is scarce. In our current study, we demonstrate a clear difference in the shape, colour, size, and weight between cultivated and wild sorghum species from across the genus. Significant differences in grain size were also detected between the two different S. bicolor (Eusorghum) lines. It could be that these two accessions were derived from two different parent accessions (Ananda et al. 2021) or due to environmental effects when the seed lines were grown. The two monotypic subgenera Chaetosorghum and Heterosorghum are closely related (Ananda et al. 2021) (Fig. 3). Nevertheless, the size of the grains was significantly different.
Interestingly, significant differences were observed even within subgenera, with the grains of S. matarankense, S. purpureosericeum, and S. leiocladum, which belong to Parasorghum, having significantly different grain size parameters.
In this study, we identified the presence of variants in the CDS regions of a number of grain size related genes in wild sorghum species. The number of variants was lowest in the two S. bicolor accessions as the sequence comparisons were made using the sequence of S. bicolor (genotype BTx623) as reference. The wild sorghum species are distant to S. bicolor, and the mapping percentages therefore differed significantly and were low for some of the species. S. macrospermum and S. laxiflorum are closely related to S. bicolor and thus contained a higher percentage of trimmed reads mapping to the reference genome resulting in identification of a higher number of variants. This makes direct comparisons between species difficult. This imbalance can be resolved by using the same species as the reference sequence when the whole genome sequences of these wild species become available.
Within the same species, no significant differences were observed with the number of SNPs within a certain gene suggesting the accessions are indeed closely related. Furthermore, except for the species S. bicolor, S. macrospermum, and S. laxiflorum and S. leiocladum 326,061, all the other accessions had a higher number of homozygous SNPs than heterozygous SNPs. Although sorghum is considered as a self-pollinated crop, cross-pollination ranging from 5 to 15% has been reported (Poehlman 2013). Therefore, S. bicolor, S. macrospermum, and S. laxiflorum might have a higher cross-pollination rate than self-pollination.
In this study, six key grain size related genes were analysed for variants using the S. bicolor annotated genome as a reference. Some of the genes were annotated with more than one transcript variant as an indicator of alternative splicing and different protein products. In all the sorghum species, the same pattern in the number of variants was observed in all the transcript variants for a particular gene. The highest percentage of variants was observed in the Sobic.001G335800 (qGW7/GL7) gene (9% of the length) which was the longest transcript. Among the other genes, the Sobic.004G107300 (GW2) and Sobic.002G257900 (GW8) genes were more conserved within the genus as they had a comparatively lower percentage of variants (3% of the length).
In sorghum and other cereals, mutation studies have been reported to cause a loss of function of some grain size related genes (Song et al. 2007; Zou et al. 2020; Tao et al. 2017). In our study, for some of the genes, nonsynonymous amino acid changes which code for stop codons were observed in some accessions. In the GS3 gene in S. bicolor, a premature stop codon in the fifth exon was shown to result from a single C to A nucleotide change preventing expression of the gene and resulting in an increase in grain weight (Tao et al. 2020). In our analysis of the same gene, mutations causing G to A nucleotide changes resulted in conversion of codons for the amino acids glycine and arginine into stop codons in S. macrospermum and S. brachypodum, respectively. Our measurements of grain size show that S. macrospermum has the highest grain weight among the wild accessions followed by S. brachypodum. Therefore, the stop codons identified in the GS3 gene causing a loss of function in the Sobic.001G341700 (GS3) gene in these two sorghum species might contribute to the observed increased grain weight. In rice, loss of function mutation of the GW2 gene is known to cause increased grain weight and width and thus larger grains (Song et al. 2007). In this study, a SNP mutation was observed in both transcript variants of the same gene in the S. purpureosericeum accessions causing introduction of premature stop codons. The grain morphology of S. purpureosericeum was characterized by a comparatively higher grain width and weight among the Parasorghum species. This may be due to the loss of function of the Sobic.004G107300 (GW2) gene. Sequence changes which introduce premature stop codons were also observed in Sobic.001G335800 (qGW7/GL7) gene and may therefore affect grain length in S. macrospermum and S. laxiflorum. These two species belong to the subgenera Chaetosorghum and Heterosorghum, respectively, and are phylogenetically closely related (Ananda et al. 2021). 
Nevertheless, their grain size parameters are drastically different. Therefore, it is difficult to assess the effect of the stop codon in qGW7/GL7 (Sobic.001G335800) in these two species.
The GS5 gene controls grain width, weight and filling in rice (Li et al. 2011). In our analyses, stop codons were observed in GS5 (Sobic.009G053600) in S. laxiflorum, S. leiocladum and S. purpureosericeum. The grain weights of S. laxiflorum and S. leiocladum are low but, except for one accession (S. purpureosericeum 326,068), not significantly different. However, the grain width of both accessions of S. purpureosericeum was significantly greater than that of the other two species. This might be due to the reduction or loss of function of the GS5 (Sobic.009G053600) gene. To determine the exact effect of these premature stop codons on the function of these grain size related genes in sorghum, further experiments with more samples are required.
The phylogenetic tree shown here, based on the six selected grain size related genes, had a similar topology to the phylogenetic tree published in the study of Ananda et al. (2021). In that study, we suggested that the Sorghum genus could be divided into two main groups based on chloroplast and nuclear gene phylogeny, with Eusorghum, Chaetosorghum and Heterosorghum in one group and Parasorghum and Stiposorghum in the other. The results presented here support this view.
Our current study was targeted towards addressing some of the less studied features of the wild sorghums that will be important in efforts to use wild sorghum for re-wilding of elite sorghum cultivars, to overcome the “domestication syndrome” and gain plant vitality. However, the mapping percentages of the species were vastly different because the genome of the cultivated species S. bicolor was the only one able to be used as the reference. Thus, species more closely related to S. bicolor had a higher mapping percentage compared to other species, which affects the variant analysis by giving rise to a higher number of variants. This study could be extended by including more accessions covering all species of the genus and multiple populations representing the diversity within each species. The present work provides a preliminary guide to identify the key gene targets in the wild sorghum species to improve the grain quality of sorghum. Multiple accessions of several wild sorghum species are currently being sequenced and, in the future, this will provide a broad range of reference sequences for more accurate mapping. Further sequence analysis and experimental data will also allow more accurate determination of ploidy levels and more accurate basic variant analysis between and within species. The information about the diversity of grain size related genes of the wild accessions would be beneficial in future experiments.
Conclusions
The genus Sorghum has a wide variation in grain size related parameters, with the wild sorghum species having higher diversity. The six selected grain size related genes, Sobic.001G335800 (qGW7/GL7), Sobic.001G341700 (GS3), Sobic.002G257900 (GW8), Sobic.003G035400 (GW5/qSW5), Sobic.004G107300 (GW2), and Sobic.009G053600 (GS5), showed polymorphism in the coding sequence regions. Grain size related genes from wild sorghums have a higher degree of polymorphism compared to the cultivated sorghum species. Mutations which cause stop codons in the grain size related genes might lead to reduction or loss of function of the genes and may explain the variation in grain sizes observed. These results suggest that analysis of the genomes of wild sorghum species should allow the discovery of useful genes for the control of grain size in sorghum and other grasses.
Data availability
All data and materials used and described in this study are made available for non-commercial research purposes. The datasets generated during and/or analysed during the current study are available in the Sequence Read Archive (SRA) under the BioProject number PRJNA692754.
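For readers who wish to locate the deposited reads programmatically, the sketch below builds a query URL for the NCBI E-utilities `esearch` endpoint using the BioProject number given above. The function name and the default `retmax` value are illustrative assumptions. The sketch only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_search_url(bioproject, retmax=100):
    """Build an esearch URL listing SRA records that match a BioProject accession."""
    params = {
        "db": "sra",          # search the Sequence Read Archive
        "term": bioproject,   # accession used as the search term
        "retmax": retmax,     # maximum number of record IDs to return
        "retmode": "json",    # request a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(sra_search_url("PRJNA692754"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. The record identifiers it returns can then be used with other SRA tools to download the reads.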
References
Ananda GKS, Myrans H, Norton SL, Gleadow R, Furtado A, Henry RJ (2020) Wild sorghum as a promising resource for crop improvement. Front Plant Sci 11:1108. https://doi.org/10.3389/fpls.2020.01108
Ananda G, Norton S, Blomstedt C, Furtado A, Møller B, Gleadow R, Henry R (2021) Phylogenetic relationships in the Sorghum genus based on sequencing of the chloroplast and nuclear genes. Plant Genome 14(3):e20123. https://doi.org/10.1002/tpg2.20123
Botella JR (2012) Can heterotrimeric G proteins help to feed the world? Trends Plant Sci 17:563–568. https://doi.org/10.1016/j.tplants.2012.06.002
Brozynska M, Furtado A, Henry RJ (2016) Genomics of crop wild relatives: expanding the gene pool for crop improvement. Plant Biotechnol J 14:1070–1085. https://doi.org/10.1111/pbi.12454
Cowan M, Blomstedt CK, Norton S, Møller BL, Henry R, Gleadow R (2020) Crop wild relatives as a genetic resource for generating low-cyanide, drought-tolerant Sorghum. Environ Exp Bot 169:103884. https://doi.org/10.1016/j.envexpbot.2019.103884
Cowan M, Møller BL, Knudsen C, Furtado A, Henry RJ, Blomstedt CK, Gleadow RM (2022) Cyanogenesis in the Sorghum genus: from genotype to phenotype. Genes 13(1):140. https://doi.org/10.3390/genes13010140
Dillon SL, Shapter FM, Henry RJ, Cordeiro G, Izquierdo L, Lee LS (2007) Domestication to crop improvement: genetic resources for Sorghum and Saccharum (Andropogoneae). Ann Bot 100:975–989. https://doi.org/10.1093/aob/mcm192
Doebley JF, Gaut BS, Smith BD (2006) The molecular genetics of crop domestication. Cell 127:1309–1321. https://doi.org/10.1016/j.cell.2006.12.006
Doggett H (1988) Sorghum. Longman Scientific & Technical, Wiley, New York, Harlow, Essex, England
Fan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112:1164–1171. https://doi.org/10.1007/s00122-006-0218-1
Harlan JR (1992) Crops & man. American society of agronomy. Crop Sci Soc Am Madison Wis 16:63–262
House LR (1985) A guide to sorghum breeding. International Crops research Institute for the Semi-arid Tropics Patancheru P.O., Andhra Pradesh
Lazarides M, Hacker JB, Andrew MH (1991) Taxonomy, cytology and ecology of indigenous Australian sorghums (Sorghum Moench: Andropogoneae: Poaceae). Aust Syst Bot 4:591–635. https://doi.org/10.1071/SB9910591
Lee WJ, Pedersen JF, Shelton DR (2002) Relationship of sorghum kernel size to physiochemical, milling, pasting, and cooking properties. Food Res Int 35:643–649. https://doi.org/10.1016/S0963-9969(01)00167-3
Li Y, Fan C, Xing Y, Jiang Y, Luo L, Sun L, Shao D, Xu C, Li X, Xiao J, He Y, Zhang Q (2011) Natural variation in GS5 plays an important role in regulating grain size and yield in rice. Nat Genet 43:1266–1269. https://doi.org/10.1038/ng.977
Liu J, Chen J, Zheng X, Wu F, Lin Q, Heng Y, Tian P, Cheng Z, Yu X, Zhou K, Zhang X, Guo X, Wang J, Wang H, Wan J (2017) GW5 acts in the Brassinosteroid signalling pathway to regulate grain width and weight in rice. Nat Plants 3:17043. https://doi.org/10.1038/nplants.2017.43
Mace ES, Tai S, Gilding EK, Li Y, Prentis PJ, Bian L, Campbell BC, Hu W, Innes DJ, Han X, Cruickshank A, Dai C, Frère C, Zhang H, Hunt CH, Wang X, Shatte T, Wang M, Su Z, Li J, Lin X, Godwin ID, Jordan DR, Wang J (2013) Whole-genome sequencing reveals untapped genetic potential in Africa’s indigenous cereal crop sorghum. Nat Commun. https://doi.org/10.1038/ncomms3320
Manga VK, Yadav OP (1995) Effect of seed size on developmental traits and ability to tolerate drought in pearl millet. J Arid Environ 29:169–172. https://doi.org/10.1016/S0140-1963(05)80087-4
Mann JA, Kimber CT, Miller FR (1983) The origin and early cultivation of Sorghums in Africa. In. Texas Farmer Collection
Mutegi E, Sagnard F, Labuschagne M, Herselman L, Semagn K, Deu M, De Villiers S, Kanyenji BM, Mwongera CN, Traore PCS (2012) Local scale patterns of gene flow and genetic diversity in a crop–wild–weedy complex of sorghum (Sorghum bicolor (L.) Moench) under traditional agricultural field conditions in Kenya. Conserv Genet 13:1059–1071. https://doi.org/10.1007/s10592-012-0353-y
Myrans H, Diaz MV, Khoury CK, Carver D, Henry RJ, Gleadow R (2020) Modelled distributions and conservation priorities of wild sorghums (Sorghum Moench). Divers Distrib 26:1727–1740. https://doi.org/10.1111/ddi.13166
Myrans H, Vandegeer R, Henry R, Gleadow RM (2021) Nitrogen availability and allocation in sorghum and its wild relatives: divergent roles for cyanogenic glucosides. J Plant Physiol 258–259:e153393. https://doi.org/10.1016/j.jplph.2021.153393
Nicolas ME, Gleadow RM, Dalling MJ (1984) Effects of drought and high temperature on grain growth in wheat. Funct Plant Biol 11:553–566. https://doi.org/10.1071/PP9840553
Nicolas ME, Gleadow RM, Dalling MJ (1985) Effect of post-anthesis drought on cell division and starch accumulation in developing wheat grains. Ann Bot 55:433–444
Paterson AH et al (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457:551–556. https://doi.org/10.1038/nature07723
Plucknett DL, Smith NJH (2014) Gene banks and the world’s food. Princeton University Press
Poehlman JM (2013) Breeding field crops. Springer Science & Business Media, Heidelberg
Price HJ, Dillon SL, Hodnett G, Rooney WL, Ross L, Johnston JS (2005) Genome evolution in the genus Sorghum (Poaceae). Ann Bot 95:219–227. https://doi.org/10.1093/aob/mci015
Shapter FM, Henry RJ, Lee LS (2008) Endosperm and starch granule morphology in wild cereal relatives. Plant Genet Res 6:85–97. https://doi.org/10.1017/S1479262108986512
Song X-J, Huang W, Shi M, Zhu M-Z, Lin H-X (2007) A QTL for rice grain width and weight encodes a previously unknown RING-type E3 ubiquitin ligase. Nat Genet 39:623–630. https://doi.org/10.1038/ng2014
Stalker HT (1980) Utilization of wild species for crop improvement. Advances in Agronomy Academic Press Inc, New York, pp 111–147
Takano-Kai N, Jiang H, Kubo T, Sweeney M, Matsumoto T, Kanamori H, Padhukasahasram B, Bustamante C, Yoshimura A, Doi K (2009) Evolutionary history of GS3, a gene conferring grain length in rice. Genetics 182:1323–1334. https://doi.org/10.1534/genetics.109.103002
Tao Y, Mace ES, Tai S, Cruickshank A, Campbell BC, Zhao X, Van Oosterom EJ, Godwin ID, Botella JR, Jordan DR (2017) Whole-genome analysis of candidate genes associated with seed size and weight in Sorghum bicolor reveals signatures of artificial selection and insights into parallel domestication in cereal crops. Front Plant Sci 8:1237. https://doi.org/10.3389/fpls.2017.01237
Tao Y, Zhao X, Wang X, Hathorn A, Hunt C, Cruickshank AW, Van Oosterom EJ, Godwin ID, Mace ES, Jordan DR (2020) Large-scale GWAS in sorghum reveals common genetic control of grain size among cereals. Plant Biotechnol J 18:1093–1105. https://doi.org/10.1111/pbi.13284
Venkateswaran K, Elangovan M, Sivaraj N (2019) Origin, domestication and diffusion of Sorghum bicolor. Breeding Sorghum for diverse end uses Elsevier, London
Wang E, Wang J, Zhu X, Hao W, Wang L, Li Q, Zhang L, He W, Lu B, Lin H (2008) Control of rice grain-filling and yield by a gene with a potential signature of domestication. Nat Genet 40:1370–1374. https://doi.org/10.1038/ng.220
Wang S, Wu K, Yuan Q, Liu X, Liu Z, Lin X, Zeng R, Zhu H, Dong G, Qian Q, Zhang G, Fu X (2012) Control of grain size, shape and quality by OsSPL16 in rice. Nat Genet 44:950–954. https://doi.org/10.1038/ng.2327
Wang Y, Tan L, Fu Y, Zhu Z, Liu F, Sun C, Cai H (2015a) Molecular evolution of the Sorghum maturity gene Ma3. PLoS One 10:e0124435. https://doi.org/10.1371/journal.pone.0124435
Wang Y, Xiong G, Hu J, Jiang L, Yu H, Xu J, Fang Y, Zeng L, Xu E, Xu J, Ye W, Meng X, Liu R, Chen H, Jing Y, Wang Y, Zhu X, Li J, Qian Q (2015b) Copy number variation at the GL7 locus contributes to grain size diversity in rice. Nat Genet 47:944–948. https://doi.org/10.1038/ng.3346
Weng J, Gu S, Wan X, Gao H, Guo T, Su N, Lei C, Zhang X, Cheng Z, Guo X, Wang J, Jiang L, Zhai H, Wan J (2008) Isolation and initial characterization of GW5, a major QTL associated with rice grain width and weight. Cell Res 18:1199–1209. https://doi.org/10.1038/cr.2008.307
Yang Z, Van Oosterom EJ, Jordan DR, Hammer GL (2009) Pre-anthesis ovary development determines genotypic differences in potential kernel weight in sorghum. J Exp Bot 60:1399–1408. https://doi.org/10.1093/jxb/erp019
Zou G, Zhai G, Yan S, Li S, Zhou L, Ding Y, Liu H, Zhang Z, Zou J, Zhang L (2020) Sorghum qTGW1a encodes a G-protein subunit and acts as a negative regulator of grain size. J Exp Bot 71:5389–5401. https://doi.org/10.1093/jxb/eraa277
Acknowledgements
The authors acknowledge the University of Queensland Research Computing Centre (UQ-RCC) for providing the computing resources, and Dr. Cecilia Blomstedt, School of Biological Sciences, Monash University, for providing comments on the manuscript.
Funding
This research was funded by a grant from the Australian Research Council (Discovery Project, Grant ID DP180101011); by the VILLUM Center for Plant Plasticity (VKR023054) (BLM); by the Novo Nordisk Foundation Distinguished Investigator 2019 programme (NNF 0054563, “The Black Holes in the Plant Universe”) (BLM); and by the Carlsberg Foundation Semper Ardens grant (20–0352, Crops for the future-Tackling the challenges of changing climates).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by GKSA, SN and EB. The first draft of the manuscript was written by GKS Ananda and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interests
The authors have no relevant financial or non-financial interests to disclose.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ananda, G.K.S., Norton, S.L., Barnes, E. et al. Variant analysis of grain size related genes in the genus Sorghum. Genet Resour Crop Evol 70, 1377–1394 (2023). https://doi.org/10.1007/s10722-022-01508-1
DOI: https://doi.org/10.1007/s10722-022-01508-1