
Molecular genetic diversity and linkage disequilibrium structure of the Egyptian faba bean using Single Primer Enrichment Technology (SPET) | BMC Genomics


Owing to its high protein content, faba bean has long been a pillar of the Egyptian diet. Although it is considered the foremost leguminous food crop in Egypt, very few cultivars (~ 10–15) are widely grown there. This small number of highly adapted genotypes may be vulnerable to extinction as a consequence of climate change. Therefore, expanding the genetic diversity of faba bean, especially among genotypes well adapted to Egyptian conditions, deserves high priority. To this end, faba bean genotypes originally collected from Egypt were gathered from different research institutes and universities to investigate their level of genetic diversity. A set of 102 such genotypes was found at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK, Germany), nine at IFAPA (Spain), and two at ICARDA (Syria). All these genotypes, together with 11 widely grown Egyptian faba bean genotypes, were used in this study to dissect the genetic diversity of an Egyptian faba bean panel.

Understanding the genetic diversity of a target crop panel is an essential step for successful breeding programs that aim to improve important traits such as grain yield and tolerance to various biotic and abiotic stresses. For decades, the complexity of the faba bean genome hindered the dissection of genetic diversity in this important legume crop. Consequently, compared with other crops, few studies have explored genetic diversity in faba bean, and those relied on a limited number of DNA markers [14, 33, 34, 35, 36]. Transcriptome analysis has since led to the development of a high-density faba bean genotyping array, SPET genotyping, which includes ~ 60 K probes [37]. The SPET method was previously used to genotype 2,678 faba bean genotypes and generated 21,345 SNP markers [5]. In this study, SPET genotyping yielded a set of 6,759 SNPs, which was used to dissect the genetic diversity of the Egyptian panel.

The 6,759 SNPs were distributed across all faba bean chromosomes, indicating the usefulness of these markers for detecting as many marker-trait associations as possible in this population. The PIC values of these SNPs ranged from 0.04 to 0.375, a range similar to those reported by [19] (0.09–0.375) and [11] (0.05–0.38). Unlike multi-allelic markers (e.g. SSR), bi-allelic markers (e.g. SNPs) have a maximum PIC of 0.375 and a maximum gene diversity of 0.5. Therefore, SNP markers are classified as moderately to weakly informative. However, recent sequencing methods generate large numbers of SNP markers that can be used to study the genetic diversity among genotypes in depth [37]. PIC values have been reported to indicate the informativeness of SNP markers, which can be used to investigate genetic diversity in different organisms [38, 39]. The range of gene diversity of the markers in this study was also similar to the ranges reported in earlier studies that used different population sizes and different numbers of SNPs. Moreover, the highly informative markers were distributed across all faba bean chromosomes, making the detection of target SNPs for important traits feasible. The distribution of informative markers reflects the overall diversity in crop populations [39, 40]. PIC values and gene diversity are very useful for assessing polymorphism among genotypes in breeding and genetic diversity programs [39, 41]. Based on the distribution of PIC and gene diversity values in the tested faba bean population, we conclude that these markers captured the genetic diversity of the Egyptian faba bean population and could be used in other genetic studies, such as genome-wide association studies, to identify alleles controlling target traits.
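As an illustration of how these two indices behave for a bi-allelic SNP, the sketch below computes gene diversity (expected heterozygosity) and PIC from a single allele frequency. It assumes the standard Botstein-type PIC formula; the function names are hypothetical and the source does not state which software computed the reported values.

```python
def gene_diversity(p):
    """Expected heterozygosity of a bi-allelic locus with allele frequency p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def pic(p):
    """PIC of a bi-allelic locus (Botstein-type formula, assumed here)."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * (p * p) * (q * q)


if __name__ == "__main__":
    # Illustrative allele frequencies only, not values from the study.
    for p in (0.02, 0.1, 0.3, 0.5):
        print(f"p={p:.2f}  gene diversity={gene_diversity(p):.3f}  PIC={pic(p):.3f}")
```

Under this formula both indices peak when the two alleles are equally frequent (p = 0.5), where gene diversity is 0.5 and PIC is 0.375, which matches the upper end of the PIC range observed in the panel.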

The majority of SNPs had a minor allele frequency of ~ 0.1 and were distributed across the genome. The distribution of allele frequencies sheds light on the population genetic architecture of complex traits [42]. SNPs with lower allele frequencies are expected to help in discovering new loci and to allow further investigation of different population genetic models, improving the understanding of differences and similarities among genotypes [42].
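For readers unfamiliar with the statistic, a minimal sketch of how a minor allele frequency can be computed for one SNP is given below. It assumes diploid genotype calls coded as the count of the alternate allele (0, 1, or 2) with None for missing calls; this coding and the function name are illustrative assumptions, not the pipeline used in the study.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one SNP.

    genotypes: iterable of 0/1/2 alternate-allele counts, None for missing
    (an assumed coding for this sketch).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


if __name__ == "__main__":
    # Toy data: ten individuals, two heterozygotes, one missing call.
    toy_snp = [0, 0, 0, 1, 0, 0, None, 0, 0, 1, 0]
    print(minor_allele_frequency(toy_snp))
```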

The genetic properties of the SNP markers used in this study of the Egyptian faba bean population indicate the efficiency of the SPET method in generating informative SNPs that can be used in genetic studies of faba bean.

Population structure and relationship

The analysis of population structure (PS) provides important information on the level of genetic diversity among the tested genotypes, and it is one of the basic analyses that should be conducted before genome-wide mapping studies. Among the clustering methods, the STRUCTURE software used in this study is the most recommended for exploring possible subgroups in a tested population [43]. The STRUCTURE analysis divided the Egyptian faba bean population into five possible subgroups (SP1, SP2, SP3, SP4, and SP5), and the PCA agreed with the STRUCTURE results. The presence of five subgroups was surprising, as all genotypes originated from Egypt. The cross-pollinating nature of faba bean and the effect of the environment may explain the structure found in the current population. Most of the genotypes (102) were obtained from IPK (Germany), where this material had been grown for many years in open-pollinated fields. In addition, a set of 11 genotypes was bred at IFAPA (Spain), INRAe (Hungary), and ICARDA (Syria). It is therefore expected that new genes were introduced into these materials by pollinators, that is, by gene flow following the migration (seed exchange) of these genotypes away from their origin (Egypt). The IPK genotypes were distributed across four subpopulations (SP1, SP2, SP3, and SP4). The EURI group, bred in Egypt, shared SP2 and SP3 with the IPK group. The PCA showed that the EURI and IPK groups were very close to each other. The Nei genetic distance of 0.060 further supported the closeness of these two groups, whereas it was 0.135 between CRI and IPK and 0.164 between CRI and EURI (Supplementary Table 4). Remarkably, the CRI group (ICARDA, IFAPA, and INRAe) was clearly separated from the IPK and EURI groups, forming a distinct subgroup (SP5). 
The distinct separation between CRI and the other two groups may be because the genotypes collected from CRI were exposed to strong gene flow before these lines were bred by single seed descent. The Nm (gene flow) value was less than 1 between EURI and IPK (0.89) and between EURI and CRI (0.46) (Supplementary Table 2). An Nm < 1 indicates that populations have different genetic structures, which may result from evolutionary change through adaptation to local environments via natural selection or through genetic drift [44]. It should be borne in mind that the IPK and CRI groups were bred and collected in European countries (Germany, Spain, and Hungary). The Nm between CRI and IPK was 1.8, indicating considerable variation in gene frequencies between these two populations [45]. Therefore, gene flow could be one of the main reasons for the clear separation of CRI from the other two groups (IPK and EURI). The IPK genotypes were distributed in the two subpopulations. Among the 15 EURI genotypes, 14 were assigned to subpopulation 2 and only one was assigned to subpopulation 1. When the structure analysis was performed on the 128 genotypes (Fig. 3e), 14 EURI genotypes were assigned to SP2, whose other members all came from the IPK group, and only one was assigned to SP3, whose other members also came from the IPK group. Another reason for the clear separation of the CRI could be that these 11 genotypes shared a faba bean ancestor that, after migration, was grown in open fields and then collected by the research institutes, which bred it by single seed descent to obtain highly homozygous breeding lines. Interestingly, the CRI (SP5) genotypes were previously included among 2,678 faba bean genotypes in a genetic diversity study by [5], which assigned the 11 genotypes to the same subpopulation (SP2), confirming the results of the structure analysis performed in this study. 
Although STRUCTURE revealed the possible genetic subpopulations, the analysis of genetic features based on the breeding institute (IPK, EURI, and CRI) was very useful in unlocking the genetic diversity and population structure among these genotypes.
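To make the Nm threshold discussed above more concrete, the following sketch converts between Nm and the fixation index Fst using Wright's island-model approximation, Nm = (1 - Fst) / (4 Fst). This relation is an assumption made for illustration; the source does not state which estimator produced the reported Nm values, and the function names are hypothetical.

```python
def nm_from_fst(fst):
    """Gene flow (Nm) from Fst under Wright's island model (assumed here)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must be in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


def fst_from_nm(nm):
    """Inverse of nm_from_fst: the Fst implied by a given Nm."""
    if nm < 0.0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


if __name__ == "__main__":
    # Nm values quoted in the text; the implied Fst holds only under the
    # island-model assumption stated above.
    for pair, nm in (("EURI-IPK", 0.89), ("EURI-CRI", 0.46), ("CRI-IPK", 1.8)):
        print(f"{pair}: Nm={nm}  implied Fst={fst_from_nm(nm):.3f}")
```

Under this approximation, Nm = 1 corresponds to Fst = 0.2, so pairs with Nm < 1 are those whose implied differentiation exceeds that value.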

The AMOVA revealed a high percentage of variation among subpopulations (12%) and among groups (19%), higher than that reported by Zhang et al. [19], who found only 1% among subpopulations in a set of 410 global faba bean accessions. This indicates the presence of high diversity among Egyptian faba bean genotypes.

Genetic distance among the genotypes

The genetic distance found among genotypes further indicates the existence of considerable genetic diversity in the panel. The heatmap of genetic distance revealed high similarity among the 11 genotypes and greater genetic distance between them and all other genotypes. Therefore, crossing among genotypes will be fruitful for improving target traits such as seed yield and tolerance to biotic and abiotic stresses. The genotype EUC_VF_194 had a high genetic distance from all other genotypes; therefore, this genotype may be very useful in future breeding programs. Crossing highly divergent genotypes can produce cultivars with high-yield traits [41, 46]. Fortunately, the IPK group was tested for yield traits over two growing seasons under Egyptian conditions and is highly adapted to them (unpublished data, Ahmed Sallam, personal communication). Moreover, the same population was phenotyped under severe drought stress in 2022/2023 [47]. Highly significant genetic variation was found among genotypes in plant height (32.7–86.95 cm), stay green (1.5–9), days to flowering (31.18–62 days), and days to maturity (124.5–146.22 days). Therefore, the genetic improvement of yield traits under various biotic and abiotic stresses is feasible in this population. As previously mentioned, only ~ 15 faba bean genotypes are widely grown in Egypt; this very low number could threaten Egypt's food security for this important legume. Therefore, integrating the IPK and CRI genotypes into Egyptian breeding will not only improve the production and productivity of faba bean in Egypt but will also expand the genetic diversity of this important crop to face the serious challenges of climate change.

The allelic pattern among the subpopulations

The allelic pattern among the subpopulations indicated that SP5 (CRI) had the highest values of all the indices. The SP5 subpopulation could therefore provide a good source of genetic diversity in faba bean. As previously discussed, including genotypes from SP5 (CRI) will be very useful in future faba bean breeding programs after evaluation for the target traits. More importantly, private alleles, whether among subpopulations or among genotypes, shed light on remarkable differentiation at the loci. Estimating private alleles provides important information on alleles present in only one subpopulation. In this regard, and as expected, SP5 (CRI) had the highest number of private alleles among the subpopulations. Again, the EUC_VF_194 genotype had the highest number of private alleles (PA < 140), distinguishing it from the other genotypes in the population. Private alleles highlight unique genetic variability at certain loci and help identify highly diverse genotypes that could be used in crop breeding programs as candidate parents to maximize allelic richness in the population [48]. The EURI group had the lowest percentage of private alleles compared with CRI and IPK. Therefore, it is probable that the genotypes in CRI and IPK acquired new loci and genes, before and during breeding research, that did not exist in the EURI group.
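The notion of a private allele can be sketched in a few lines: an allele is private to a subpopulation if it is observed there and in no other subpopulation. The example below assumes each subpopulation is summarized as a set of (locus, allele) pairs; the data structure, function name, and toy SNP names are hypothetical and are not taken from the study.

```python
def private_alleles(alleles_by_pop):
    """Return, per subpopulation, the alleles found in no other subpopulation.

    alleles_by_pop: dict mapping subpopulation name -> set of (locus, allele)
    tuples observed in that subpopulation (an assumed layout for this sketch).
    """
    result = {}
    for pop, alleles in alleles_by_pop.items():
        # Pool every allele observed outside the current subpopulation.
        others = set().union(
            *(a for name, a in alleles_by_pop.items() if name != pop)
        )
        result[pop] = alleles - others
    return result


if __name__ == "__main__":
    # Toy data only; SNP names and alleles are invented for illustration.
    toy = {
        "SP1": {("snp1", "A"), ("snp2", "G")},
        "SP2": {("snp1", "A"), ("snp2", "G"), ("snp3", "T")},
        "SP5": {("snp1", "A"), ("snp1", "C"), ("snp2", "T"), ("snp3", "T")},
    }
    for pop, priv in private_alleles(toy).items():
        print(pop, len(priv), sorted(priv))
```

Counting the entries per subpopulation gives the number of private alleles, which is the quantity compared among SP1 to SP5 in the text.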

The structure and extent of LD in the genome of the Egyptian faba bean population

Understanding the magnitude of LD and its decay is essential for obtaining high mapping resolution in genome-wide association analyses because they indicate the number of SNP markers required for such analyses [49]. At the genomic level, the extent of LD differs by species [49]. The characteristics of haplotype blocks in the faba bean genome were studied in the current population and varied across chromosomes. Haplotype blocks shed light on genomic regions of interest that may include candidate genes and allow different genotype groups to be compared at specific loci [50]. Interestingly, haplotype blocks were distributed along each chromosome, indicating that candidate genes for target traits can be identified in this population.

Most of the marker pairs had an r² between 0 and 0.1, indicating very low LD in the genome of the Egyptian faba bean. Interestingly, in the whole genome and on each chromosome, genomic regions with high LD between marker pairs were observed at large distances (Fig. 7) when the r² between markers was plotted against the physical distance. These high-LD regions, located among low-LD (non-significant) regions, can be considered LD hotspots. Therefore, it is essential to identify the structure of LD in the faba bean population and the distribution of LD hotspot regions across the genome. In the current population, LD decayed below an r² of 0.1 at 57.417 Mbp, a shorter physical distance than that reported by Skovbjerg et al. [5] in seven-parent MAGIC (LD decay = ~ 68 Mbp) and four-way cross (LD decay = ~ 77 Mbp) faba bean populations. LD decay in some other faba bean populations was found to be faster, e.g. at 681,730 bp in EUCLEG, 678,648 bp in NORFAB, and 672,877 bp in ProFaba [5]. These differences in LD decay can be attributed to the number of recombination events in each faba bean population. Therefore, it is very important to measure LD decay in a faba bean population, especially for genetic association studies. Genetic diversity has a definite relationship with linkage disequilibrium (LD) decay, which in turn affects diversity and LD-based association mapping [51].
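As a rough illustration of the quantities discussed here, the sketch below computes r² for a pair of bi-allelic loci from phased haplotypes and then locates the first distance bin whose mean r² falls below a threshold. Both the phased 0/1 haplotype input and the simple binning rule are assumptions made for this example; the source does not describe how its decay point was fitted, and the function names are hypothetical.

```python
def r_squared(haplotypes):
    """r2 between two bi-allelic loci.

    haplotypes: list of (a, b) tuples, each 0 or 1, one per phased haplotype
    (an assumed input format for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / denom


def decay_distance(pairs, bin_size, threshold=0.1):
    """Start of the first distance bin whose mean r2 is below threshold.

    pairs: iterable of (distance, r2). Returns None if no bin qualifies.
    """
    bins = {}
    for dist, r2 in pairs:
        bins.setdefault(int(dist // bin_size), []).append(r2)
    for idx in sorted(bins):
        if sum(bins[idx]) / len(bins[idx]) < threshold:
            return idx * bin_size
    return None


if __name__ == "__main__":
    # Toy inputs only, not data from the study.
    print(r_squared([(0, 0), (1, 1), (0, 0), (1, 1)]))
    print(decay_distance([(5, 0.8), (15, 0.4), (25, 0.05), (35, 0.02)], 10))
```

A binned mean is only one way to summarize decay; fitting a smooth curve to r² against distance is a common alternative and may give a different drop point, particularly when high-LD pairs occur at large distances.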

It is worth investigating LD decay on each chromosome to better understand the extent of LD in the Egyptian faba bean population. Compared with the LD decay across the genome (Fig. 6c), LD decayed very fast on Chr. 1, at 42.248 Mbp, whereas it decayed at a much larger physical distance of 91.709 Mbp on Chr. 4. This can be interpreted in light of the number of significant LD blocks, which was highest on Chr. 1 and lowest on Chr. 4. Selection, genetic drift, migration, and the nature of pollination (partially allogamous) could be strong reasons for this rapid decay and low LD in the Egyptian faba bean.

In conclusion, the detailed genetic diversity and population structure analyses performed on the Egyptian faba bean panel are promising for the genetic improvement of the faba bean crop under Egyptian conditions. Selecting promising genotypes from this panel as candidate parents is feasible, as considerable genetic distance was found among genotypes. The LD structure analysis performed on this panel will be helpful in genetic association studies to identify candidate genes associated with target traits in faba bean (e.g. grain yield, protein content, and tolerance to various biotic and abiotic stresses).


