Haas, B. J. et al. 2016;41(4):324-37. Nat. 8, 329-337.e4 (2019). Overall, we suggest that the large-genome grasshopper with a low piRNA abundance is more susceptible to TE invasion. The high-quality genome sequence for Cycas, the last major lineage of seed plants for which a high-quality genome assembly was lacking, closes an important gap in our understanding of genome structure and evolution in seed plants. Cladistics 5, 6 (1989). and Z.Z. Genome-wide analysis of auxin transport genes identifies the hormone responsive patterns associated with leafy head formation in Chinese cabbage. Nucleic Acids Res. S2a). Pendleton M, Sebra R, Pang AW, Ummat A, Franzen O, Rausch T, et al. (c-e) The information of homologous cell types among human, mouse, zebrafish, Ciona, Drosophila, earthworm, C. elegans, and planarian, including the number of homologous cell-type pairs (c), aligned score (d), and number of enriched gene pairs (e). The asterisk indicates a significant difference (two-sided Student's t-test, P<0.05, n=3 biologically independent experiments), whereas the error bar represents the standard error. Evaluation of methods for modeling transcription factor sequence specificity. The ancestral grass karyotype (AGK) was previously reconstructed as a post-ρ AGK with 12 protochromosomes and a pre-ρ AGK with seven protochromosomes by comparing extant species20. Genome Res. Based on the representative genome sequences, we obtained a comprehensive and non-redundant SV set. Proc. The authors declare that they have no competing interests. Samples of L. migratoria under experimental rearing conditions and A. rhodopa were collected at the Tibetan Autonomous Prefecture of Haibei, Qinghai, China (36°52′47.45″N, 100°52′35.1″E) in August 2020. Lieberman-Aiden, E. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Kidwell MG, Novy JB. Nat Plants. miRNAs made up 41.46% of the A. rhodopa small RNAs. 
A new development: evolving concepts in leaf ontogeny. ), the European Bioinformatics Institute (BP2012OO2J17 to R.M. The largest repeat arrays were identified and clustered as centromeres. 5a,c; for other K-values, see Supplementary Fig. 4c, Extended Data Fig. Genome evolution across 1,011 Saccharomyces cerevisiae isolates. (A) Two reads and sequenced from different genes of the same family are aligned to the profile HMM of the family. Genome Biol. The scatter dots and regression lines were plotted (Extended Data Fig. Edger PP, Smith R, McKain MR, Cooley AM, Vallejo-Marin M, Yuan YW, et al. Repetitive elements in the era of biodiversity genomics: insights from 600+ insect genomes. A translocation from chromosome 1C to 1A (>100 Mb) and an inversion in 3C (>130 Mb) subsequently occurred in the hexaploid Sanfensan genome; six of these structural rearrangements were further confirmed by FISH (fluorescence in situ hybridization) assays (Fig. Plant Physiol. The dominance of the LF subgenome during intraspecific diversification in B. rapa. Calonje, M., Stevenson, D. W. & Osborne, R. The World List of Cycads http://www.cycadlist.org (2013-2021). Front. Natl. This work was supported by National Natural Science Foundation of China grants 31930028 to G.G., 31922049 to X.H., 91842301 to G.G., 32000461 to J.W. BMC Plant Biol. Specifically, RNA-Seq facilitates the ability to look at alternative gene spliced transcripts, post Alignment to sorghum showed chromosome fissions in ancestral homologs of sorghum chromosomes 5 and 8, paleo-duplicated chromosome pairs A5 and A11 in grasses (Fig. Variance component model to account for sample structure in genome-wide association studies. 2007;10(1):13-20. 2020;9:775. 6d), The long-range LD observed in oat is similar to that in other self-fertilizing species such as wheat38,39 and barley40,41. We resequenced 64 diverse S. 
spontaneum accessions from the world germplasm collection, identifying 4.48 million high-confidence variants that included 3,961,408 SNPs, 201,854 insertions and 291,346 deletions, averaging 1.52 variants per kb. 2020;6(8):929-41. The Norway spruce genome sequence and conifer genome evolution. Mol Biol Evol. USA. Leebens-Mack, J. H. et al. Insertions larger than 50 bp were identified with Assemblytics79, a Web-based SV analytics tool, and further inserted into the reference genome. HaplotypeCaller output 42,585,337 unfiltered variants (SNPs and indels). About 1,010 Gb (~100×) of Nanopore long-read data were used for genome assembly with NextDenovo (https://github.com/Nextomics/NextDenovo) with default parameters (read_cutoff=1k, seed_cutoff=12k, minimap2_options_cns=-x ava-ont -k17 -w17). 37, 592-600 (2019). Natl Acad. 4a). PLoS Gen. 2018;14(3):e1007267. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. The regions of Pan-Malaysia might be the ancient hybrid zones among three groups. To understand whether the identified R-genes were correlated with the map positions of the known quantitative trait loci for crown rust, one of the most serious diseases of oats, DNA markers co-segregating with or flanking the known crown rust genes (Supplementary Table 22) were mapped to the hexaploid Sanfensan reference genome by BLASTn analyses. Fig. Our work provides a general framework for designing regulatory sequences and addressing fundamental questions in regulatory evolution. dissected samples. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. 2015;7(4):1192-205. 114, 1177-1188 (2014). Zhou, J. J. a-b, Mapping short reads of the A-genome diploid A. longiglumis (a) and the C-genome diploid A. eriantha (b) onto the tetraploid A. insularis genome reveals two large (>40 Mb) D/A-to-C and four C-to-D inter-genome translocations. Edgar, R. C. 
MUSCLE: multiple sequence alignment with high accuracy and high throughput. 5, 283-290 (2000). However, we noticed that the tree inferred from variants on each chromosome was not fully consistent with the genome-wide phylogenetic tree (Additional file 2: Figure S9), illustrating a complex history of intraspecific diversification. 67, 940-964 (2018). Evol. Indeed, comparisons of the Cycas and Ginkgo genomes reveal many Cycas-specific orthogroups enriched in pathogen interaction pathways (Supplementary Note 14), and C. panzhihuaensis also shows remarkable expansions in plant immunity and stress response gene families compared with Ginkgo, including genes that encode programmed cell death, abiotic stress response, serine protease inhibitors against pests and ginkbilobin with antibacterial and antifungal activities (Supplementary Note 14). 11, R87 (2010). supervised the study. A single-cell atlas of in vivo mammalian chromatin accessibility. Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Opin. Extensive low-affinity transcriptional interactions in the yeast genome. Xiaowu Wang. Extended Data Fig. MINI SEED 2 (MIS2) encodes a receptor-like kinase that controls grain size and shape in. Extended Data Fig. Nat Biotechnol. h, The association between A.satnudSFS4D01G000045 and the hulless trait was further confirmed by a KASP marker derived from the SNP. Traph: a tool for transcript identification and quantification with RNA-Seq. & Jiang, N. Assessing genome assembly quality using the LTR assembly index (LAI). 2018;4(11):879-87. 5 Fitness responsivity of a gene as the total variation of its expression-to-fitness relationship. 2016;3(1):958. Commun. Crop Sci. Science. The protein-coding sequences of 15 completely sequenced genomes and 1 transcriptome, representing seven gymnosperms (C. panzhihuaensis, Encephalartos longifolius, G. 
biloba, Gnetum montanum, Picea abies, Pinus taeda and Sequoiadendron giganteum), six angiosperms (Arabidopsis thaliana, Amborella trichopoda, Cinnamomum micranthum, Liriodendron chinense, Nymphaea colorata and Oryza sativa) and three other vascular plant outgroups (Azolla filiculoides, Salvinia cucullata and Selaginella moellendorffii), were classified into putative gene families/subfamilies by OrthoFinder82, and then scored for gene duplications across global gene families. Data from more species are needed to determine whether this low-level piRNA silencing is unique to grasshopper species with gigantic genomes or reflects an evolutionary process common to Acrididae insects. Cai, X., Chang, L., Zhang, T. et al. 1e). Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. This technology has facilitated complete telomere-to-telomere (T2T) genome assembly in various species, including Homo sapiens, by resolving long, complex repetitive regions13,14. Genomic insights into the recent chromosome reduction of autopolyploid sugarcane Saccharum spontaneum, The reference genome of Miscanthus floridulus illuminates the evolution of Saccharinae, The genome sequence of segmental allotetraploid peanut Arachis hypogaea, A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm, An enriched sugarcane diversity panel for utilization in genetic improvement of sugarcane, SMRT sequencing of the Oryza rufipogon genome reveals the genomic basis of rice adaptation, Extensive variation within the pan-genome of cultivated and wild sorghum, Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement, A draft chromosome-scale genome assembly of a commercial sugarcane, http://www.softberry.com/berry.phtml?topic=fgenesh&group=help&subgroup=gfind, http://tree.bio.ed.ac.uk/software/figtree/, 
http://www.repeatmasker.org/RepeatModeler/, http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/, https://www.bioinformatics.babraham.ac.uk/projects/fastqc/, http://creativecommons.org/licenses/by/4.0/, Identification of sex determination locus in sea cucumber Apostichopus japonicus using genome-wide association study, Expression characterization and cross-species complementation uncover the functional conservation of YABBY genes for leaf abaxial polarity and carpel polarity establishment in Saccharum spontaneum, Interspecific complementation-restoration of phenotype in Arabidopsis cuc2cuc3 mutant by sugarcane CUC2 gene, Genome-wide identification and expression analysis of the coronatine-insensitive 1 (COI1) gene family in response to biotic and abiotic stresses in Saccharum, Genome-wide characterization and expression analysis of the growth-regulating factor family in Saccharum. Genome-wide association analysis of Mexican bread wheat landraces for resistance to yellow and stem rust. Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. 177, 671 (2018). The results showed that the transposition of TEs was more active in A. rhodopa, which has a larger genome (16.36 pg). The results revealed the high flexibility of multi-copy genes during intraspecific diversification. 2018;19(1):112. performed WGD analysis. 6a). is a co-founder and equity holder of Celsius Therapeutics and Immunitas and until 31 July 2020 was a member of the scientific advisory board of Thermo Fisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov. HENMT protects the 3′ end of piRNAs from uridylation activity and subsequent degradation by acting as a methyltransferase that adds a 2′-O-methyl group at the 3′ end of piRNAs [64, 65]. 
3 and Supplementary Note 5) and plastid datasets strongly support cycads plus Ginkgo as sister to the remaining extant gymnosperms, in agreement with several other analyses23,24, whereas mitochondrial data resolve cycads alone in that position (Fig. 6f). Neuronal cells (C1, n=293, CB, n=639 and OL, n=484 for Nvwa and Flybrain (GSE163697) data respectively) were shown. a, Inference of the number of gene families with duplicated genes surviving after WGD events mapped on a phylogenetic tree depicting the relationships among 16 vascular plants included in this study. We sampled three biological replicates for each tissue sample. This approach, which was pioneered by Lieberman-Aiden et al.13 and Burton et al.14, was used previously in grasses in the assemblies of barley15 and wild emmer wheat16. In addition, we found that 99.87% and 96.30% of the assemblies were covered at more than 20× sequencing depth with ultralong reads and short reads, respectively, indicating high accuracy at the nucleotide level (Extended Data Fig. S5. 1979;92(4):1127-40. 42, 348-354 (2010). The results revealed that B. rapa morphotypes were clearly divided into turnip, oil type, pak choi, and Chinese cabbage, etc. Here, we more specifically defined conserved syntenic genes and flexible syntenic genes to further explore the evolution of genes derived from polyploidy during intraspecific genome diversification. RaGOO: fast and accurate reference-guided scaffolding of draft genomes. New evidence confirming the CD genomic constitutions of the tetraploid Avena species in the section Pachycarpa Baum. Biotechnol. We found that several genes that are known in angiosperms to regulate secondary growth in the positioning of the xylem, or in xylem/phloem patterning, underwent obvious expansions in the MRCA of extant seed plants compared with non-seed plants, including the MYB family member ALTERED PHLOEM DEVELOPMENT (APL), WOL and BRASSINOSTEROID-INSENSITIVE LIKE 1 (BRL1) and BRL3. 1936, 561-624 (1936). 
Pertea, M. et al. Variants were identified using the HaplotypeCaller module of GATK (v4.1.9.0). Evol. Shao F, Han M, Peng Z. Evolution and diversity of transposable elements in fish genomes. Top left: Pearson's r and associated two-tailed P value. The variant call format and VCFtools. The resulting P values were adjusted to Q values by the false discovery rate correction. The diversification of oat species occurred ~8.7 mya, which is earlier than wheat species (~6.6 mya) and falls within the previously estimated speciation time of Avena diploids (5.4-12.9 mya)17. 2004;303(5664):1626-32. Proc Natl Acad Sci U S A. Plant Biol. Ecol. Furthermore, the profiles lend insight into repeat features. Nucleic Acids Res. The Cycas sequences are highlighted in red. LEC1 genes are found only in vascular plants, but ABI3 is widely distributed in embryophytes (Supplementary Note 10.6). Cycads comprise many more living species57 than Ginkgo, which was once diverse in the Mesozoic but includes only one extant species58. c, Opposing relationships between organismal fitness and URA3 expression in two environments. 2021;19(1):121. Gene family analysis across 23 subgenomes from 16 species belonging to the BOP clade identified 2,237 single-copy orthologs (Supplementary Tables 7-9). The phylogenetic relationships of the species related to the A, C, and D lineages were constructed based on A-, C-, and D-type SNPs, respectively. Natl Acad. Liu SY, Liu YM, Yang XH, Tong CB, Edwards D, Parkin IAP, Zhao MX, Ma JX, Yu JY, Huang SM, et al. sequenced and processed the raw data; Xingtan Z., H.T., J.Z. Y.H. RNA-seq reads of different accessions were collected and mapped onto the two genotypes. Low-depth and repetitive variants were removed from the raw VCF file if they had DP<2, DP>45 or minQ<30. Price, A. L., Jones, N. C. & Pevzner, P. A. Sucrose transporter ZmSut1 expression and localization uncover new insights into sucrose phloem loading. 
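The depth and quality thresholds quoted above (DP<2 or DP>45, minQ<30) can be sketched as a simple predicate. This is only a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, the function name `passes_filter`, the toy records, and the reading that a site is removed when it fails any one criterion are all hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical, simplified stand-in for one VCF record."""
    chrom: str
    pos: int
    depth: int    # read depth at the site (DP)
    qual: float   # site quality (QUAL)


# Thresholds taken from the text; everything else here is an assumption.
MIN_DP, MAX_DP, MIN_QUAL = 2, 45, 30.0


def passes_filter(v: Variant) -> bool:
    """Keep a site only if its depth lies in [2, 45] and its quality is >= 30."""
    return MIN_DP <= v.depth <= MAX_DP and v.qual >= MIN_QUAL


# Toy records for illustration only (not data from the study).
records = [
    Variant("chr1", 100, 1, 50.0),    # too shallow: removed
    Variant("chr1", 200, 20, 60.0),   # kept
    Variant("chr1", 300, 80, 90.0),   # too deep (likely repetitive): removed
    Variant("chr1", 400, 20, 10.0),   # low quality: removed
]
kept = [v for v in records if passes_filter(v)]
```

In practice a dedicated tool would apply such thresholds directly to the VCF file; the predicate only makes the stated criteria explicit.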
Dot plots show the fold changes for each triplet ordered as shown in the y axis (f). & Xu, Y. Based on the following criteria, all candidate genes were screened: first, candidate gene sequences were detected by BLAST searches with an e value cut-off of 1 × 10^-5 to the collected query gene sequences gathered from previous studies or public databases; and second, features of candidate genes should be similar to the online functional annotation or UniProt functional annotation as the query genes. c-f, Evolvability space captures regulatory sequences' evolutionary properties. https://doi.org/10.1038/nrg.2017.26. Second, the SV could be genotyped in most accessions of the two populations, as missing loci typically confound the results. Biotechnol. Jianxiang Ma, Pengchuan Sun, Yongzhi Yang, Liangsheng Zhang, Fei Chen, Haibao Tang, Fay-Wei Li, Tomoaki Nishiyama, Péter Szövényi, Nora Walden, Dmitry A. German, Marcus A. Koch, Gregory W. Stull, Xiao-Jian Qu, Ting-Shuang Yi, Nature Plants 55, 110 (2020). Chromosome Res. BUSCO applications from quality assessments to gene prediction and phylogenomics. The present study found that the average ratio of FSGs on the LF, MF1, and MF2 subgenomes was 8.57%, 9.27%, and 9.55%, respectively, and the ratio of FSGs was significantly lower in the LF subgenome (Fig. Different colors indicate the accessions within different sub-populations, and the 18 B. rapa genomes are specifically marked with red stars. 110, 14024-14029 (2013). Proc. For functional annotation, the gene models were searched with BLAST against the UniProt, TrEMBL, KEGG, KOG and NR databases. Bioinformatics 23, 2633-2635 (2007). When comparing the frequency of FSG among the three subgenomes, we observed a significantly higher value in the LF subgenome than in the other two MF subgenomes (Fig. 4, 137-138 (2004). Liu, J. BMC Bioinformatics 9, 18 (2008). 
Genome of a giant isopod, Bathynomus jamesi, provides insights into body size evolution and adaptation to deep-sea environment. Genet. STAR: ultrafast universal RNA-seq aligner. 2005;102(39):13950-5. Genes were predicted with TransDecoder v. 5.5.0 for transcriptome assemblies and GeneMarkS v. 4.32 for genome and SAG assemblies. 2019;20(1):104. https://doi.org/10.1186/s13059-019-1717-0. Bioinformatics 34, 867-868 (2018). Then, we used the Chiifu reference genome and the nonredundant SV set to construct a graph-based genome with the vg pipeline [98]. The genes that were exclusively found in each tribe (Aveneae, Lolieae and Triticeae) were identified. Chen CJ, Chen H, Zhang Y, Thomas HR, Frank MH, He YH, et al. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. In unanchored sequences, 3,130 gene/alleles were annotated. Schubert I, Vu GT. This subgenome dominance reflects gene fractionation bias and expression dominance between homoeologous genes from different subgenomes [11,12,13]. Appl. 325, 126761 (2020). For PacBio assembly, Canu v1.5 (ref. 11) was used, as it is capable of avoiding collapsed repetitive regions and haplotypes. Jackson S, Chen ZJ. If two and three genes were FSGs, they were defined as more and most FSGs, respectively. generated BAC libraries; C.M.W., J.A., J.O.H., S.Chakrabarty, M.P. 5b). Bioinformatics. Sci. Bekele, W. A., Wight, C. P., Chao, S., Howarth, C. J. Nucleic Acids Res. Fitness responsivity calculated as the total variation in each curve is noted above each panel. We calculated the K2P divergence of 41 shared TEs in the two species' genomes using RepeatMasker (Additional file 2) (see Methods). b, The distribution of the C genome-specific repeat Am1 along each chromosome. Genome. 11, 2656 (2020). 
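The K2P (Kimura two-parameter) divergence mentioned above can be illustrated with a short sketch. It uses the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transition and transversion differences between two aligned sequences. This is a plain textbook version written for illustration; it is not RepeatMasker's own calculation, which works on alignments to consensus sequences and may apply further adjustments. The function name and the toy sequences are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

For example, a single transition across ten compared sites gives P = 0.1 and Q = 0, so the distance is -0.5 ln(0.8), slightly above the raw 10% difference because the formula corrects for multiple substitutions at the same site.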
This study illuminates the hereditary blueprint and evolutionary history of one of our most important, and most complex, crop genomes. Chimeric fragments representing the original cross-linked long-distance physical interactions were processed into paired-end sequencing libraries, then 1 billion 150-bp paired-end Illumina reads were produced and uniquely mapped onto the draft assembly contigs. 2007;56(4):564-77. Boxplots represent the median, 25th percentile, and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. 4e,f) and the abundance of two types of DNA repeats in the A. insularis and Sanfensan genomes, which have been reported to be overrepresented in the Avena C18 and A19 genomes, respectively (Fig. Proportion of repetitive elements in the genome. Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato. S3. Nowoshilow S, Schloissnig S, Fei J-F, Dahl A, Pang AW, Pippel M, et al. Interrogating a high-density SNP map for signatures of natural selection. 8d-f). Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L. Microb. Langmead, B. Sci. 39). Opin. 2b), as might be predicted for these features. Thomas BC, Pedersen B, Freeling M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Annu Rev Genet. Bioinformatics 31, 166-169 (2015). Horticulture Res. Major interchromosomal exchanges between the C and D subgenomes of A. insularis and Sanfensan were detected using FISH with the A and C genome-specific repeats As120a and Am1 as the probes. The nearly complete genome of Ginkgo biloba illuminates gymnosperm evolution. 
O, Ordovician; S, Silurian; D, Devonian; C, Carboniferous; P, Permian; T, Triassic; J, Jurassic; K, Cretaceous; Pg, Palaeogene; N, Neogene; Q, Quaternary; Ma, million years ago. Yuanying Peng, H.Y., L.G., C.D., C.W. and Q.Z. Nat. This genome enables comparative genomics and phylogenomic analyses to unravel the genetic control of important traits in cycads and other gymnosperms, including a WGD shared by gymnosperms, a sex determination mechanism that appears to be shared by cycads and Ginkgo, and critical gene innovations including those that enable seed and pollen tube formation, as well as chemical defence. The transposable element-rich genome of the cereal pest Sitophilus oryzae. 2003;302(5649):1401-4. PLoS Genet. Code is also archived at Zenodo (https://doi.org/10.5281/zenodo.6622160) (ref. Ming, R., Bendahmane, A. and G.Z. Curr Opin Plant Biol. Extended Data Fig. 35, 3100-3108 (2007). Yuanying Peng, Tao Ma, Yuming Wei, Fei Lu or Changzhong Ren. The study assembled a chromosome-level genome of Cycas panzhihuaensis, the last major lineage of seed plants for which a high-quality genome assembly was lacking. Appl. Biol. (n=9,168, 9,168, 9,168, and 9,171 independent samples for groups A, B, C, and D, respectively). Therefore we compared the total abundance of retrotransposon transcripts in the testis tissue of the two species. 2016;7:569. Doležel J, Greilhuber J, Suda J. Estimation of nuclear DNA content in plants using flow cytometry. The y-axis shows the proportion of the genome occupied by each bin. Cell Dev. In Proc. For example, if TE transcripts have the same abundance in both species, the abundance of corresponding piRNAs is higher in L. migratoria than that in A. rhodopa. Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, et al. Mol. Genet. 
Next, Hisat76 was used to map the transcriptome to the genome, and then StringTie77 was used to predict transcriptome-based gene models. Nat. The dotted lines show the 95% confidence interval for the QQ-plot under the null hypothesis of no association between the SNP and the trait. Brenner, E. D., Stevenson, D. W. & Twigg, R. W. Cycads: evolutionary innovations and the role of plant-derived neurotoxins. We reasoned that the low expression of HENMT in the large-genome grasshopper resulted in piRNA silencing at a low level. A reference-guided strategy based on subgenome sequence similarity was used to distinguish the subgenomes of A. insularis and Sanfensan. Rapid selection response to ethanol in Saccharomyces eubayanus emulates the domestication process under brewing conditions. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Chalhoub B, Denoeud F, Liu SY, Parkin IAP, Tang HB, Wang XY, et al. Nature 593, 101-107 (2021). Finally, we filtered out genes containing a wrong domain according to the TAPscan v.2 transcription factor database domain rules. 2c, event 3), presumably occurred after the two rounds of WGD. Analysis of piRNA-mediated silencing of active TEs in Drosophila melanogaster suggests limits on the evolution of host genome defense. After gene predictions, we used InterProScan (version 5.30-69.0) [84] to conduct functional annotation of the 16 gene sets, and information of the annotated domains and gene ontology was extracted from the InterProScan results. Extended Data Fig. In this case, 14,624 one-to-one orthologues for the three diploids (A. atlantica, A. longiglumis and A. eriantha) were identified. Nat Protoc. 
mRNA and small RNA transcriptomes reveal insights into dynamic homoeolog regulation of allopolyploid heterosis in nascent hexaploid wheat. Dong, S., Li, H., Goffinet, B. Strong rejection: the clade is not recovered, and the alternative topology conflicts even when poorly supported branches (<85%) are collapsed. GWAS analysis of sex differentiation was performed on the linkage disequilibrium-pruned SNP set using the EMMAX program103 (beta-07Mar2010 version). 2014;28(15):1667-80. b TE transcript abundance differences in A. rhodopa. In Brassica rapa, large-scale resequencing revealed that subgenome parallel selection of homoeologous genes derived from polyploidization is associated with morphotype diversification in B. rapa and Brassica oleracea [24]. InterProScan 5: genome-scale protein function classification. 8, 14049 (2017). Microbiol. Biol. Discrimination of the closely related A and D genomes of the hexaploid oat Avena sativa L. Proc. Nat. 2006;7(11):847-59. and X.Z.L. De Bie, T., Cristianini, N., Demuth, J. P. & Hahn, M. W. CAFE: a computational tool for the study of gene family evolution. g, Analysis of log2-fold changes in pairwise gene expression between homoeologous genes showed biased expressions. 8, 619 (2012). Owing to its agronomic importance and evolutionary characteristics, B. rapa provides a powerful reference for understanding the unknown impacts of polyploidization and subgenome dominance on intraspecific diversification. We believe that low expression of HENMT causes impairment of the piRNA silencing mechanism in the large-genome grasshopper. Fincher, C. T., Wurtzel, O., de Hoog, T., Kravarik, K. M. & Reddien, P. W. Cell type transcriptome atlas for the planarian Schmidtea mediterranea. Mustonen, V., Kinney, J., Callan, C. G. & Lässig, M. Energy-dependent fitness: a quantitative model for the evolution of yeast transcription factor binding sites. 
10, 2368-2386 (2008). The transcriptome sequencing reads from 339 cycad species were generated in the current study. Genome Res. Retrotransposon transcript quantification matrix (TPM normalization) of L. migratoria. e-f, Relationships between gene family sizes in the A-genome diploid A. longiglumis (e), and the C-genome diploid A. eriantha (f) with each subgenome of the hexaploid Sanfensan. Sixteen B. rapa accessions of different morphotypes named BRO, CCA, CCB, CXA, CXB, MIZ, OIA, OIB, OIC, PCA, PCB, TCA, TUA, TUE, TBA, and WTC were used in this study (Additional file 3: Table S1). & Wessler, S. R. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. TransComb: genome-guided transcriptome assembly via combing junctions in splicing graphs. Kwasnieski, J. C., Mogno, I., Myers, C. A., Corbo, J. C. & Cohen, B. Rate priors and time priors were set following the method of Morris et al.92. Genome Biol. 21, 5654-5666 (2003). A large number of polyploidy-derived genes were observed to be fractionated in the different B. rapa genomes. contracts here. Plant J. 38, 70-80 (2019). Biol. This sex-associated region is also the most differentiated between male and female Cycas genomes, with the largest fixation index (FST; Supplementary Fig. 2e and Additional file 2: Figure S11), and SVs were tightly associated with morphological diversification (Additional file 1: Supplementary note). High-density marker profiling confirms ancestral genomes of Avena species and identifies D-genome chromosomes of hexaploid oat. Finally, the phylogenetic tree was constructed by RAxML (version v8.2.12) [91] with 100 bootstrap replicates. DiCarlo, J. E. et al. Nucleic Acids Res. (b) The bar chart showing the conserved neuron-related TFs between human and other species. 
i) Genome-guided RNA-seq assembly (Cufflinks, StringTie; CPAT, CPC) ii) De novo assembly Similarly, 19 and 172 SVs were considered to be closely related to the domestication of pak choi and European turnip morphotypes (Additional file 2: Figures S24-25 and Additional file 3: Tables S33-34). Xu Cai and Lichun Chang contributed equally to this work. a-d, Prediction of expression from sequence in complex (YPD) (a, b) and defined (SD-Uracil) (c, d) medium. USA 109, 19498-19503 (2012). In addition, our results indicate that the D-genome progenitor of hexaploid oat is more closely related to the A-genome than to the C-genome and may be extinct. SCENIC: single-cell regulatory network inference and clustering. (a) t-SNE visualization of 95,020 single cells from whole bodies of earthworm, colored by cell type (left) and cell lineage (right). Genome-guided Trinity writes its assembly to Trinity.GG.fasta (Trinity_GG.fasta); for PASA, the assemblies are merged into one FASTA file: cat Trinity.fasta Trinity.GG.fasta > transcripts.fasta Heredity 121, 401-405 (2018). 2h, n=5,720) or random sequences (e, n=10,000). The landscapes measure the amount of sequence divergence between each copy of TE. Haas, B. J. et al. In the maize genome, genes in the dominant subgenome explain more important trait variants [25]. The Illumina HiSeq X-Ten or MGISEQ-2000 platforms were used to generate short paired-end reads from genomic DNA isolated from Sanfensan, A. insularis, and A. longiglumis. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Research has shown that the proliferation of DNA transposons and LINEs in deep-sea species might play an important role in shaping highly plastic genomes and helping them adapt to the deep-sea environment [8]. 15, e1008332 (2019). Development. Nat Methods. USA 115, E2274-E2283 (2018). SNP density is also higher in rearranged regions (360.27 ± 48.41) than in non-rearranged regions (297.46 ± 12.65, P=0.001798). Thomas CA Jr. 
Third, examination of the orthologs' expression patterns showed that the number of preferentially expressed genes in the C subgenome was significantly lower than that in the A (up-regulated genes in A vs C, 9,375 vs 7,541, P=0.009548, Wilcoxon rank-sum test) and D (D vs C, 9,359 vs 8,139, P=0.04693, Wilcoxon rank-sum test) subgenomes (Fig. Before gene prediction, we conducted a whole-genome TE annotation of each assembly and constructed TE libraries using EDTA pipelines (version 1.8.3) [77]. All genome sequencing was performed on an Illumina HiSeq X-Ten instrument generating 400-base paired-end libraries. Outer dense fibres exist in C. panzhihuaensis and Ginkgo biloba, as well as all non-seed land plants, but are absent in Gnetum, conifers and angiosperms, all of which have non-motile sperm (Extended Data Fig. Van de Peer, Y., Mizrachi, E. & Marchal, K. The evolutionary significance of polyploidy. Renganaath, K. et al. 2019;56(6):1216-23. Genet. Science 345, 950-953 (2014). The neighbor-joining tree of 524 B. rapa accessions showed that the 16 accessions we chose for our pan-genome analysis existed, as expected, in different sub-populations, each with distinctive morphotypes (Fig. A recent study of more than 600 insects using assembled genome data analysis revealed the proportion of repeats in insect genomes ranged widely from 1.6 to 81.5% [68]. 2012;15:1319. Colour represents values from low (blue) to high (red). 33, 6494-6506 (2005). https://doi.org/10.1126/science.1153585. Based on gene expression and phylogenetic analysis, 8 genes (SsCA1, SsCA2, SsPEPC1, SsPEPC-k1, SsNADP-MDH2, SsNADP-ME2, SsPPDK1 and SsPPDK-RP2) were identified as C4-type genes (Supplementary Table 17). The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. 
g Heatmap of TE-derived piRNA abundance. 5b. Distmat from the EMBOSS (v.6.5.7.0) package was then used to calculate the K value of the retrotransposons' 5′- and 3′-LTR sequences. Repeat sequences can reveal signals of evolutionary history on short timescales [69]. Cell 166, 1282–1294 (2016). [55]. Extended Data Fig. The two-tailed Chi-square test was used to determine significance in a, b, and c (**, P<0.01). 2007;35(Web Server):W265–8. Bioinformatics 26, 841–842 (2010). In the TE comparative analysis of two species, it is crucial to find the consensus sequence shared between the two species. Mintz, S. W. Sweetness and Power: The Place of Sugar in Modern History (Penguin, 1986). This finding further provided evidence that SV was associated with morphotype domestication in B. rapa. The piRNA clusters are transcribed into multiple long precursor transcripts which are then cut and processed into small RNAs that are reverse complementary to TE transcripts [46, 47]. and K.L. Wei Y, Chen S, Yang P, Ma Z, Kang L. Characterization and comparative profiling of the small RNA transcriptomes in two phases of locust. It is directly related to piRNA abundance and protects the 3′ end of piRNAs from degradation. Res. Ten things you should know about transposable elements. Hu, H. et al. 3a). 2020;6:34. Lu J, Clark AG. Functional enrichment analyses showed that multicellular organism development (GO:0007275, 37/112, BH-adjusted P=9.69 × 10⁻¹²), ubiquitin-dependent protein catabolic process (GO:0006511, 51/218, BH-adjusted P=4.84 × 10⁻⁹), and oxidation-reduction process (GO:0055114, 373/3,273, BH-adjusted P=3.00 × 10⁻⁶) are the top three most enriched biological process terms for genes positioned within the six large intergenomic translocations that occurred during tetraploidization (Supplementary Table 18). 2014;369(1648):20130353. https://doi.org/10.1098/rstb.2013.0353.
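The K value computed above for a retrotransposon's two LTRs is a pairwise nucleotide distance. The text does not say which distance correction was applied in Distmat, so the sketch below assumes the Kimura two-parameter model purely for illustration; the function name `k2p_distance` is hypothetical, and the inputs are assumed to be two already-aligned LTR sequences of equal length.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Assumption: K2P is used here only as an example correction; the source
    does not state which method was chosen.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log raises ValueError
    # if the sequences are too divergent for the model to apply.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Because the two LTRs of an element are identical at insertion, a larger K between them indicates an older insertion; converting K to an absolute age would additionally require a substitution rate, which is not given in this passage.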
We have now definitively assigned chromosomes to the A and D subgenomes with the resolution afforded by the complete genome assemblies. To better visualize the expression levels, we normalized the expression results. Protoc. In A. thaliana, AtPIN3 encodes a putative auxin efflux carrier that is involved in auxin polar transport, response to light stimulus, auxin efflux, and regulation of hormone levels. Zhou, F. & Pichersky, E. More is better: the diversity of terpene metabolism in plants. 3c). Kelleher ES, Barbash DA. g Ratio of least, more, and most flexible syntenic genes in the three-copy genes. Surprisingly, 80% of the NBS-encoding genes were located in the four rearranged chromosomes (SsChr02, SsChr05, SsChr06 and SsChr07) and 51% of those were in the rearranged regions, including SsChr5 (Sb05S) 57.6–89.1 Mbp, SsChr6 (Sb05L) 54.6–90.6 Mbp, SsChr7 (Sb08S) 62.0–83.3 Mbp, SsChr2 (Sb08L) 98.5–125.9 Mbp (Supplementary Table 19). For instance, those genes encoding egg cell-secreted proteins that prevent attraction of multiple pollen tubes48 originated in the MRCA of living seed plants. S.D., Yang Liu, Y.G., J.L., Y.Y. Leggett, J. These results highlighted the enormous structural complexity in B. rapa during intraspecific genome diversification. Sci. 2019;5(1):54–62. 17, 66 (2016). The source code for reproducing our analysis and running and training the Nvwa models is available at GitHub (https://github.com/JiaqiLiZju/Nvwa/) and Zenodo (https://zenodo.org/record/6806748) (JiaqiLiZju/Nvwa: release v.1.0, 2022). 4d,e and Extended Data Fig. 11, R87 (2010). Chen SF, Zhou YQ, Chen YR, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Dolezel J. Meanwhile, miRNAs in L. migratoria only accounted for 21.41% of total small RNAs. The Drosophila melanogaster genetic reference panel. 4d). Regions of S.
spontaneum with larger-scale chromosomal rearrangements compared with sorghum have higher genetic diversity (higher π value) than non-rearranged regions and may have undergone much stronger balancing selection (Supplementary Table 22 and Supplementary Fig. Mol. J. 2016;30:159–65. https://doi.org/10.1093/bioinformatics/btu033. 1 Hi-C contact maps for each pseudomolecule in the hexaploid and tetraploid oat genomes. USA 105, 12376–12381 (2008). Image courtesy of Zanqian Li and Xiaolian Zeng. Crop Sci. Genome assemblies and annotations of B. rapa accessions have also been deposited in the Figshare database [111]. The histogram shows the number of gene families in the 18 genomes with different frequencies. We found that 43.59–53.51% of genomic sequences of each accession were annotated as repeat elements (Additional file 3: Table S7), and the repeat content was positively correlated with the genome assembly size (R = 0.99, P = 3.8e−16) (Additional file 2: Figure S3). Cell 158, 1431–1443 (2014). Nucleic Acids Res. Tardaguila, M. et al. Nucleic Acids Res. Libraries for Illumina paired-end genome sequencing were built according to the manufacturer's standard protocol (Illumina). Sci Rep. 2017;7:42229. https://doi.org/10.1038/srep42229. Nat. More genes were retained in the dominant subgenome, as reported in Brassiceae species [43, 56–58]. Arrows indicate CYCAS_034085 on the MSY and CYCAS_010388 on chromosome 2. In the version of this article originally published, the accession codes listed in the data availability section were incorrect and the section was incomplete. 14, 2938–2943 (2000). Furthermore, we used insertions and deletions (size ≥ 50 bp) as representatives to investigate SV characteristics in different B. rapa genomes. 2007;128(6):1089–103. Bioinformatics. A., Gennert, D., Schier, A. F. & Regev, A.
Spatial reconstruction of single-cell gene expression data. 1986;112(2):359–83. Genet. van Dijk, D. et al. Finally, maximum likelihood trees were calculated using RAxML99 with the GTRGAMMA model and bootstrap support was estimated based on 100 replicates. The identification of 80% of disease resistance genes on rearranged chromosomes suggests that reduction of basic chromosome number might have contributed to the retention of disease resistance genes. Genomic sequences of the candidate gene from five hulless and five hulled oats were aligned using the ClustalW92 program. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Ab initio gene prediction was performed using GeneMark-ET (v4.0)59 and AUGUSTUS (v2.4)60 with two rounds of iterative training. C. panzhihuaensis, the optimized sequence was synthesized and ligated to the pET-28a vector. ), the US Department of Energy (DOE; DE-SC0010686 to R.M. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. A reference-based or genome-guided transcriptome assembly algorithm uses alignments of reads to the genome that are produced by a specialized spliced-alignment tool, such as TopHat2 (ref. 
State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China, Yang Liu,Sibo Wang,Linzhou Li,Ting Yang,Tong Wei,Hongli Wang,Min Liu,Yan Xu,Hongping Liang,Jin Yu,Yuqing Cai,Zhaowu Zhang,Yannan Fan,Weixue Mu,Sunil Kumar Sahu,Guangyi Fan,Huanming Yang,Jian Wang,Xin Liu,Xun Xu&Huan Liu, Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China, Yang Liu,Shanshan Dong,Yiqing Gong,Shuchun Liu,Xiaoan Lang,Leilei Yang,Na Li,Sadaf Habib,Nan Li&Shouzhou Zhang, State Key Laboratory of Grassland Agro-Ecosystems, College of Ecology, Lanzhou University, Lanzhou, China, State Environmental Protection Key Laboratory of Regional Eco-process and Function Assessment, Chinese Research Academy of Environmental Sciences, Beijing, China, Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China, Xiuyan Feng,Jinling Huang,Jian Liu&Xun Gong, Key Laboratory of Plant Stress Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng, China, Jianchao Ma,Guanxiao Chang&Jinling Huang, Department of Biology, East Carolina University, Greenville, NC, USA, College of Biology and Environment, Nanjing Forestry University, Nanjing, China, College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China, Hongli Wang,Yan Xu,Hongping Liang,Jin Yu,Yuqing Cai&Zhaowu Zhang, School of Life Sciences, Sun Yat-sen University, Guangzhou, China, Sichuan Cycas panzhihuaensis National Nature Reserve, Panzhihua, China, Global Biodiversity Conservancy, Chonburi, Thailand, Department of Entomology, China Agricultural University, Beijing, China, Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA, Bernard Goffinet,Sumaira Zaman&Jill L. 
Wegrzyn, Guangdong Provincial Key Laboratory for Plant Epigenetics, Longhua Institute of Innovative Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China, Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark, Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen, China, College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China, Chengdu University of Traditional Chinese Medicine, Chengdu, China, Department of Plant Biotechnology and Bioinformatics, Ghent University, VIB UGent Center for Plant Systems Biology, Gent, Belgium, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China, Hainan Institute of Zhejiang University, Sanya, China, The College of Life Sciences, Sichuan University, Chengdu, China, Key Laboratory of Orchid Conservation and Utilization of National Forestry and Grassland Administration at College of Landscape Architecture, Fujian Agriculture and Forestry University, Fuzhou, China, State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China, College of Life Sciences, South China Agricultural University, Guangzhou, China, National Key Laboratory of Plant Molecular Genetics, Chinese Academy of Sciences Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China, Department of Biology, University of Copenhagen, Copenhagen, Denmark, Florida Museum of Natural History, University of Florida, Gainesville, FL, USA, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa, Department of Biology, University of Florida, Gainesville, FL, USA