Comparative fungal genome assembly: third-generation fungal genome assembly

Third-generation fungal genome analysis

Comparative and population genomic landscape of Phellinus noxius: A hypervariable fungus causing root rot in trees.


Second- and third-generation assembly methods, and gene prediction methods

Genome assembly

Genome assemblies of the different species were generated with Falcon (ver 0.5.0; Chin et al., 2016) and improved using Quiver (Chin et al., 2013) and finisherSC (Lam, LaButti, Khalak, & Tse, 2015). For the assembly of individual strains of P. noxius, Illumina paired-end reads were trimmed with Trimmomatic (ver 0.32; options LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 MINLEN:50; Bolger, Lohse, & Usadel, 2014) and subsequently assembled using SPAdes (ver 3.7.1; Bankevich et al., 2012). Multiple mate-pair reads were available for three strains of P. noxius (KPN91, A42 and 718-S1); these were assembled using the ALLPATHS-LG assembler (ver 49688; Butler et al., 2008) and improved using Pilon (Walker et al., 2014). The P. noxius assembly was further merged with Metassembler (ver 1.5; Wences & Schatz, 2015), and misassemblies were identified using REAPR (ver 1.0.18; Hunt et al., 2013) and manually corrected.
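
The paper does not give the exact command lines, so the following is only a minimal sketch of how the Illumina trimming and assembly steps might be scripted. The Trimmomatic step options are the ones quoted above; the jar path, file names, and output directory are placeholders, and the helper functions `trimmomatic_cmd` and `spades_cmd` are hypothetical names introduced for illustration. The script only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List

# Trimming steps exactly as quoted in the text (Trimmomatic applies them in order).
TRIM_STEPS = ["LEADING:30", "TRAILING:30", "SLIDINGWINDOW:4:30", "MINLEN:50"]


def trimmomatic_cmd(r1: str, r2: str, prefix: str,
                    jar: str = "trimmomatic-0.32.jar") -> List[str]:
    """Build a Trimmomatic paired-end command.

    The jar path and all file names are placeholders (assumptions).
    Output order for PE mode: forward paired, forward unpaired,
    reverse paired, reverse unpaired.
    """
    outputs = [
        f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz",
        f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz",
    ]
    return ["java", "-jar", jar, "PE", r1, r2, *outputs, *TRIM_STEPS]


def spades_cmd(prefix: str, outdir: str) -> List[str]:
    """Build a SPAdes command that assembles the surviving read pairs."""
    return [
        "spades.py",
        "-1", f"{prefix}_1P.fq.gz",
        "-2", f"{prefix}_2P.fq.gz",
        "-o", outdir,
    ]


if __name__ == "__main__":
    # Hypothetical strain name and read files, for illustration only.
    print(shlex.join(trimmomatic_cmd("strainA_R1.fq.gz", "strainA_R2.fq.gz", "strainA")))
    print(shlex.join(spades_cmd("strainA", "strainA_spades")))
```

Building the commands as lists keeps the quoted options in one place and lets them be passed to `subprocess.run` unchanged once the placeholder paths are replaced.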


Gene predictions and functional annotation

For P. noxius, the gene predictor Augustus (ver 3.2.1; Stanke, Tzvetkova, & Morgenstern, 2006) was trained on a gene training set of complete core genes from CEGMA (ver 2.5; Parra, Bradnam, & Korf, 2007) and subsequently used for manual curation of ~1000 genes. Annotation was then run with introns from RNA-seq data supplied as evidence. For P. lamaensis, P. sulphurascens and P. pini, genes were predicted using the Braker1 pipeline (ver 1.9; Hoff, Lange, Lomsadze, Borodovsky, & Stanke, 2016), which automatically uses RNA-seq mappings as evidence hints and retrains GeneMark-ES (Borodovsky & Lomsadze, 2011) and Augustus. Gene product descriptions were assigned using Blast2GO (ver 4.0.7; Conesa et al., 2005) and GO term assignments were provided by ARGOT2.5 (Lavezzo, Falda, Fontana, Bianco, & Toppo, 2016). The web server dbCAN (HMMs 5.0, last accessed September 5 2016; Yin et al., 2012) was used to predict CAZymes from the protein sequences of all species, while antiSMASH (ver 3.0; Weber et al., 2015) was used to predict secondary metabolite gene clusters. For dbCAN results, only hits with an E-value <= 1 x 10^-5 and >= 30% HMM coverage were considered, while overlapping domains were resolved by choosing the hit with the smallest P-value. Proteome completeness was assessed with BUSCO (ver 2.0; Simão, Waterhouse, Ioannidis, Kriventseva, & Zdobnov, 2015) using the Basidiomycota dataset.
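
The dbCAN filtering rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' script: the `Hit` record layout is hypothetical (it is not dbCAN's output format), two hits are treated as overlapping when they share at least one residue on the same protein, overlaps are resolved greedily, and a single `evalue` field stands in for the significance value used to rank overlapping hits.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    """One HMM hit on a protein (hypothetical layout, not dbCAN's own format)."""
    protein: str
    family: str
    start: int       # 1-based inclusive start on the protein
    end: int         # 1-based inclusive end on the protein
    evalue: float    # assumed to double as the ranking value for overlaps
    coverage: float  # fraction of the HMM covered, 0.0 to 1.0


def overlaps(a: Hit, b: Hit) -> bool:
    """True when two hits on the same protein share at least one residue."""
    return a.protein == b.protein and a.start <= b.end and b.start <= a.end


def filter_hits(hits: List[Hit], max_evalue: float = 1e-5,
                min_coverage: float = 0.30) -> List[Hit]:
    """Apply the thresholds from the text, then resolve overlapping domains."""
    # Step 1: keep hits that pass both thresholds.
    passing = [h for h in hits
               if h.evalue <= max_evalue and h.coverage >= min_coverage]
    # Step 2: visit hits from most to least significant; keep a hit only if
    # it does not overlap a hit that was already kept (greedy assumption).
    kept: List[Hit] = []
    for hit in sorted(passing, key=lambda h: h.evalue):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    # Return in protein and position order for readability.
    return sorted(kept, key=lambda h: (h.protein, h.start))


# Made-up example hits, for illustration only.
EXAMPLE_HITS = [
    Hit("p1", "GH5", 10, 200, 1e-40, 0.90),
    Hit("p1", "GH2", 150, 300, 1e-10, 0.80),   # overlaps GH5, less significant
    Hit("p1", "CBM1", 350, 380, 1e-8, 0.95),   # no overlap
    Hit("p2", "AA9", 5, 100, 1e-3, 0.90),      # fails the E-value threshold
    Hit("p2", "GT2", 5, 100, 1e-20, 0.20),     # fails the coverage threshold
]

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE_HITS):
        print(hit.protein, hit.family, hit.start, hit.end)
```

With the made-up hits above, only the GH5 and CBM1 hits on `p1` survive. A stricter definition of overlap (for example, a minimum shared fraction) would change which hits are dropped.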



Reference: https://onlinelibrary.wiley.com/doi/abs/10.1111/mec.14359

  • Published 2018-09-12 13:59
  • Category: third-generation sequencing

Author: omicsgene