Building the reference database with RESCRIPt
qiime rescript get-silva-data \
  --p-version '138' \
  --p-target 'SSURef_NR99' \
  --p-include-species-labels \
  --o-silva-sequences silva-138-ssu-nr99-seqs.qza \
  --o-silva-taxonomy silva-138-ssu-nr99-tax.qza
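This command automatically downloads the SILVA 138 sequences clustered at 99% similarity together with their taxonomy. Because of network problems it usually fails with an error.
In that case you can directly download the files provided on the official QIIME 2 website: https://docs.qiime2.org/2020.8/data-resources/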
wget -c https://data.qiime2.org/2020.8/common/silva-138-99-seqs.qza
wget -c https://data.qiime2.org/2020.8/common/silva-138-99-tax.qza
ln -s silva-138-99-tax.qza silva-138-ssu-nr99-tax.qza
ln -s silva-138-99-seqs.qza silva-138-ssu-nr99-seqs.qza
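Since the downloads above can fail on an unstable connection, one option is to wrap them in a small retry helper. This is only a minimal sketch: the function name `retry`, the attempt count and the delay are assumptions for illustration and are not part of QIIME 2 or wget.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max="$1" delay="$2"
  shift 2
  local attempt=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example usage (commented out so that nothing is downloaded by accident):
# retry 5 30 wget -c https://data.qiime2.org/2020.8/common/silva-138-99-seqs.qza
# retry 5 30 wget -c https://data.qiime2.org/2020.8/common/silva-138-99-tax.qza
```

Because wget is called with -c, each new attempt resumes the partial file instead of starting from zero.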
# Remove sequences that contain 5 or more ambiguous bases (IUPAC-compliant ambiguity codes) or any homopolymer of 8 or more bases
qiime rescript cull-seqs \
  --i-sequences silva-138-ssu-nr99-seqs.qza \
  --o-clean-sequences silva-138-ssu-nr99-seqs-cleaned.qza

# Length filtering (minimum lengths: Archaea 900, Bacteria 1200, Eukaryota 1400)
qiime rescript filter-seqs-length-by-taxon \
  --i-sequences silva-138-ssu-nr99-seqs-cleaned.qza \
  --i-taxonomy silva-138-ssu-nr99-tax.qza \
  --p-labels Archaea Bacteria Eukaryota \
  --p-min-lens 900 1200 1400 \
  --o-filtered-seqs silva-138-ssu-nr99-seqs-filt.qza \
  --o-discarded-seqs silva-138-ssu-nr99-seqs-discard.qza

# Dereplicate identical sequences
qiime rescript dereplicate \
  --i-sequences silva-138-ssu-nr99-seqs-filt.qza \
  --i-taxa silva-138-ssu-nr99-tax.qza \
  --p-rank-handles 'silva' \
  --p-mode 'uniq' \
  --o-dereplicated-sequences silva-138-ssu-nr99-seqs-derep-uniq.qza \
  --o-dereplicated-taxa silva-138-ssu-nr99-tax-derep-uniq.qza

# Train the full-length classifier
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads silva-138-ssu-nr99-seqs-derep-uniq.qza \
  --i-reference-taxonomy silva-138-ssu-nr99-tax-derep-uniq.qza \
  --o-classifier silva-138-ssu-nr99-classifier.qza

## Primer-specific classifier 1 (515F/806R)
# Extract the amplicon region
qiime feature-classifier extract-reads \
  --i-sequences silva-138-ssu-nr99-seqs-derep-uniq.qza \
  --p-f-primer GTGYCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACNVGGGTWTCTAAT \
  --p-n-jobs 2 \
  --p-read-orientation 'forward' \
  --o-reads silva-138-ssu-nr99-seqs-515f-806r.qza

# Dereplicate the extracted reads
qiime rescript dereplicate \
  --i-sequences silva-138-ssu-nr99-seqs-515f-806r.qza \
  --i-taxa silva-138-ssu-nr99-tax-derep-uniq.qza \
  --p-rank-handles 'silva' \
  --p-mode 'uniq' \
  --o-dereplicated-sequences silva-138-ssu-nr99-seqs-515f-806r-uniq.qza \
  --o-dereplicated-taxa silva-138-ssu-nr99-tax-515f-806r-derep-uniq.qza

# Train the classifier
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads silva-138-ssu-nr99-seqs-515f-806r-uniq.qza \
  --i-reference-taxonomy silva-138-ssu-nr99-tax-515f-806r-derep-uniq.qza \
  --o-classifier silva-138-ssu-nr99-515f-806r-classifier.qza
## Primer-specific classifier 2 (338F/806R)
# 338F (5′-ACTCCTACGGGAGGCAGCAG-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′)
# Extract the amplicon region
qiime feature-classifier extract-reads \
  --i-sequences silva-138-ssu-nr99-seqs-derep-uniq.qza \
  --p-f-primer ACTCCTACGGGAGGCAGCAG \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-n-jobs 2 \
  --p-read-orientation 'forward' \
  --o-reads silva-138-ssu-nr99-seqs-338f-806r.qza

# Dereplicate the extracted reads
qiime rescript dereplicate \
  --i-sequences silva-138-ssu-nr99-seqs-338f-806r.qza \
  --i-taxa silva-138-ssu-nr99-tax-derep-uniq.qza \
  --p-rank-handles 'silva' \
  --p-mode 'uniq' \
  --o-dereplicated-sequences silva-138-ssu-nr99-seqs-338f-806r-uniq.qza \
  --o-dereplicated-taxa silva-138-ssu-nr99-tax-338f-806r-derep-uniq.qza

# Train the classifier
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads silva-138-ssu-nr99-seqs-338f-806r-uniq.qza \
  --i-reference-taxonomy silva-138-ssu-nr99-tax-338f-806r-derep-uniq.qza \
  --o-classifier silva-138-ssu-nr99-338f-806r-classifier.qza
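The two primer-specific builds differ only in the primer pair and the name used in the output files, so the three steps (extract, dereplicate, train) can be wrapped in one function. The sketch below is an assumption-laden convenience, not part of QIIME 2: the names `run`, `build_primer_classifier` and the `DRY_RUN` variable are hypothetical, while the qiime commands, flags and file names are the ones used above. With `DRY_RUN=1` the commands are only printed, which makes it easy to check them before committing to a long run.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# build_primer_classifier NAME FWD_PRIMER REV_PRIMER
# NAME is the tag used in the output file names, for example 338f-806r.
build_primer_classifier() {
  local name="$1" fwd="$2" rev="$3"
  local base=silva-138-ssu-nr99

  # Each step only runs if the previous one succeeded.
  run qiime feature-classifier extract-reads \
      --i-sequences "${base}-seqs-derep-uniq.qza" \
      --p-f-primer "$fwd" \
      --p-r-primer "$rev" \
      --p-n-jobs 2 \
      --p-read-orientation forward \
      --o-reads "${base}-seqs-${name}.qza" &&
  run qiime rescript dereplicate \
      --i-sequences "${base}-seqs-${name}.qza" \
      --i-taxa "${base}-tax-derep-uniq.qza" \
      --p-rank-handles silva \
      --p-mode uniq \
      --o-dereplicated-sequences "${base}-seqs-${name}-uniq.qza" \
      --o-dereplicated-taxa "${base}-tax-${name}-derep-uniq.qza" &&
  run qiime feature-classifier fit-classifier-naive-bayes \
      --i-reference-reads "${base}-seqs-${name}-uniq.qza" \
      --i-reference-taxonomy "${base}-tax-${name}-derep-uniq.qza" \
      --o-classifier "${base}-${name}-classifier.qza"
}

# Dry run: only prints the three qiime commands for the 338F/806R pair.
DRY_RUN=1 build_primer_classifier 338f-806r ACTCCTACGGGAGGCAGCAG GGACTACHVGGGTWTCTAAT
```

Dropping the `DRY_RUN=1` prefix runs the commands for real; another primer pair only needs a new name and two primer sequences.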
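Note: building a classifier database with QIIME 2 is very memory-intensive and needs at least 50 GB of RAM. I am sharing the databases I have already built here:
Link: https://pan.baidu.com/s/11FRnvS_-qzPCFF5FKRmJGw
Extraction code: g3yr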
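If you prefer to build the classifiers yourself, it may help to check the machine's memory against the 50 GB figure before starting. The following is a rough sketch that assumes a Linux system exposing /proc/meminfo; the function names `mem_total_gib` and `has_enough_memory` are hypothetical, and the check looks at total installed RAM, not at what is currently free.

```shell
# mem_total_gib [MEMINFO_FILE]
# Prints total RAM in whole GiB, read from the MemTotal line (reported in kB).
mem_total_gib() {
  awk '/^MemTotal:/ { print int($2 / 1048576) }' "${1:-/proc/meminfo}" 2>/dev/null
}

# has_enough_memory REQUIRED_GIB [MEMINFO_FILE]
# Succeeds only if total RAM is at least REQUIRED_GIB.
has_enough_memory() {
  local total
  total=$(mem_total_gib "${2:-/proc/meminfo}") || return 1
  [ -n "$total" ] && [ "$total" -ge "$1" ]
}

# 50 is the minimum suggested in the note above.
if has_enough_memory 50; then
  echo "Memory check passed: at least 50 GiB of RAM detected."
else
  echo "Less than 50 GiB of RAM detected (or /proc/meminfo is unavailable); consider downloading the prebuilt classifiers." >&2
fi
```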