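When running a gene family analysis with the human GFF3 file downloaded from Ensembl, the command that keeps only protein-coding genes does not seem to recognize the human GFF3 file (log below)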

[root@7f6d1fb335c2  10:35:58 /work/desaturase/01.data_prepare]# genome=Homo_sapiens.GRCh38.dna.toplevel.fa.gz

[root@7f6d1fb335c2  10:36:17 /work/desaturase/01.data_prepare]# gff=Homo_sapiens.GRCh38.112.chr.gff3.gz

[root@7f6d1fb335c2  10:36:31 /work/desaturase/01.data_prepare]# species=Homo_sapiens

[root@7f6d1fb335c2  10:36:39 /work/desaturase/01.data_prepare]# agat_sp_filter_feature_by_attribute_value.pl --gff  $gff --attribute biotype --value protein_coding -t '!' -o $species.protein_coding.gff3

Using standard /share/work/biosoft/perl/perl-5.22.1/lib/site_perl/5.22.1/auto/share/dist/AGAT/config.yaml file

08/01/2024 at 10h36m50s

usage: /share/work/biosoft/perl/latest/bin/agat_sp_filter_feature_by_attribute_value.pl --gff Homo_sapiens.GRCh38.112.chr.gff3.gz --attribute biotype --value protein_coding -t ! -o Homo_sapiens.protein_coding.gff3

We will discard all features that have the attribute biotype with the value ne protein_coding.

********************************************************************************

*                              - Start parsing -                               *

********************************************************************************

-------------------------- parse options and metadata --------------------------

=> Accessing the feature_levels YAML file

Using standard /share/work/biosoft/perl/perl-5.22.1/lib/site_perl/5.22.1/auto/share/dist/AGAT/feature_levels.yaml file

=> Attribute used to group features when no Parent/ID relationship exists (i.e common tag):

        * locus_tag

        * gene_id

=> merge_loci option deactivated

=> Machine information:

        This script is being run by perl v5.22.1

        Bioperl location being used: /share/work/biosoft/perl/perl-5.22.1/lib/site_perl/5.22.1/Bio/

        Operating system being used: linux

=> Accessing Ontology

        No ontology accessible from the gff file header!

        We use the SOFA ontology distributed with AGAT:

                /share/work/biosoft/perl/perl-5.22.1/lib/site_perl/5.22.1/auto/share/dist/AGAT/so.obo

        Read ontology /share/work/biosoft/perl/perl-5.22.1/lib/site_perl/5.22.1/auto/share/dist/AGAT/so.obo:

                4 root terms, and 2596 total terms, and 1516 leaf terms

        Filtering ontology:

                We found 1861 terms that are sequence_feature or is_a child of it.

--------------------------------- parsing file ---------------------------------

=> Number of line in file: 3516552

=> Number of comment lines: 63142

=> Fasta included: No

=> Number of features lines: 3453410

=> Number of feature type (3rd column): 25

        * Level1: 5 => gene chromosome biological_region pseudogene ncRNA_gene

        * level2: 16 => rRNA pseudogenic_transcript D_gene_segment J_gene_segment mRNA snoRNA lnc_RNA unconfirmed_transcript transcript scRNA tRNA miRNA processed_transcript C_gene_segment V_gene_segment snRNA

        * level3: 4 => CDS three_prime_UTR exon five_prime_UTR

        * unknown: 0 =>

=> Version of the Bioperl GFF parser selected by AGAT: 3

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10469   11240   1.3e+03 ..       external_name "oe = 0.79"  ; logic_name cpg

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10650   10657   0.999   +.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10655   10657   0.999   -.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10678   10687   0.999   +.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10681   10688   0.999   -.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10707   10716   0.999   +.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10708   10718   0.999   -.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10735   10747   0.999   -.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10737   10744   0.999   +.       logic_name eponine

gff3 reader error level1: No ID attribute found @ for the feature: 1    .       biological_region       10766   10773   0.999   +.       logic_name eponine

gff3 reader error level1: No ID attribute found  ************** Too much WARNING message we skip the next **************


1 answer

omicsgene - Bioinformatics
Expertise: resequencing, genetics and evolution, transcriptomics, GWAS

Check whether an output file was actually produced.
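One quick way to do that check is sketched below. It assumes the output name from the session above (`$species.protein_coding.gff3`); `check_gff3_output` is a hypothetical helper written for this post, not part of AGAT. It only confirms that the file is non-empty and counts the feature types in column 3.

```shell
# Hypothetical helper (not part of AGAT): confirm that a GFF3 file exists,
# is non-empty, and summarise its feature types (column 3).
check_gff3_output() {
    gff_out="$1"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$gff_out" ]; then
        echo "missing or empty: $gff_out" >&2
        return 1
    fi
    # Skip comment lines, count each feature type, print "type<TAB>count"
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ } END { for (t in n) print t "\t" n[t] }' "$gff_out" | sort
}

# Usage, assuming $species is still set as in the session above:
# check_gff3_output "$species.protein_coding.gff3"
```

If the summary lists `gene`, `mRNA`, `exon` and `CDS` rows, the filter wrote features and the warnings did not stop it.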

These messages are only warnings and can be ignored; as long as the output file is produced, the run is fine.
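Every warning shown in the log refers to a `biological_region` row that has no ID attribute. If the noise is bothersome, one optional tidy-up (my own assumption, not something AGAT requires) is to drop those rows before filtering. `drop_biological_region` and the output name in the usage line are hypothetical, and the sketch uses only `gzip` and `awk`.

```shell
# Hypothetical helper: write a (possibly gzipped) GFF3 to stdout, dropping
# rows whose type column (column 3) is biological_region.
# Comment and header lines (starting with '#') are kept.
drop_biological_region() {
    src_gff="$1"
    case "$src_gff" in
        *.gz) gzip -dc "$src_gff" ;;   # decompress gzipped input
        *)    cat "$src_gff" ;;        # plain-text input
    esac | awk -F'\t' '/^#/ || $3 != "biological_region"'
}

# Usage (assumed file name), followed by the same AGAT command as in the question:
# drop_biological_region "$gff" > "$species.no_bioregion.gff3"
```

This changes only which rows AGAT reads, so it is worth comparing the gene counts with and without the pre-filter before relying on it.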
