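vcftools filtering: apart from the allele-count options, none of the other parameters have any effect

Hello, I have been following the resequencing course pipeline step by step to process my data, and I ran into a problem at the last step, filtering with vcftools. I first filtered with these parameters: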

vcftools --gzvcf all.varFilter.vcf.gz --recode --recode-INFO-all --stdout \
    --maf 0.05 --max-missing 0.4 --minDP 4 --maxDP 1000 \
    --minQ 30 --minGQ 80 --min-alleles 2 --max-alleles 2 | gzip - > all.clean.vcf.gz

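The result was "After filtering, kept 0 out of a possible 118111 Sites" (with many warnings in between). So I split the parameters up and filtered with one at a time. Only --min-alleles 2 --max-alleles 2 filtered successfully ("kept 106816 out of a possible 118111 Sites"); every other parameter failed to remove anything, and the output was the same each time, as shown below: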

    --gzvcf 1.vcf.gz
    --recode-INFO-all
    --max-missing 0.8
    --recode
    --stdout

Using zlib version: 1.2.7
Warning: Expected at least 2 parts in FORMAT entry: ID=PGT,Number=1,Type
Warning: Expected at least 2 parts in FORMAT entry: ID=PID,Number=1,Type
Warning: Expected at least 2 parts in FORMAT entry: ID=PL,Number=G,Type=
Warning: Expected at least 2 parts in FORMAT entry: ID=RGQ,Number=1,Type
Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=In
Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=In
Warning: Expected at least 2 parts in INFO entry: ID=AF,Number=A,Type=Fl
Warning: Expected at least 2 parts in INFO entry: ID=AF,Number=A,Type=Fl
Warning: Expected at least 2 parts in INFO entry: ID=MLEAC,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAC,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAF,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAF,Number=A,Type
After filtering, kept 560 out of 560 Individuals
Outputting VCF file...
After filtering, kept 118111 out of a possible 118111 Sites
Run Time = 121.00 seconds

What could be the cause?


1 answer

omicsgene - Bioinformatics
Expertise: resequencing, genetics and evolution, transcriptomics, GWAS

The VCF format is probably the problem. Check the header comment lines that begin with ##; something looks wrong there.
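One way to inspect those ## lines is a small script like the one below. It is only a sketch, not part of vcftools: the function names `check_header` and `check_vcf_gz` are made up for illustration, and it assumes that each ##INFO and ##FORMAT line should carry the ID, Number, Type and Description keys described in the VCF specification. It also reports IDs that are declared more than once, since the log above prints each INFO warning twice. Whether either condition is what actually triggers the vcftools warnings is an assumption to verify against your own file.

```python
import gzip
import re
from collections import Counter

# Keys the VCF specification lists for ##INFO and ##FORMAT meta lines.
REQUIRED_KEYS = ("ID", "Number", "Type", "Description")
META_RE = re.compile(r'^##(INFO|FORMAT)=<(.*)>\s*$')
# key=value pairs; a value is either a double-quoted string or runs to the next comma.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|[^,]*)')


def check_header(lines):
    """Return a list of problems found in the ##INFO/##FORMAT header lines."""
    problems = []
    seen = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.startswith("##"):
            break  # meta lines end at the #CHROM line
        if not line.startswith(("##INFO=", "##FORMAT=")):
            continue
        match = META_RE.match(line)
        if match is None:
            problems.append(f"not wrapped in <...>: {line}")
            continue
        kind, body = match.groups()
        fields = dict(PAIR_RE.findall(body))
        missing = [key for key in REQUIRED_KEYS if key not in fields]
        if missing:
            problems.append(
                f"{kind} {fields.get('ID', '?')}: missing {', '.join(missing)}"
            )
        if "ID" in fields:
            seen[(kind, fields["ID"])] += 1
    for (kind, ident), count in seen.items():
        if count > 1:
            problems.append(f"{kind} {ident}: declared {count} times")
    return problems


def check_vcf_gz(path):
    """Run check_header on a gzip-compressed VCF file."""
    with gzip.open(path, "rt") as handle:
        return check_header(handle)
```

For the file in the question this would be called as `check_vcf_gz("all.varFilter.vcf.gz")`; an empty list means the script found nothing suspicious in the INFO and FORMAT declarations.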

I suggest dropping this parameter and trying again: --minGQ 80
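To judge how strict --minGQ 80 is for a given file, it may help to look at how many genotypes would survive it. As I understand the VCFtools manual, --minGQ (like --minDP) excludes individual genotypes rather than whole sites, so on its own it can leave the site count unchanged, while combined with --max-missing it can remove many sites. The sketch below is a hypothetical helper (`gq_pass_fraction` is not a vcftools function) that reads VCF text lines and reports the fraction of genotypes whose GQ is at or above a threshold. It assumes GQ is stored as an integer in the FORMAT column and skips genotypes with no numeric GQ.

```python
def gq_pass_fraction(lines, min_gq):
    """Fraction of genotypes with a numeric GQ that is >= min_gq."""
    total = 0
    passed = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta lines and the #CHROM header
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns on this line
        keys = cols[8].split(":")
        if "GQ" not in keys:
            continue  # this site carries no GQ field
        gq_index = keys.index("GQ")
        for sample in cols[9:]:
            values = sample.split(":")
            # Trailing fields may be dropped, and missing values are written as "."
            if gq_index >= len(values) or not values[gq_index].isdigit():
                continue
            total += 1
            if int(values[gq_index]) >= min_gq:
                passed += 1
    return passed / total if total else 0.0
```

With a gzip-compressed file, the lines can be supplied through `gzip.open(path, "rt")`. If the fraction at 80 is small, a lower threshold, or no --minGQ at all, may be more appropriate for the data.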
