The VCF file itself probably has a formatting problem. Check whether the header lines that start with ## are malformed.
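One way to do that check is to print only the ##INFO and ##FORMAT lines, since those are the entries the warnings in the question complain about. This is a minimal sketch: the function name `show_vcf_header` is made up for illustration, and it assumes the file is gzip- or bgzip-compressed. For comparison, a well-formed entry in the VCF specification looks like `##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">`.

```shell
# Hypothetical helper (not part of vcftools): print the ##INFO and ##FORMAT
# meta-information lines of a gzipped VCF so they can be inspected by eye.
show_vcf_header() {
    # gzip -dc decompresses to stdout; awk prints matching header lines
    # and stops reading at the first line that does not start with "#".
    gzip -dc "$1" | awk '/^##(INFO|FORMAT)=/ {print} !/^#/ {exit}'
}

# Example call, using the file name from the question:
# show_vcf_header all.varFilter.vcf.gz
```

If any printed entry is cut off or is missing its Type or Description part, that would be consistent with the "Expected at least 2 parts" warnings, though the warnings alone do not prove the header is the cause of the filtering result.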
I suggest removing this parameter and trying again: --minGQ 80
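To judge whether a GQ threshold of 80 is too strict for this data set, it can help to count how many genotypes actually reach it. The sketch below is only an illustration: `gq_summary` is a hypothetical helper, and it assumes a gzip-compressed VCF with tab-separated columns and FORMAT in column 9. As far as I understand the vcftools documentation, --minGQ excludes individual genotypes below the threshold, and those excluded genotypes then count as missing when --max-missing is evaluated, so a low pass count here would be one plausible reason for losing every site.

```shell
# Hypothetical helper: report how many genotypes meet a GQ threshold.
# Usage: gq_summary FILE.vcf.gz [THRESHOLD]   (THRESHOLD defaults to 80)
gq_summary() {
    gzip -dc "$1" | awk -F'\t' -v min="${2:-80}" '
        /^#/ { next }                      # skip all header lines
        {
            # Find the position of GQ inside the FORMAT column (column 9).
            n = split($9, fmt, ":"); gq = 0
            for (i = 1; i <= n; i++) if (fmt[i] == "GQ") gq = i
            # Walk over every sample column (column 10 onwards).
            for (s = 10; s <= NF; s++) {
                total++
                m = split($s, val, ":")
                # A genotype passes only if GQ is present, not ".", and >= min.
                if (gq && gq <= m && val[gq] != "." && val[gq] + 0 >= min) pass++
            }
        }
        END { printf "genotypes=%d pass=%d\n", total, pass }'
}

# Example call, using the file name from the question:
# gq_summary all.varFilter.vcf.gz 80
```

Running it with a few different thresholds (for example 20, 40, 80) gives a rough picture of how much data each setting would discard before the site-level filters are applied.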
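Hello, teacher. I followed the resequencing course pipeline step by step to process my data, and ran into a problem at the last step, filtering with vcftools. I first filtered with these parameters: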
vcftools --gzvcf all.varFilter.vcf.gz --recode --recode-INFO-all --stdout \
--maf 0.05 --max-missing 0.4 --minDP 4 --maxDP 1000 \
--minQ 30 --minGQ 80 --min-alleles 2 --max-alleles 2 |gzip - > all.clean.vcf.gz
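The result was "After filtering, kept 0 out of a possible 118111 Sites" (with many warnings in between). So I split the parameters apart and filtered with only one at a time. Only --min-alleles 2 --max-alleles 2 filtered successfully (kept 106816 out of a possible 118111 Sites). Every other parameter failed, and the output was the same each time, as shown here: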
--gzvcf 1.vcf.gz
--recode-INFO-all
--max-missing 0.8
--recode
--stdout

Using zlib version: 1.2.7
Warning: Expected at least 2 parts in FORMAT entry: ID=PGT,Number=1,Type
Warning: Expected at least 2 parts in FORMAT entry: ID=PID,Number=1,Type
Warning: Expected at least 2 parts in FORMAT entry: ID=PL,Number=G,Type=
Warning: Expected at least 2 parts in FORMAT entry: ID=RGQ,Number=1,Type
Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=In
Warning: Expected at least 2 parts in INFO entry: ID=AC,Number=A,Type=In
Warning: Expected at least 2 parts in INFO entry: ID=AF,Number=A,Type=Fl
Warning: Expected at least 2 parts in INFO entry: ID=AF,Number=A,Type=Fl
Warning: Expected at least 2 parts in INFO entry: ID=MLEAC,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAC,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAF,Number=A,Type
Warning: Expected at least 2 parts in INFO entry: ID=MLEAF,Number=A,Type
After filtering, kept 560 out of 560 Individuals
Outputting VCF file...
After filtering, kept 118111 out of a possible 118111 Sites
Run Time = 121.00 seconds
Could you tell me what the likely cause is?