Question about building an ANNOVAR database

[root@108189d98d67  07:48:18 /work/my_reseq/ref]# sh index.sh pearref.fa peargff.gff

REF: pearref.fa

GFF: peargff.gff

GTF:


build Index  start:

RNN CMD:samtools faidx pearref.fa

RNN CMD: picard CreateSequenceDictionary R=pearref.fa O=pearref.dict

INFO    2024-04-21 07:49:38     CreateSequenceDictionary


********** NOTE: Picard's command line syntax is changing.

**********

********** For more information, please see:

********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)

**********

********** The command line looks like this in the new syntax:

**********

**********    CreateSequenceDictionary -R pearref.fa -O pearref.dict

**********



07:49:38.581 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/biosoft/miniconda/envs/reseq/share/picard-2.22.1-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so

[Sun Apr 21 07:49:38 CST 2024] CreateSequenceDictionary OUTPUT=pearref.dict REFERENCE=pearref.fa    TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false

[Sun Apr 21 07:49:38 CST 2024] Executing as root@108189d98d67 on Linux 5.15.0-89-generic amd64; OpenJDK 64-Bit Server VM 11.0.1-internal+0-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.22.1

[Sun Apr 21 07:49:40 CST 2024] picard.sam.CreateSequenceDictionary done. Elapsed time: 0.04 minutes.

Runtime.totalMemory()=536870912

RNN CMD: bwa index pearref.fa

[bwa_index] Pack FASTA... 4.67 sec

[bwa_index] Construct BWT for the packed sequence...

[BWTIncCreate] textLength=1001242816, availableWord=82451092

[BWTIncConstructFromPacked] 10 iterations done. 100000000 characters processed.

[BWTIncConstructFromPacked] 20 iterations done. 200000000 characters processed.

[BWTIncConstructFromPacked] 30 iterations done. 300000000 characters processed.

[BWTIncConstructFromPacked] 40 iterations done. 396720512 characters processed.

[BWTIncConstructFromPacked] 50 iterations done. 482982368 characters processed.

[BWTIncConstructFromPacked] 60 iterations done. 559648096 characters processed.

[BWTIncConstructFromPacked] 70 iterations done. 627784816 characters processed.

[BWTIncConstructFromPacked] 80 iterations done. 688340880 characters processed.

[BWTIncConstructFromPacked] 90 iterations done. 742159216 characters processed.

[BWTIncConstructFromPacked] 100 iterations done. 789989104 characters processed.

[BWTIncConstructFromPacked] 110 iterations done. 832496400 characters processed.

[BWTIncConstructFromPacked] 120 iterations done. 870272944 characters processed.

[BWTIncConstructFromPacked] 130 iterations done. 903844848 characters processed.

[BWTIncConstructFromPacked] 140 iterations done. 933679616 characters processed.

[BWTIncConstructFromPacked] 150 iterations done. 960192768 characters processed.

[BWTIncConstructFromPacked] 160 iterations done. 983753712 characters processed.

[bwt_gen] Finished constructing BWT in 169 iterations.

[bwa_index] 330.23 seconds elapse.

[bwa_index] Update BWT... 4.96 sec

[bwa_index] Pack forward-only FASTA... 3.19 sec

[bwa_index] Construct SA from BWT and Occ... 130.46 sec

[main] Version: 0.7.17-r1188

[main] CMD: bwa index pearref.fa

[main] Real time: 483.211 sec; CPU: 473.519 sec


gtf file not provide, try get gtf from gff:

RUN CMD: gffread  peargff.gff -T -o peargff.gtf

build ANNOVAR index

RUN CMD: gtfToGenePred -genePredExt peargff.gtf unknown_refGene.txt

RUN CMD: retrieve_seq_from_fasta.pl --format refGene --seqfile pearref.fa  unknown_refGene.txt --out unknown_refGeneMrna.fa

NOTICE: Reading region file unknown_refGene.txt ... Done with 56553 regions from 17 chromosomes

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished reading 17 sequences from pearref.fa

NOTICE: Finished writting FASTA for 56553 genomic regions to unknown_refGeneMrna.fa

WARNING: 4 gene regions do not have complete ORF (for example, Chr3.g18794.m1Chr3:21366188, Chr13.g21997.m1Chr13:20997898, Chr2.g43411.m1Chr2:19982045, Chr6.g52176.m1Chr6:14420538)

Could you please check whether the database was built correctly?

1 answer

omicsgene - Bioinformatics
Specialties: resequencing, genetics and evolution, transcriptomics, GWAS

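Judging from the output log, the database was built without problems. If annotation still goes wrong, the cause is probably the format of your VCF file, and that would need some debugging to track down.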
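As a starting point for that debugging, here is a minimal sketch of a VCF sanity check. It assumes the `pearref.fa.fai` index written by `samtools faidx` is available (its first column is the sequence name) and it only looks for a few common problems: a missing `#CHROM` header, records with fewer than 8 tab-separated columns, a non-integer POS, and chromosome names that do not appear in the reference. The function names `fai_chroms` and `vcf_problems` are made up for this illustration and are not part of ANNOVAR or samtools; a clean result does not guarantee that the VCF is valid.

```python
def fai_chroms(fai_lines):
    """Collect sequence names from a samtools .fai index (column 1)."""
    return {line.split("\t")[0] for line in fai_lines if line.strip()}


def vcf_problems(vcf_lines, known_chroms):
    """Return a list of human-readable problems found in a VCF stream.

    Hypothetical helper: it checks only basic structure, not the full
    VCF specification.
    """
    problems = []
    seen_header = False
    reported = set()  # report each unknown chromosome name only once
    for n, line in enumerate(vcf_lines, 1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            seen_header = True
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            problems.append(
                f"line {n}: expected at least 8 tab-separated columns, "
                f"got {len(fields)}"
            )
            continue
        chrom, pos = fields[0], fields[1]
        if not pos.isdigit():
            problems.append(f"line {n}: POS is not an integer: {pos!r}")
        if chrom not in known_chroms and chrom not in reported:
            reported.add(chrom)
            problems.append(
                f"line {n}: chromosome {chrom!r} is not in the reference"
            )
    if not seen_header:
        problems.append("missing #CHROM header line")
    return problems
```

To use it, open both files as text and pass the file objects in, for example `vcf_problems(open("sample.vcf"), fai_chroms(open("pearref.fa.fai")))`, where `sample.vcf` is a placeholder for your own uncompressed VCF. A mismatch such as `chr1` in the VCF versus `Chr1` in the reference is worth ruling out first, because the gene annotation in `unknown_refGene.txt` uses the reference names.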

  • Asked by huohongliang on 2024-04-22 16:14
