举个例子:从数据库中下载基因组数据,基因组.gff文件中染色体ID较为复杂(第一列)
从全基因组序列.genome.fa文件中找到染色体ID对应简写
想将gff文档中染色体ID全部替换成第二列LC*命名
use Data::Dumper; use Getopt::Long; use strict; use Cwd qw(abs_path getcwd); my %opts; GetOptions (\%opts,"idlist=s","gff=s","out=s"); if(!defined($opts{idlist})||!defined($opts{gff})||!defined($opts{out})){ &USAGE; } ######################get gff############################## my $gff; open(IN,"$opts{gff}")||die "open gff failed\n"; { local $/=undef; $gff=<IN>; close(IN); } #####################get idlist for changing############### open(IN,"$opts{idlist}")|| die "open ID file failed\n"; while(<IN>){ chomp; my($newID,$gene)=split(/\s+/,$_); if($gff=~/$gene/){ $gff=~ s/$gene/$newID/g; } } close(IN); #####################print out new gff####################### open(OUT, ">$opts{out}")||die "open new gff failed\n"; print OUT $gff; close(OUT); sub USAGE{ print "usage:perl gffID.pl -idlist idlist.txt -gff genome.gff -out rename.gff \n"; exit; }
如果觉得我的文章对您有用,请随意打赏。你的支持将鼓励我继续创作!