After merging transcripts with Cufflinks or StringTie, how do you filter for novel transcripts? - 组学大讲堂 Q&A community


Class codes in the GTF file annotate each transcript's type

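When analyzing long non-coding RNA (lncRNA) transcriptome data, transcripts are usually assembled independently for each sample and then merged with cuffmerge, which produces a single unified gene annotation GTF file.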
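The merge step might be scripted roughly like this. It is only a sketch under assumptions: the file names (`sampleA.gtf`, `ref.gtf`, `genome.fa`) and the helper `build_cuffmerge_command` are hypothetical, and the code only builds the command rather than running it. The options used are cuffmerge's `-g` (reference annotation), `-s` (reference sequence), `-p` (threads), and `-o` (output directory).

```python
from pathlib import Path


def build_cuffmerge_command(sample_gtfs, ref_gtf, genome_fasta,
                            list_path="assemblies.txt",
                            out_dir="merged_asm", threads=4):
    """Write the assembly list file and return the cuffmerge argument list.

    All paths are placeholders (assumptions); adjust them to your project.
    """
    # cuffmerge reads a text file that lists one per-sample GTF path per line.
    Path(list_path).write_text("\n".join(sample_gtfs) + "\n")
    return [
        "cuffmerge",
        "-g", ref_gtf,        # reference annotation GTF
        "-s", genome_fasta,   # reference genome sequence
        "-p", str(threads),   # number of threads
        "-o", out_dir,        # output directory (merged.gtf is written here)
        list_path,
    ]


# Example with made-up file names:
#   cmd = build_cuffmerge_command(["sampleA.gtf", "sampleB.gtf"],
#                                 "ref.gtf", "genome.fa")
#   subprocess.run(cmd, check=True)   # needs `import subprocess` and cuffmerge installed
```

The merged annotation then ends up as `merged.gtf` inside the output directory.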

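So how do we pick out the novel transcripts? A good starting point is the class code stored in the GTF file, which records where each assembled transcript lies relative to the known (reference) transcripts.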
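To make this concrete, here is a minimal sketch of reading the `class_code` attribute from one GTF record. The example record is made up in the style of a cuffmerge `merged.gtf` line, and the helper name `parse_gtf_attributes` is hypothetical. It assumes the usual `key "value";` attribute syntax in the ninth column.

```python
import re

# GTF attributes (9th column) look like: key "value"; key "value";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_attributes(line):
    """Return the attribute column of one GTF line as a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError("expected 9 tab-separated GTF fields")
    return dict(_ATTR_RE.findall(fields[8]))


# A made-up record for illustration only (not real data).
example = (
    'chr1\tCufflinks\texon\t1000\t1500\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; '
    'exon_number "1"; oId "CUFF.1.1"; class_code "u"; tss_id "TSS1";'
)

attrs = parse_gtf_attributes(example)
print(attrs["transcript_id"], attrs["class_code"])
```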

Priority	Code	Description
1	=	Complete match of intron chain
2	c	Contained
3	j	Potentially novel isoform (fragment): at least one splice junction is shared with a reference transcript
4	e	Single exon transfrag overlapping a reference exon and at least 10 bp of a reference intron, indicating a possible pre-mRNA fragment.
5	i	A transfrag falling entirely within a reference intron
6	o	Generic exonic overlap with a reference transcript
7	p	Possible polymerase run-on fragment (within 2Kbases of a reference transcript)
8	r	Repeat. Currently determined by looking at the soft-masked reference sequence and applied to transcripts where at least 50% of the bases are lower case
9	u	Unknown, intergenic transcript
10	x	Exonic overlap with reference on the opposite strand
11	s	An intron of the transfrag overlaps a reference intron on the opposite strand (likely due to read mapping errors)
12	.	(.tracking file only, indicates multiple classifications)

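Based on this class_code, we generally select three types of transcripts:

i: transcripts that fall entirely within an intron of a known gene

u: novel transcripts in intergenic regions

x: antisense transcripts that overlap known exons on the opposite strand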
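The selection above could be done with a short script along these lines. This is a sketch, not a definitive implementation: the function name `filter_novel_transcripts` and the file names in the trailing comment are hypothetical, and it assumes each transcript has at least one record carrying both `transcript_id` and `class_code` in the attribute column.

```python
import re

NOVEL_CLASS_CODES = {"i", "u", "x"}  # intronic, intergenic, antisense
_ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def _attributes(line):
    """Parse the 9th GTF column into a dict; non-record lines give {}."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return {}
    return dict(_ATTR_RE.findall(fields[8]))


def filter_novel_transcripts(lines, wanted=NOVEL_CLASS_CODES):
    """Return the GTF lines belonging to transcripts with a wanted class code."""
    lines = list(lines)
    # Pass 1: collect the IDs of transcripts whose class_code is wanted.
    keep_ids = set()
    for line in lines:
        attrs = _attributes(line)
        tid = attrs.get("transcript_id")
        if tid is not None and attrs.get("class_code") in wanted:
            keep_ids.add(tid)
    # Pass 2: keep every record of those transcripts, including records that
    # do not repeat class_code themselves. Comment lines are dropped.
    return [ln for ln in lines
            if _attributes(ln).get("transcript_id") in keep_ids]


# Usage with hypothetical file names:
#   with open("merged.gtf") as src, open("novel.gtf", "w") as dst:
#       dst.writelines(filter_novel_transcripts(src))
```

Filtering in two passes keeps all exons of a selected transcript together, even if only some of its records carry the class code.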

Posted on 2018-06-28 09:51
Category: Transcriptome
