usage: tcga_gene_exp_download.r [-h] -p project [-f files.per.chunk]
                                [-o outdir]
Download TCGA gene expression data, and download and tidy the matching clinical data. Detailed help: https://www.omicsclass.com/article/1485
optional arguments:
  -h, --help            show this help message and exit
  -p project, --project project
                        input project ID of TCGA, for example TCGA-STAD; more
                        project IDs: https://www.omicsclass.com/article/1061
                        [required]
  -f files.per.chunk, --files.per.chunk files.per.chunk
                        This will make the API method only download n
                        (files.per.chunk) files at a time. This may reduce
                        download problems when the data size is too large
                        [default 10]
  -o outdir, --outdir outdir
                        output file directory [default cwd]
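To make the -f option concrete, here is a minimal base-R sketch of what "download n files at a time" means: the file list is cut into chunks of at most n entries and each chunk is requested separately. The option presumably maps onto the `files.per.chunk` argument of `GDCdownload` in TCGAbiolinks, but that is an assumption; the script's source is not shown here. The file IDs below are made up, and the loop only reports what it would request.

```r
# Hypothetical file IDs standing in for the files a GDC query would return.
file_ids <- sprintf("file_%02d", 1:23)
files_per_chunk <- 10  # mirrors the -f default

# Give every file a chunk number: the first 10 get 1, the next 10 get 2, ...
chunk_index <- ceiling(seq_along(file_ids) / files_per_chunk)
chunks <- split(file_ids, chunk_index)

for (i in seq_along(chunks)) {
  # A real downloader would request chunks[[i]] from the API at this point.
  message(sprintf("chunk %d: %d files", i, length(chunks[[i]])))
}
```

With 23 files and a chunk size of 10, this gives three requests of 10, 10 and 3 files. Smaller chunks mean more requests, but each one is less likely to fail on a large transfer.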
The R package used for downloading is TCGAbiolinks.
Long non-coding gene annotation (GENCODE v22): https://gdc.cancer.gov/about-data/gdc-data-processing/gdc-reference-files
https://www.gencodegenes.org/human/release_22.html
Rscript tcga_gene_exp_download.r -p TCGA-STAD
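Note: gene IDs in the gene expression file have already been replaced with gene symbols; where a gene symbol occurs more than once, one entry is picked at random to represent it.
Sample information file:
Gene ID mapping file: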
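The note above about duplicated gene symbols can be illustrated with a small base-R sketch. It shuffles the rows and keeps the first row seen for each symbol, which is one way to pick a random representative. This is an assumption about the approach, not the script's actual code, and the table contents (IDs, symbols, values) are invented for illustration.

```r
set.seed(42)  # only so the sketch is repeatable; the pick is still arbitrary

# Toy expression table; every ID, symbol and value here is made up.
expr <- data.frame(
  gene_id = c("ENSG_a", "ENSG_b", "ENSG_c", "ENSG_d"),
  symbol  = c("GENE1", "GENE2", "GENE1", "GENE3"),
  sample1 = c(5, 0, 7, 2),
  sample2 = c(3, 1, 9, 4),
  stringsAsFactors = FALSE
)

# Shuffle the rows, then keep the first row encountered for each symbol.
shuffled <- expr[sample(nrow(expr)), ]
dedup <- shuffled[!duplicated(shuffled$symbol), ]

# Symbols are unique now, so they can be used as row names.
rownames(dedup) <- dedup$symbol
dedup <- dedup[, c("sample1", "sample2")]
print(dedup)
```

Because the representative is random, two runs can keep different rows for the same symbol unless a seed is fixed. If that matters for downstream analysis, a deterministic rule (for example, keeping the row with the highest mean expression) is a common alternative.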