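subset_column.r: extract columns from a data matrix by an ID list

subset_column.r takes a list of IDs and extracts the subset of a data matrix whose column names match those IDs.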

Usage:

usage: /share/nas1/huangls/test/TCGA_GEO_immu/scripts/subset_column.r
       [-h] -i input -m idlist [-c colname] [-o outdir] [-p prefix]
get data by column ID
optional arguments:
  -h, --help            show this help message and exit
  -i input, --input input
                        input matrix data [required]
  -m idlist, --idlist idlist
                        input column ID list ,default first column [required]
  -c colname, --colname colname
                        match column name in idlist [required]
  -o outdir, --outdir outdir
                        output file directory [default cwd]
  -p prefix, --prefix prefix
                        output file name prefix [default demo]


Parameter description:

-i: the input table to process. The first column is used as the row names. Example:

TCGA-D7-A74A-01A-11R-A32D-31 TCGA-BR-7704-01A-11R-2055-13 TCGA-VQ-A91N-01A-11R-A414-31 TCGA-CD-A4MH-01A-11R-A251-31
TSPAN6 19.71229 15.16812 58.95336274 112.4479129
TNMD 0.128679 0 0 0.417115888
DPM1 166.1934 229.3096 160.0258308 223.6269896
SCYL3 8.721258 3.71577 18.44681977 8.307169238
C1orf112 6.935284 2.181263 16.46792983 7.387781561
FGR 2.251234 18.9202 3.145843585 7.801411858
CFH 3.535552 14.03032 12.53080064 10.27682391
FUCA2 103.1044 119.298 128.6210801 138.7011133
GCLC 30.47744 25.12227 30.24963988 25.52302692
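
As an illustration of the row-name convention above, here is a minimal Python sketch of reading such a table. It is not the script's own R code, which is not shown in this post. It assumes the file is tab-separated and that, as in the sample, the header row may have one fewer field than the data rows. `read_matrix` is a hypothetical helper name, and values are kept as strings.

```python
import csv
import io


def read_matrix(handle):
    """Read a tab-separated matrix; the first column becomes the row names.

    Returns (column_names, rows), where rows maps row name -> list of values.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = {}
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        rows[fields[0]] = fields[1:]
    # If the header is as wide as a full data row, its first field only
    # labels the row-name column, so drop it from the column names.
    first = next(iter(rows.values()), None)
    if first is not None and len(header) == len(first) + 1:
        header = header[1:]
    return header, rows


# Two columns taken from the sample above (assumed tab-separated).
sample = (
    "TCGA-D7-A74A-01A-11R-A32D-31\tTCGA-BR-7704-01A-11R-2055-13\n"
    "TSPAN6\t19.71229\t15.16812\n"
    "TNMD\t0.128679\t0\n"
)
columns, rows = read_matrix(io.StringIO(sample))
print(columns)
print(rows["TNMD"])
```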

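-m: the file containing the ID list. -c colname: which column of that file to use as the IDs. The usage line shows -c as optional, and per the help text the first column is used by default. Example: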

barcode patient shortLetterCode definition stage
TCGA-B7-A5TK-01A-12R-A36D-31 TCGA-B7-A5TK TP Primary solid Tumor Stage IIIA
TCGA-BR-7959-01A-11R-2343-13 TCGA-BR-7959 TP Primary solid Tumor Stage IIIA
TCGA-IN-8462-01A-11R-2343-13 TCGA-IN-8462 TP Primary solid Tumor Stage IIB
TCGA-BR-A4CR-01A-11R-A24K-31 TCGA-BR-A4CR TP Primary solid Tumor Stage IIIC
TCGA-CG-4443-01A-01R-1157-13 TCGA-CG-4443 TP Primary solid Tumor Stage IA
TCGA-KB-A93J-01A-11R-A39E-31 TCGA-KB-A93J TP Primary solid Tumor Stage II
TCGA-IN-A6RO-01A-12R-A33Y-31 TCGA-IN-A6RO TP Primary solid Tumor Stage IA
TCGA-HU-A4H3-01A-21R-A251-31 TCGA-HU-A4H3 TP Primary solid Tumor Stage IIIC
TCGA-RD-A8MV-01A-11R-A36D-31 TCGA-RD-A8MV TP Primary solid Tumor Stage IIIB
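
The selection described above can be sketched as follows. This is a Python illustration under stated assumptions, not the script's actual R implementation: the ID file is assumed to be tab-separated with a header row, IDs that are absent from the matrix are assumed to be skipped, and the output columns are assumed to follow the order of the ID list. `read_ids` and `subset_columns` are hypothetical names, and the data is a toy example.

```python
import csv
import io


def read_ids(handle, colname=None):
    """Return the IDs from a tab-separated table that has a header row.

    colname selects the column; None falls back to the first column.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    index = 0 if colname is None else header.index(colname)
    return [fields[index] for fields in reader if fields]


def subset_columns(columns, rows, ids):
    """Keep only the matrix columns whose names appear in ids.

    Assumption: IDs missing from the matrix are skipped, and the result
    follows the order of the ID list.
    """
    position = {name: i for i, name in enumerate(columns)}
    keep = [name for name in ids if name in position]
    subset = {
        row_name: [values[position[name]] for name in keep]
        for row_name, values in rows.items()
    }
    return keep, subset


# Toy data: S9 is in the ID list but not in the matrix.
id_table = "barcode\tpatient\nS2\tP2\nS9\tP9\nS1\tP1\n"
columns = ["S1", "S2", "S3"]
rows = {"TSPAN6": ["19.7", "15.2", "59.0"], "TNMD": ["0.13", "0", "0"]}

ids = read_ids(io.StringIO(id_table), colname="barcode")
kept, subset = subset_columns(columns, rows, ids)
print(kept)
print(subset["TSPAN6"])
```

Whether the real script warns about, drops, or fails on IDs that have no matching column is not stated in the post, so check its output before relying on the column count.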

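Example command (-c is omitted here, so per the help text the first column of metadata.tsv is used as the ID list):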


Rscript subset_column.r -i ../01.TCGA_download/TCGA-STAD_gene_expression_TPM.tsv -m metadata.tsv -p TCGA-STAD_gene_expression_TPM


  • Published 2021-06-11 14:44
  • Category: TCGA

omicsgene