merge_metadata_genexpdata.r: adding gene expression data to metadata


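This script adds gene expression results to the metadata. In the gene expression file, rows are genes and columns are samples. Any file whose columns are samples can be merged into the metadata with this script.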
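
To make the merge concrete, here is a minimal Python sketch of the idea (not the R script's actual code, which is not shown in this post). It transposes a table whose columns are samples so that each sample becomes one row, then joins those rows onto the metadata by the sample ID column. The function name `merge_by_sample` is hypothetical, the inline tables reuse a few values from the example files in this post, and keeping only samples present in both inputs is an assumption about the join behavior.

```python
import csv
import io


def merge_by_sample(metadata_tsv, expr_tsv, by):
    """Join a features-by-samples table onto metadata rows keyed by sample ID."""
    meta_rows = list(csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t"))
    expr_rows = list(csv.reader(io.StringIO(expr_tsv), delimiter="\t"))

    header, body = expr_rows[0], expr_rows[1:]
    sample_ids = header[1:]  # the first header cell names the feature column

    # Transpose: build one {feature: value} dict per sample.
    per_sample = {sid: {} for sid in sample_ids}
    for row in body:
        feature, values = row[0], row[1:]
        for sid, value in zip(sample_ids, values):
            per_sample[sid][feature] = float(value)

    # Assumption: keep only metadata rows whose sample ID has values (inner join).
    merged = []
    for meta in meta_rows:
        sid = meta[by]
        if sid in per_sample:
            merged.append({**meta, **per_sample[sid]})
    return merged


metadata_tsv = (
    "barcode\tpatient\tshortLetterCode\n"
    "TCGA-B7-A5TK-01A-12R-A36D-31\tTCGA-B7-A5TK\tTP\n"
    "TCGA-BR-7959-01A-11R-2343-13\tTCGA-BR-7959\tTP\n"
)
expr_tsv = (
    "cell_type\tTCGA-B7-A5TK-01A-12R-A36D-31\tTCGA-BR-7959-01A-11R-2343-13\n"
    "B cells naive\t0.041806\t0.119034\n"
    "Plasma cells\t0.005778\t0.009967\n"
)

merged = merge_by_sample(metadata_tsv, expr_tsv, by="barcode")
for row in merged:
    print(row)
```

Each merged row carries the original metadata columns plus one new column per row of the merged file, which mirrors what `-b barcode` asks the script to do.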

$ Rscript $scriptdir/merge_metadata_genexpdata.r -h
usage: /share/nas1/huangls/test/TCGA_immu/scripts/merge_metadata_genexpdata.r
       [-h] -m metadata -g expset -b by [--log2] [-o outdir] [-p prefix]
merge metadata and gene expression:
optional arguments:
  -h, --help            show this help message and exit
  -m metadata, --metadata metadata
                        input metadata file path with suvival time [required]
  -g expset, --expset expset
                        input gene expression set file [required]
  -b by, --by by        input sample ID column name in metadata [required]
  --log2                whether do log2 transfrom for expression data
                        [optional, default: False]
  -o outdir, --outdir outdir
                        output file directory [default cwd]
  -p prefix, --prefix prefix
                        out file name prefix [default cox]
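
The `--log2` flag applies a log2 transform to the expression values before merging. The help text does not say which formula the script uses. A common convention is log2(x + 1), which keeps zero values finite, and the Python sketch below assumes that pseudocount. The function name `log2_transform` and the input values are purely illustrative.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Return log2(x + pseudocount) per value; the pseudocount is an assumption."""
    return [math.log2(v + pseudocount) for v in values]


# Illustrative inputs only, not data from this post.
print(log2_transform([0.0, 1.0, 3.0, 7.0]))  # prints [0.0, 1.0, 2.0, 3.0]
```

If the input file is already log-transformed or holds proportions rather than expression values, leaving `--log2` off is probably the safer choice.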


-m  specifies the metadata file (tab-separated, one row per sample):

barcode	patient	sample	shortLetterCode	definition	sample_submitter_id	sample_type_id	sample_id	sample_type
TCGA-B7-A5TK-01A-12R-A36D-31	TCGA-B7-A5TK	TCGA-B7-A5TK-01A	TP	Primary solid Tumor	TCGA-B7-A5TK-01A	1	58937d2c-b4c3-4992-a95c-d0d1fa73f1a9	Primary Tumor
TCGA-BR-7959-01A-11R-2343-13	TCGA-BR-7959	TCGA-BR-7959-01A	TP	Primary solid Tumor	TCGA-BR-7959-01A	1	c8fc5fb2-ded2-48af-a87d-36c367a3330d	Primary Tumor
TCGA-IN-8462-01A-11R-2343-13	TCGA-IN-8462	TCGA-IN-8462-01A	TP	Primary solid Tumor	TCGA-IN-8462-01A	1	26509d1e-253b-463c-8654-589929889fbe	Primary Tumor
TCGA-BR-A4CR-01A-11R-A24K-31	TCGA-BR-A4CR	TCGA-BR-A4CR-01A	TP	Primary solid Tumor	TCGA-BR-A4CR-01A	1	76165733-85ca-47d0-a82a-f64fa9b1b834	Primary Tumor
TCGA-CG-4443-01A-01R-1157-13	TCGA-CG-4443	TCGA-CG-4443-01A	TP	Primary solid Tumor	TCGA-CG-4443-01A	1	f4fb736a-42c9-4367-a327-d0d1c4cba359	Primary Tumor
TCGA-KB-A93J-01A-11R-A39E-31	TCGA-KB-A93J	TCGA-KB-A93J-01A	TP	Primary solid Tumor	TCGA-KB-A93J-01A	1	888711a8-8ffd-49bb-aa85-7455c07f1ad5	Primary Tumor
TCGA-BR-4371-01A-01R-1157-13	TCGA-BR-4371	TCGA-BR-4371-01A	TP	Primary solid Tumor	TCGA-BR-4371-01A	1	95d6e839-a21b-4266-b7f8-e47fab262af3	Primary Tumor
TCGA-IN-A6RO-01A-12R-A33Y-31	TCGA-IN-A6RO	TCGA-IN-A6RO-01A	TP	Primary solid Tumor	TCGA-IN-A6RO-01A	1	c6e17043-a145-4cc3-b889-4f499b17dbf3	Primary Tumor
TCGA-HU-A4H3-01A-21R-A251-31	TCGA-HU-A4H3	TCGA-HU-A4H3-01A	TP	Primary solid Tumor	TCGA-HU-A4H3-01A	1	cd33e854-1bdf-42e0-83e7-256c723c5b55	Primary Tumor
TCGA-RD-A8MV-01A-11R-A36D-31	TCGA-RD-A8MV	TCGA-RD-A8MV-01A	TP	Primary solid Tumor	TCGA-RD-A8MV-01A	1	f7a464d1-9939-4ab8-a03b-f2962e618817	Primary Tumor
TCGA-VQ-A91X-01A-12R-A414-31	TCGA-VQ-A91X	TCGA-VQ-A91X-01A	TP	Primary solid Tumor	TCGA-VQ-A91X-01A	1	288b0130-6744-495e-bd99-da6f6b5f6953	Primary Tumor
TCGA-D7-8575-01A-11R-2343-13	TCGA-D7-8575	TCGA-D7-8575-01A	TP	Primary solid Tumor	TCGA-D7-8575-01A	1	71efd38a-03a9-488d-bde2-18b17559c775	Primary Tumor
TCGA-BR-4257-01A-01R-1131-13	TCGA-BR-4257	TCGA-BR-4257-01A	TP	Primary solid Tumor	TCGA-BR-4257-01A	1	97b44b05-97eb-486c-94db-42838831de0b	Primary Tumor
TCGA-BR-8485-01A-11R-2402-13	TCGA-BR-8485	TCGA-BR-8485-01A	TP	Primary solid Tumor	TCGA-BR-8485-01A	1	2f1460ea-827b-4c51-86a5-2d85771888bb	Primary Tumor
TCGA-BR-4370-01A-01R-1157-13	TCGA-BR-4370	TCGA-BR-4370-01A	TP	Primary solid Tumor	TCGA-BR-4370-01A	1	cfb7901b-e4e1-42fe-802d-dcd34f8c4912	Primary Tumor
TCGA-D7-A748-01A-12R-A32D-31	TCGA-D7-A748	TCGA-D7-A748-01A	TP	Primary solid Tumor	TCGA-D7-A748-01A	1	308bca2d-6e27-4da9-9a07-b0eb88437953	Primary Tumor
TCGA-VQ-A91Z-01A-11R-A414-31	TCGA-VQ-A91Z	TCGA-VQ-A91Z-01A	TP	Primary solid Tumor	TCGA-VQ-A91Z-01A	1	ec6ed61a-1d7e-4057-9ac2-dd1ed2accfb0	Primary Tumor
TCGA-RD-A7C1-01A-11R-A32D-31	TCGA-RD-A7C1	TCGA-RD-A7C1-01A	TP	Primary solid Tumor	TCGA-RD-A7C1-01A	1	b905ac72-aae1-4e6e-b560-46b9f4f9ef5f	Primary Tumor

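-g  specifies the file to merge (tab-separated; the first column names each row, and every other column is one sample):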

cell_type	TCGA-B7-A5TK-01A-12R-A36D-31	TCGA-BR-7959-01A-11R-2343-13	TCGA-IN-8462-01A-11R-2343-13	TCGA-BR-A4CR-01A-11R-A24K-31	TCGA-CG-4443-01A-01R-1157-13	TCGA-KB-A93J-01A-11R-A39E-31	TCGA-BR-4371-01A-01R-1157-13	TCGA-IN-A6RO-01A-12R-A33Y-31	TCGA-HU-A4H3-01A-21R-A251-31	TCGA-RD-A8MV-01A-11R-A36D-31
B cells naive	0.041806	0.119034	0.275451	0.243789	0.118753	0.097526	0.087438	0.110736	0.091899	0.114157
B cells memory	0	0	0	0	0	0	0	0	0	0
Plasma cells	0.005778	0.009967	0.04908	0	0	0.009401	0.006596	0.012819	0.004326	0.001989
T cells CD8	0.399813	0.081383	0.052203	0.032568	0	0.017233	0.042146	0.074234	0.192331	0.10848
T cells CD4 naive	0	0	0	0	0	0	0	0	0	0
T cells CD4 memory resting	0	0.163605	0.158559	0.205805	0.161354	0.197031	0.445467	0.207407	0.095721	0.173691
T cells CD4 memory activated	0.222587	0.056429	0.022786	0.026488	0.008663	0.049117	0.016507	0.062309	0.132577	0.077914
T cells follicular helper	0.006564	0	0	0.007444	0	0.01711	0.009081	0.060454	0.040637	0.056979
T cells regulatory (Tregs)	0	0.034066	0.08263	0.040602	0.001932	0.038297	0	0.052859	0.052045	0.045091
T cells gamma delta	0	0	0	0	0	0.008108	0	0.00487	0	0
NK cells resting	0	0.034875	0	0.02808	0.031821	0	0.018045	0.006281	0.023437	0.029263
NK cells activated	0.00696	0	0.029651	0	0	0.047286	0	0.004472	0.007489	0
Monocytes	0.005411	0.013788	0.007133	0	0.02435	0.002008	0.013182	0	0	0.00139
Macrophages M0	0.014514	0.090352	0.025032	0.104725	0.16705	0.224299	0.078476	0.117533	0.068365	0.109002
Macrophages M1	0.135967	0.103223	0.100643	0.012271	0	0.080839	0.078876	0.122465	0.063667	0.103607
Macrophages M2	0.089259	0.159524	0.044703	0.137346	0.428193	0.156385	0.101052	0.10287	0.071646	0.077649
Dendritic cells resting	0.014316	0.019249	0.059299	0.006178	0	0.022708	0.043902	0	0	0.004566



Rscript $scriptdir/merge_metadata_genexpdata.r -m  ../08.Nomogram/nomogram_metadata.tsv -g ../03.TIME/immu/timer.res.tsv \
   -b barcode -p metadata_risk_score_timer


  • Published: 2021-08-30 17:44
  • Category: TCGA