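Given survival data and a gene expression matrix, the script runs a batch univariate Cox regression analysis, fitting one model per gene.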
$ Rscript $scriptdir/univariate_cox_batch.r --h
usage: univariate_cox_batch.r [-h] -m metadata -g expset [-t time] [-e event] [-l pvalue] [-b blocksize] [--log2] [-o outdir] [-p prefix]
batch unvariate cox regression gene expression
optional arguments:
-h, --help show this help message and exit
-m metadata, --metadata metadata
input metadata file path with suvival time [required]
-g expset, --expset expset
input gene expression set file [required]
-t time, --time time set suvival time column name in metadata [default TIME]
-e event, --event event
set event column name in metadata [default EVENT]
-l pvalue, --pvalue pvalue
pvalue cutoff to choose sig gene [default 0.01]
-b blocksize, --blocksize blocksize
Number of variables Parallel to test in each [default 2]
--log2 whether do log2 transfrom for expression data
[optional, default: False]
-o outdir, --outdir outdir
output file directory [default cwd]
-p prefix, --prefix prefix
out file name prefix [default cox]
-m input survival data:
EVENT column: 0 means the event did not occur (alive); 1 means the event occurred (death).
barcode | TIME | EVENT |
TCGA-B7-A5TK-01A-12R-A36D-31 | 288 | 0 |
TCGA-BR-7959-01A-11R-2343-13 | 1010 | 0 |
TCGA-IN-8462-01A-11R-2343-13 | 572 | 0 |
TCGA-CG-4443-01A-01R-1157-13 | 912 | 0 |
TCGA-KB-A93J-01A-11R-A39E-31 | 1124 | 0 |
TCGA-HU-A4H3-01A-21R-A251-31 | 882 | 0 |
TCGA-RD-A8MV-01A-11R-A36D-31 | 3720 | 0 |
TCGA-VQ-A91X-01A-12R-A414-31 | 289 | 1 |
TCGA-D7-8575-01A-11R-2343-13 | 554 | 1 |
TCGA-BR-8485-01A-11R-2402-13 | 280 | 0 |
TCGA-D7-A748-01A-12R-A32D-31 | 132 | 1 |
TCGA-VQ-A91Z-01A-11R-A414-31 | 1690 | 0 |
-g input gene expression file (rows are genes, columns are sample barcodes):
ID | TCGA-B7-A5TK-01A-12R-A36D-31 | TCGA-BR-7959-01A-11R-2343-13 | TCGA-IN-8462-01A-11R-2343-13 | TCGA-BR-A4CR-01A-11R-A24K-31 | TCGA-CG-4443-01A-01R-1157-13 | TCGA-KB-A93J-01A-11R-A39E-31 | TCGA-BR-4371-01A-01R-1157-13 | TCGA-IN-A6RO-01A-12R-A33Y-31 | TCGA-HU-A4H3-01A-21R-A251-31 |
FGR | 16.34408 | 11.96739 | 5.350846 | 2.209351 | 1.53802 | 15.24016 | 4.501118 | 2.602437 | 6.261761 |
CD38 | 86.86772 | 15.79451 | 3.111342 | 1.240707 | 0.862955 | 13.3047 | 3.728708 | 1.673952 | 2.675173 |
ITGAL | 40.26903 | 7.358566 | 3.769125 | 2.387869 | 2.37351 | 38.08591 | 8.305283 | 3.622781 | 7.025886 |
CX3CL1 | 603.0132 | 26.91353 | 20.22238 | 4.195262 | 19.04097 | 14.15295 | 13.75885 | 6.675374 | 4.050271 |
CEACAM21 | 1.868536 | 2.571917 | 0.610839 | 0.674558 | 1.092127 | 3.483559 | 1.134309 | 4.471274 | 0.584159 |
MATK | 2.28342 | 0.864116 | 0.519776 | 2.442093 | 0.760348 | 3.192951 | 1.161881 | 0.347882 | 1.039336 |
CD79B | 3.453198 | 1.879957 | 2.822192 | 0.523587 | 1.926592 | 3.651742 | 0.831288 | 0.883643 | 1.979214 |
MMP25 | 13.72829 | 3.451148 | 1.106563 | 1.131217 | 0.878735 | 10.43186 | 1.475852 | 1.914284 | 2.312993 |
TRAF3IP3 | 5.24401 | 1.880186 | 0.875264 | 0.756153 | 0.603251 | 3.325013 | 2.347473 | 0.570462 | 1.315916 |
CD4 | 77.74691 | 51.83719 | 22.77076 | 11.07811 | 35.20445 | 122.5578 | 31.10107 | 15.06619 | 15.41347 |
BTK | 6.856235 | 4.362261 | 1.482688 | 1.371599 | 1.981236 | 6.91154 | 3.187848 | 0.955499 | 1.48269 |
FMO1 | 7.168567 | 7.711817 | 3.223174 | 0.979034 | 0.450307 | 1.093412 | 1.001808 | 0.910204 | 1.558515 |
SYT7 | 1.153105 | 81.94068 | 2.673384 | 191.6112 | 82.49394 | 0.510373 | 4.470482 | 1.28506 | 0.91944 |
TYROBP | 591.7796 | 338.0271 | 184.8133 | 69.18483 | 150.6397 | 480.5691 | 121.096 | 72.4588 | 116.9793 |
CD22 | 0.819295 | 2.521607 | 1.588505 | 0.41259 | 0.387288 | 1.123633 | 0.488244 | 0.258094 | 0.713988 |
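The two inputs are keyed by sample barcode: one row per sample in the metadata, one column per sample in the expression file. The script's own matching logic is not shown in this post, so the following is only a minimal sketch of how the two tables could be aligned before modelling. The toy sample names (S1 to S5) and the variable names are hypothetical.

```r
# Hypothetical toy data mirroring the two layouts shown above.
metadata <- data.frame(
  barcode = c("S1", "S2", "S3", "S4"),
  TIME    = c(288, 1010, 572, 289),
  EVENT   = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)
expset <- data.frame(
  ID = c("FGR", "CD38"),
  S1 = c(16.34408, 86.86772),
  S2 = c(11.96739, 15.79451),
  S5 = c(5.350846, 3.111342),  # sample without a survival record
  S4 = c(2.209351, 1.240707),
  stringsAsFactors = FALSE,
  check.names = FALSE
)

# Keep only the samples present in both tables, in metadata order.
shared <- intersect(metadata$barcode, colnames(expset)[-1])
meta_matched <- metadata[match(shared, metadata$barcode), ]

# Transpose so that rows are samples and columns are genes.
expr_matched <- t(as.matrix(expset[, shared]))
colnames(expr_matched) <- expset$ID

# Both objects now describe the same samples in the same order.
stopifnot(identical(rownames(expr_matched), meta_matched$barcode))
```

When reading real files with read.delim(), passing check.names = FALSE keeps the hyphens in TCGA barcodes; with the default, R rewrites them as dots and the barcodes no longer match the metadata.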
Rscript $scriptdir/univariate_cox_batch.r -m metadata_survival_time.tsv \
-g deg_gene_exp_tpm.tsv -e EVENT -t TIME -p imm.unicox --pvalue 0.01
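Conceptually, a batch univariate Cox analysis fits one model per gene, with that gene's expression as the only covariate. The script's internals are not shown here, so the following is a rough sketch using coxph() from the survival package on simulated data. The gene names, the log2(x + 1) pseudocount, the strict `<` cutoff, and the mapping of the three test p-values to LRT, Wald and LogRank columns are all assumptions made for illustration.

```r
library(survival)

# Simulated, hypothetical data: 60 samples and 3 genes (not from the post).
set.seed(1)
n <- 60
expr <- matrix(rexp(n * 3, rate = 0.1), nrow = n,
               dimnames = list(NULL, c("geneA", "geneB", "geneC")))
surv_time  <- rexp(n, rate = 0.002 * exp(0.03 * expr[, "geneA"]))
surv_event <- rbinom(n, 1, 0.7)  # 1 = event occurred, 0 = censored

use_log2 <- FALSE  # stand-in for the script's --log2 switch
if (use_log2) expr <- log2(expr + 1)  # the +1 pseudocount is an assumption

# Fit a single-covariate Cox model for one gene and collect its statistics.
fit_one_gene <- function(gene) {
  x <- expr[, gene]
  s <- summary(coxph(Surv(surv_time, surv_event) ~ x))
  data.frame(
    Variable      = gene,
    Beta          = s$coefficients[1, "coef"],
    StandardError = s$coefficients[1, "se(coef)"],
    Z             = s$coefficients[1, "z"],
    P             = s$coefficients[1, "Pr(>|z|)"],
    LRT           = s$logtest[["pvalue"]],   # likelihood ratio test
    Wald          = s$waldtest[["pvalue"]],  # Wald test
    LogRank       = s$sctest[["pvalue"]],    # score (log-rank) test
    HR            = s$conf.int[1, "exp(coef)"],
    HRlower       = s$conf.int[1, "lower .95"],
    HRupper       = s$conf.int[1, "upper .95"]
  )
}

result <- do.call(rbind, lapply(colnames(expr), fit_one_gene))
result <- result[order(result$P), ]          # most significant first
significant <- result[result$P < 0.01, ]     # analogous to --pvalue 0.01
```

Because each model is fitted independently, the number of tests equals the number of genes; applying a multiple-testing correction such as p.adjust(result$P, method = "BH") is a common follow-up, though whether the script does so is not stated.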
Batch Cox analysis results:
Variable | Term | Beta | StandardError | Z | P | LRT | Wald | LogRank | HR | HRlower | HRupper |
SYT12 | SYT12 | 0.091121 | 0.019495 | 4.674035 | 2.95E-06 | 0.000128 | 2.95E-06 | 1.80E-06 | 1.095402 | 1.054336 | 1.138067 |
CDH2 | CDH2 | 0.013266 | 0.003014 | 4.401803 | 1.07E-05 | 0.001993 | 1.07E-05 | 2.11E-07 | 1.013354 | 1.007386 | 1.019357 |
GPNMB | GPNMB | 0.002759 | 0.000768 | 3.590118 | 0.000331 | 0.001313 | 0.000331 | 0.000409 | 1.002762 | 1.001253 | 1.004274 |
TMIGD3 | TMIGD3 | 0.06788 | 0.019314 | 3.514468 | 0.000441 | 0.001248 | 0.000441 | 0.000419 | 1.070237 | 1.03048 | 1.111528 |
LINC01094 | LINC01094 | 0.132441 | 0.040375 | 3.280293 | 0.001037 | 0.002242 | 0.001037 | 0.001026 | 1.141611 | 1.054754 | 1.235621 |
SLC22A20P | SLC22A20P | 0.049415 | 0.01583 | 3.12165 | 0.001798 | 0.012065 | 0.001798 | 0.000736 | 1.050656 | 1.018559 | 1.083765 |
IGHV4-61 | IGHV4-61 | 0.001791 | 0.000582 | 3.077573 | 0.002087 | 0.00859 | 0.002087 | 0.001742 | 1.001793 | 1.000651 | 1.002937 |
IGHV2-5 | IGHV2-5 | 0.002236 | 0.000737 | 3.034961 | 0.002406 | 0.007782 | 0.002406 | 0.002205 | 1.002238 | 1.000792 | 1.003686 |
SERPINA5 | SERPINA5 | 0.007681 | 0.002558 | 3.002428 | 0.002678 | 0.009799 | 0.002678 | 0.002064 | 1.007711 | 1.00267 | 1.012776 |
MS4A4A | MS4A4A | 0.01446 | 0.00505 | 2.863218 | 0.004194 | 0.008427 | 0.004194 | 0.004722 | 1.014565 | 1.004572 | 1.024657 |
FAM83A | FAM83A | 0.006237 | 0.002368 | 2.633445 | 0.008452 | 0.028523 | 0.008452 | 0.007295 | 1.006256 | 1.001596 | 1.010938 |
IGLV3-9 | IGLV3-9 | 0.000547 | 0.00021 | 2.608281 | 0.0091 | 0.039427 | 0.0091 | 0.010244 | 1.000547 | 1.000136 | 1.000958 |
STARD3 | STARD3 | 0.000928 | 0.000358 | 2.588913 | 0.009628 | 0.030258 | 0.009628 | 0.006743 | 1.000928 | 1.000225 | 1.001631 |
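The columns of this table are linked by the standard Cox model relations: Z = Beta / StandardError, HR = exp(Beta), and the 95% interval is exp(Beta ± 1.96 × StandardError). As a quick sanity check, the SYT12 row can be recomputed from its Beta and StandardError alone. The variable names are illustrative, and the tolerances allow for the six-decimal rounding of the table.

```r
# Values copied from the SYT12 row of the results table.
beta <- 0.091121
se   <- 0.019495

z        <- beta / se                       # Wald Z statistic
p_wald   <- 2 * pnorm(-abs(z))              # two-sided Wald p-value
hr       <- exp(beta)                       # hazard ratio per expression unit
hr_lower <- exp(beta - qnorm(0.975) * se)   # lower 95% bound
hr_upper <- exp(beta + qnorm(0.975) * se)   # upper 95% bound

# Compare with the reported Z, P, HR, HRlower and HRupper.
checks <- c(
  Z       = abs(z - 4.674035) < 1e-3,
  P       = abs(p_wald - 2.95e-06) < 1e-7,
  HR      = abs(hr - 1.095402) < 1e-5,
  HRlower = abs(hr_lower - 1.054336) < 1e-5,
  HRupper = abs(hr_upper - 1.138067) < 1e-5
)
print(checks)
```

Note that HR is expressed per one unit of expression. For genes with a wide expression range, such as IGLV3-9, a hazard ratio very close to 1 can still be statistically significant.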
Course covering how to obtain and use the script: https://study.163.com/course/introduction/1211864801.htm?share=1&shareId=1030291076
The Cox analysis is performed with the R package survival:
Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.
Terry M. Therneau, Patricia M. Grambsch (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.