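When the hardware (here, memory) is insufficient, the only fix is to upgrade the hardware; software cannot solve it.
# Read the input parameters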
> work_dir <- "/Volume/TCGA-LUAD/lab"
> # Set the program parameters
> tcga_project <- "TCGA-LUAD"
> data_category <- "DNA Methylation"
> platform <- "Illumina Human Methylation 450"
> sample_type <- "Primary solid Tumor"
> legacy <- FALSE
> # Create the working directory if it does not exist
> if( !file.exists(work_dir) ){
+ if( !dir.create(work_dir, showWarnings = FALSE, recursive = TRUE) ){
+ stop(paste("dir.create failed: outdir=",work_dir,sep=""))
+ }
+ }
> # Set the working directory (output directory)
> setwd(work_dir)
> DataDirectory <- paste0(work_dir,"/GDC/",gsub("-","_",tcga_project))
> FileNameData <- paste0(DataDirectory, "_","methylation",".rda")
> # Query the methylation data to download (this analysis downloads tumor samples only)
> query.met <- GDCquery(project = tcga_project,
+ legacy = legacy,
+ data.category = data_category,
+ platform = platform,
+ sample.type = sample_type)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
|tissue.code |shortLetterCode |tissue.definition |
|:-----------|:---------------|:-------------------------------------------------|
|01 |TP |Primary Tumor |
|02 |TR |Recurrent Tumor |
|03 |TB |Primary Blood Derived Cancer - Peripheral Blood |
|04 |TRBM |Recurrent Blood Derived Cancer - Bone Marrow |
|05 |TAP |Additional - New Primary |
|06 |TM |Metastatic |
|07 |TAM |Additional Metastatic |
|08 |THOC |Human Tumor Original Cells |
|09 |TBM |Primary Blood Derived Cancer - Bone Marrow |
|10 |NB |Blood Derived Normal |
|11 |NT |Solid Tissue Normal |
|12 |NBC |Buccal Cell Normal |
|13 |NEBV |EBV Immortalized Normal |
|14 |NBM |Bone Marrow Normal |
|20 |CELLC |Control Analyte |
|40 |TRB |Recurrent Blood Derived Cancer - Peripheral Blood |
|50 |CELL |Cell Lines |
|60 |XP |Primary Xenograft Tissue |
|61 |XCL |Cell Line Derived Xenograft Tissue |
Error in checkBarcodeDefinition(sample.type) :
Primary solid Tumor was not found. Please select a difinition from the table above
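The query fails because `sample.type` must match one of the `tissue.definition` values in the table exactly. As a minimal sketch, the allowed values can be copied from that table and checked before calling `GDCquery`. The names `valid_sample_types` and `is_valid_sample_type` are hypothetical, introduced here for illustration only; they are not part of TCGAbiolinks.

```r
# Sample-type definitions copied from the table printed by GDCquery above.
valid_sample_types <- c(
  "Primary Tumor",
  "Recurrent Tumor",
  "Primary Blood Derived Cancer - Peripheral Blood",
  "Recurrent Blood Derived Cancer - Bone Marrow",
  "Additional - New Primary",
  "Metastatic",
  "Additional Metastatic",
  "Human Tumor Original Cells",
  "Primary Blood Derived Cancer - Bone Marrow",
  "Blood Derived Normal",
  "Solid Tissue Normal",
  "Buccal Cell Normal",
  "EBV Immortalized Normal",
  "Bone Marrow Normal",
  "Control Analyte",
  "Recurrent Blood Derived Cancer - Peripheral Blood",
  "Cell Lines",
  "Primary Xenograft Tissue",
  "Cell Line Derived Xenograft Tissue"
)

# Hypothetical helper (not a TCGAbiolinks function): TRUE only when the
# string matches one of the definitions above exactly.
is_valid_sample_type <- function(x) {
  x %in% valid_sample_types
}

is_valid_sample_type("Primary solid Tumor")  # FALSE: not in the table
is_valid_sample_type("Primary Tumor")        # TRUE
```

Checking the value up front avoids a round trip to GDC just to discover a misspelled definition.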
> sample_type<-"Primary Tumor"
> # Query the methylation data to download (this analysis downloads tumor samples only)
> query.met <- GDCquery(project = tcga_project,
+ legacy = legacy,
+ data.category = data_category,
+ platform = platform,
+ sample.type = sample_type)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LUAD
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> sample_type<-"Primary Tumor"
> # Total number of samples for this cancer type
> samplesDown <- getResults(query.met,cols=c("cases"))
> cat("Total sample to download:", length(samplesDown))
Total sample to download: 473
> GDCdownload(query.met,directory = DataDirectory,files.per.chunk=6, method='api')
Downloading data for project TCGA-LUAD
Of the 473 files for download 473 already exist.
All samples have been already downloaded
> # Prepare the data; FileNameData is the file where R saves the result, so the data can be reloaded later
> prepare.met <- GDCprepare(query = query.met,
+ save = TRUE,
+ save.filename = FileNameData,
+ directory = DataDirectory,
+ summarizedExperiment = TRUE)
| | 0% |--------------------------------------------------|
|==================================================|
|==================== |40.80338% ~13 m remaining
Error: cannot allocate vector of size 7.4 Mb
Killed after 9 m
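A back-of-the-envelope estimate suggests why memory runs out. The sketch below assumes that the 450K array reports 485,577 probes per sample and that beta values are stored as 8-byte doubles. Both figures are assumptions that do not appear in the session above, and `n_probes` and `bytes_per_value` are illustrative names.

```r
# Rough size of the merged beta-value matrix (probes x samples).
n_samples <- 473        # from "Total sample to download: 473" above
n_probes <- 485577      # ASSUMPTION: probe count of the 450K array
bytes_per_value <- 8    # ASSUMPTION: values stored as R doubles

n_values <- n_samples * n_probes
size_gib <- n_values * bytes_per_value / 1024^3

cat(sprintf("Beta matrix: %.0f values, about %.2f GiB\n", n_values, size_gib))
```

Under these assumptions a single copy of the matrix is roughly 1.7 GiB, and any intermediate copies made while the per-sample files are merged may push the peak well above that. The failed allocation is only 7.4 Mb, which suggests that memory was already nearly exhausted at that point.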