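I have tried every method I could find for freeing memory. None of them solved the problem, and R itself crashed and exited. How can I fix this? If the code needs to be optimized, how should I optimize it? This problem has been blocking me and I cannot move on to the next step.
> # Load R packages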
> library(TCGAbiolinks)
> library(SummarizedExperiment)
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport,
clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply,
parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname,
do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl,
intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste,
pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames,
sapply, setdiff, sort, table, tapply, union, unique, unsplit, which,
which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:grDevices’:
windows
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To
cite Bioconductor, see 'citation("Biobase")', and for packages
'citation("pkgname")'.
Loading required package: DelayedArray
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
Attaching package: ‘DelayedArray’
The following objects are masked from ‘package:matrixStats’:
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following objects are masked from ‘package:base’:
aperm, apply, rowsum
> # Read the input parameters
> work_dir <- "/Volume/TCGA-LUAD/lab"
> # Set the program parameters
> tcga_project <- "TCGA-LUAD"
> data_category <- "DNA Methylation"
> platform <- "Illumina Human Methylation 450"
> sample_type <- "Primary solid Tumor"
> legacy <- FALSE
> # Create the working directory if it does not exist
> if( !file.exists(work_dir) ){
+ if( !dir.create(work_dir, showWarnings = FALSE, recursive = TRUE) ){
+ stop(paste("dir.create failed: outdir=",work_dir,sep=""))
+ }
+ }
> # Set the working directory (output directory)
> setwd(work_dir)
> DataDirectory <- paste0(work_dir,"/GDC/",gsub("-","_",tcga_project))
> FileNameData <- paste0(DataDirectory, "_","methylation",".rda")
> query.met <- GDCquery(project = tcga_project,
+ legacy = legacy,
+ data.category = data_category,
+ platform = platform,
+ sample.type = sample_type)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
|tissue.code |shortLetterCode |tissue.definition |
|:-----------|:---------------|:-------------------------------------------------|
|01 |TP |Primary Tumor |
|02 |TR |Recurrent Tumor |
|03 |TB |Primary Blood Derived Cancer - Peripheral Blood |
|04 |TRBM |Recurrent Blood Derived Cancer - Bone Marrow |
|05 |TAP |Additional - New Primary |
|06 |TM |Metastatic |
|07 |TAM |Additional Metastatic |
|08 |THOC |Human Tumor Original Cells |
|09 |TBM |Primary Blood Derived Cancer - Bone Marrow |
|10 |NB |Blood Derived Normal |
|11 |NT |Solid Tissue Normal |
|12 |NBC |Buccal Cell Normal |
|13 |NEBV |EBV Immortalized Normal |
|14 |NBM |Bone Marrow Normal |
|20 |CELLC |Control Analyte |
|40 |TRB |Recurrent Blood Derived Cancer - Peripheral Blood |
|50 |CELL |Cell Lines |
|60 |XP |Primary Xenograft Tissue |
|61 |XCL |Cell Line Derived Xenograft Tissue |
Error in checkBarcodeDefinition(sample.type) :
Primary solid Tumor was not found. Please select a difinition from the table above
> sample_type <- "Primary Tumor"
> query.met <- GDCquery(project = tcga_project,
+ legacy = legacy,
+ data.category = data_category,
+ platform = platform,
+ sample.type = sample_type)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-LUAD
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By sample.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> sample_type <- "Primary Tumor"
> samplesDown <- getResults(query.met,cols=c("cases"))
> cat("Total sample to download:", length(samplesDown))
Total sample to download: 473
> # If the download is interrupted, run this again; the client can resume
> GDCdownload(query.met,directory = DataDirectory,files.per.chunk=6, method='api')
Downloading data for project TCGA-LUAD
Of the 473 files for download 473 already exist.
All samples have been already downloaded
> rm()
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 8273107 441.9 12433397 664.1 12433397 664.1
Vcells 14211477 108.5 22195444 169.4 18429536 140.7
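Note that `rm()` called with no arguments removes nothing, so the `gc()` report above simply shows the session as it was (about 550 MB in use). A minimal sketch of freeing a named object follows; `big_vec` is a hypothetical throwaway vector used only for illustration and is not part of the original session.

```r
# Hypothetical large object, for illustration only (not from the original session)
big_vec <- numeric(1e7)                               # 1e7 doubles
size_mb <- as.numeric(object.size(big_vec)) / 1024^2  # size in MiB

rm()                                # no arguments: nothing is removed
still_there <- exists("big_vec")    # the object is still in the workspace

rm(big_vec)                         # remove the object by name
gone <- !exists("big_vec")          # the object is no longer defined
invisible(gc())                     # run the garbage collector after the removal
```

Only objects that are no longer referenced anywhere can be reclaimed by `gc()`, so this helps only when large objects from earlier steps are still in the workspace.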
> # Prepare the data; FileNameData is where R saves the result so it can be reloaded next time
> prepare.met <- GDCprepare(query = query.met,
+ save = TRUE,
+ save.filename = FileNameData,
+ directory = DataDirectory,
+ summarizedExperiment = TRUE)
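For a sense of scale, here is a back-of-the-envelope estimate of the beta-value matrix this `GDCprepare` call has to build, plus one way to split the samples into smaller batches. It is a sketch under stated assumptions: the probe count of 485,577 for the 450K array and the batch size of 50 are assumptions, and the `sample_###` IDs are hypothetical placeholders standing in for the barcodes in `samplesDown`. Peak memory during preparation is likely several times the final matrix size because of intermediate copies, so treat the figure as a lower bound and not as a measurement.

```r
n_samples       <- 473L     # sample count reported in the transcript above
n_probes        <- 485577   # assumed probe count for the 450K array
bytes_per_value <- 8        # R stores each double in 8 bytes

# Size of the final samples-by-probes matrix of doubles, in GiB
matrix_gib <- n_samples * n_probes * bytes_per_value / 1024^3
round(matrix_gib, 2)

# Hypothetical placeholder IDs; in the real session these would be the barcodes
barcodes <- sprintf("sample_%03d", seq_len(n_samples))

# Split into batches of at most 50 samples (batch size is an arbitrary choice)
batch_size <- 50
batches <- split(barcodes, ceiling(seq_along(barcodes) / batch_size))
length(batches)
```

Each batch could then be passed to `GDCquery` through its `barcode` argument and prepared separately, so that only one batch is in memory at a time. Whether this avoids the crash on this machine has not been tested here.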