library(cBioPortalData)
library(AnVIL)
This vignette lays out the two main user-facing functions for downloading
and representing data from the cBioPortal API. cBioDataPack makes use of the
legacy distribution data method in cBioPortal (via tarballs). cBioPortalData
allows for a more flexible approach to obtaining data based on several
available parameters, including available molecular profiles.
cBioDataPack accesses the packaged study data (tarballs) from cBioPortal and returns an integrative MultiAssayExperiment representation.
cBioDataPack("laml_tcga")
## A MultiAssayExperiment object of 11 listed
## experiments with user-defined names and respective classes.
## Containing an ExperimentList class object of length 11:
## [1] CNA: SummarizedExperiment with 24776 rows and 191 columns
## [2] RNA_Seq_expression_median: SummarizedExperiment with 19720 rows and 179 columns
## [3] RNA_Seq_mRNA_median_Zscores: SummarizedExperiment with 19720 rows and 179 columns
## [4] RNA_Seq_v2_expression_median: SummarizedExperiment with 20531 rows and 173 columns
## [5] RNA_Seq_v2_mRNA_median_Zscores: SummarizedExperiment with 20531 rows and 173 columns
## [6] cna_hg19.seg: RaggedExperiment with 13571 rows and 191 columns
## [7] linear_CNA: SummarizedExperiment with 24776 rows and 191 columns
## [8] methylation_hm27: SummarizedExperiment with 10919 rows and 194 columns
## [9] methylation_hm450: SummarizedExperiment with 10919 rows and 194 columns
## [10] mutations_extended: RaggedExperiment with 2584 rows and 197 columns
## [11] mutations_mskcc: RaggedExperiment with 2584 rows and 197 columns
## Features:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DFrame
## sampleMap() - the sample availability DFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DFrame
## assays() - convert ExperimentList to a SimpleList of matrices
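The accessors listed under "Features" in the printout can be exercised as in the following minimal sketch. It assumes network access (or a cached copy of the tarball) and that MultiAssayExperiment is installed. Experiment names can differ between package and data releases, so the sketch reads them from the object rather than hard-coding them. The variable names (`laml`, `dims`, `first_exp`, `laml_small`) are illustrative only.

```r
library(cBioPortalData)
library(MultiAssayExperiment)

# Download (or reuse the cached copy of) the packaged study shown above
laml <- cBioDataPack("laml_tcga")

# Experiment names as they appear in this particular download
assay_names <- names(laml)

# Tabulate the class and dimensions of every experiment
exps <- as.list(experiments(laml))
dims <- data.frame(
    class = vapply(exps, function(x) class(x)[1], character(1)),
    features = vapply(exps, nrow, numeric(1)),
    samples = vapply(exps, ncol, numeric(1))
)
dims

# Extract a single experiment by name with `[[`
first_exp <- laml[[assay_names[1]]]

# Patient-level clinical data are stored in colData()
clin <- colData(laml)
dim(clin)

# Keep only the first two experiments; rows and columns are left unrestricted
laml_small <- laml[, , head(assay_names, 2)]
```

Subsetting with `[, , i]` returns another MultiAssayExperiment, so the same accessors apply to `laml_small`.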
cBioPortalData provides a more flexible and granular way to request a MultiAssayExperiment object from a study ID, molecular profile, gene panel, and sample list.
cbio <- cBioPortal()
acc <- cBioPortalData(api = cbio, by = "hugoGeneSymbol", studyId = "acc_tcga",
    genePanelId = "IMPACT341",
    molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
## harmonizing input:
## removing 1 colData rownames not in sampleMap 'primary'
acc
## A MultiAssayExperiment object of 2 listed
## experiments with user-defined names and respective classes.
## Containing an ExperimentList class object of length 2:
## [1] acc_tcga_rppa: SummarizedExperiment with 57 rows and 46 columns
## [2] acc_tcga_linear_CNA: SummarizedExperiment with 339 rows and 90 columns
## Features:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DFrame
## sampleMap() - the sample availability DFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DFrame
## assays() - convert ExperimentList to a SimpleList of matrices
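The identifiers passed to cBioPortalData above (study ID, molecular profile IDs, gene panel ID) have to be known in advance. The sketch below shows one way to look them up with the package's query helpers. It assumes network access to the public cBioPortal instance. The column names used (`studyId`, `name`, `molecularProfileId`, `genePanelId`, `sampleListId`) are an assumption about the fields the API returns and may need adjusting.

```r
library(cBioPortalData)

# Connect to the public cBioPortal API
cbio <- cBioPortal()

# All studies available through the API
studies <- getStudies(cbio)
head(studies[, c("studyId", "name")])

# Molecular profiles available for one study
profiles <- molecularProfiles(cbio, studyId = "acc_tcga")
profiles[["molecularProfileId"]]

# Gene panels that can be supplied as 'genePanelId'
panels <- genePanels(cbio)
head(panels[["genePanelId"]])

# Sample lists defined for the study
slists <- sampleLists(cbio, studyId = "acc_tcga")
slists[["sampleListId"]]
```

Each helper returns a table, so the usual subsetting and filtering tools can narrow the results before a call to cBioPortalData.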
sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.11-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.11-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] cBioPortalData_2.0.1 MultiAssayExperiment_1.14.0
## [3] SummarizedExperiment_1.18.1 DelayedArray_0.14.0
## [5] matrixStats_0.56.0 Biobase_2.48.0
## [7] GenomicRanges_1.40.0 GenomeInfoDb_1.24.0
## [9] IRanges_2.22.1 S4Vectors_0.26.0
## [11] BiocGenerics_0.34.0 AnVIL_1.0.3
## [13] dplyr_0.8.5 BiocStyle_2.16.0
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.1 tidyr_1.0.3
## [3] bit64_0.9-7 jsonlite_1.6.1
## [5] splines_4.0.0 assertthat_0.2.1
## [7] askpass_1.1 TCGAutils_1.8.0
## [9] BiocManager_1.30.10 BiocFileCache_1.12.0
## [11] blob_1.2.1 Rsamtools_2.4.0
## [13] GenomeInfoDbData_1.2.3 RTCGAToolbox_2.18.0
## [15] progress_1.2.2 yaml_2.2.1
## [17] pillar_1.4.4 RSQLite_2.2.0
## [19] lattice_0.20-41 glue_1.4.0
## [21] limma_3.44.1 digest_0.6.25
## [23] XVector_0.28.0 rvest_0.3.5
## [25] htmltools_0.4.0 Matrix_1.2-18
## [27] XML_3.99-0.3 pkgconfig_2.0.3
## [29] biomaRt_2.44.0 bookdown_0.18
## [31] zlibbioc_1.34.0 purrr_0.3.4
## [33] RCircos_1.2.1 rapiclient_0.1.3
## [35] BiocParallel_1.22.0 openssl_1.4.1
## [37] tibble_3.0.1 ellipsis_0.3.0
## [39] GenomicFeatures_1.40.0 survival_3.1-12
## [41] RJSONIO_1.3-1.4 magrittr_1.5
## [43] crayon_1.3.4 memoise_1.1.0
## [45] evaluate_0.14 xml2_1.3.2
## [47] prettyunits_1.1.1 tools_4.0.0
## [49] data.table_1.12.8 hms_0.5.3
## [51] formatR_1.7 lifecycle_0.2.0
## [53] stringr_1.4.0 Biostrings_2.56.0
## [55] AnnotationDbi_1.50.0 lambda.r_1.2.4
## [57] compiler_4.0.0 rlang_0.4.6
## [59] GenomicDataCommons_1.12.0 futile.logger_1.4.3
## [61] grid_4.0.0 RCurl_1.98-1.2
## [63] rappdirs_0.3.1 bitops_1.0-6
## [65] rmarkdown_2.1 codetools_0.2-16
## [67] DBI_1.1.0 curl_4.3
## [69] R6_2.4.1 GenomicAlignments_1.24.0
## [71] rtracklayer_1.48.0 knitr_1.28
## [73] bit_1.1-15.2 futile.options_1.0.1
## [75] readr_1.3.1 stringi_1.4.6
## [77] RaggedExperiment_1.12.0 Rcpp_1.0.4.6
## [79] vctrs_0.2.4 dbplyr_1.4.3
## [81] tidyselect_1.0.0 xfun_0.13