query_combos {ccmap} | R Documentation |
Drugs with the largest positive and negative cosine similarity are predicted to, respectively, mimic and reverse the query signature. Values range from +1 to -1.
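As a rough illustration of the score described above, cosine similarity between two signatures can be sketched in base R as follows. The helper `cosine_sim` and the toy gene and signature values are hypothetical; how ccmap itself selects genes and computes the score is not stated on this page.

```r
# Cosine similarity between two numeric vectors of equal length.
# Illustrative helper only, not part of ccmap.
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Hypothetical toy signatures measured over the same five genes.
query   <- c(G1 = 2.0, G2 = -1.0, G3 = 0.5, G4 = -0.5, G5 = 1.5)
mimic   <- 3 * query   # same direction as the query, different magnitude
reverse <- -query      # exactly opposite direction

sim_mimic   <- cosine_sim(query, mimic)
sim_reverse <- cosine_sim(query, reverse)
```

Because the score depends only on direction, a signature that is a positive multiple of the query scores +1 and its negation scores -1, which matches the stated range.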
query_combos(query_genes, drug_info = c("cmap", "l1000"), method = c("average", "ml"), include = NULL, ncores = parallel::detectCores())
query_genes |
Named numeric vector of differential expression values for query genes. Usually 'meta' slot of |
drug_info |
Character vector specifying which dataset to query (either 'cmap' or 'l1000'). Can also provide a matrix of differential expression values for drugs or drug combinations (rows are genes, columns are drugs). |
method |
One of 'average' (default) or 'ml' (machine learning - see details and vignette). |
include |
Character vector of drug names for which combinations with all other drugs will be predicted and queried. If |
ncores |
Integer, number of cores to use for method 'average'. Default is to use all cores. |
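The argument descriptions above imply particular input shapes: a named numeric vector for `query_genes` and, when a custom matrix is supplied as `drug_info`, genes in rows and drugs in columns. A minimal sketch of such inputs follows, using made-up gene and drug names and random values; the final call is left commented out because it requires ccmap to be installed.

```r
set.seed(1)

# Hypothetical gene identifiers shared by the query and the drug matrix.
genes <- paste0("GENE", 1:6)

# query_genes: named numeric vector of differential expression values.
query_genes <- setNames(rnorm(4), genes[c(1, 3, 4, 6)])

# drug_info as a matrix: rows are genes, columns are drugs.
drug_info <- matrix(
  rnorm(18), nrow = 6,
  dimnames = list(genes, c("drugA", "drugB", "drugC"))
)

# Basic shape checks before querying.
stopifnot(is.numeric(query_genes), !is.null(names(query_genes)))
stopifnot(all(names(query_genes) %in% rownames(drug_info)))

# res <- query_combos(query_genes, drug_info, ncores = 1)  # requires ccmap
```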
To predict and query all 856086 two-drug cmap combinations, the 'average' method can take as little as 10 minutes (Intel Core i7-6700). The 'ml' (machine learning) method takes two hours on the same hardware and requires ~10GB of RAM but is slightly more accurate. Both methods run faster when only a subset of drugs is specified with the include parameter.
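The sketch below illustrates the general idea behind an averaging approach, under the assumption that a combination signature is approximated by the element-wise mean of the two single-drug signatures. That assumption, the helper `cosine_sim`, and the toy data are illustrative only and are not taken from ccmap's source. The last line checks the combination count: 856086 equals choose(1309, 2), which is consistent with 1309 single drugs, a figure inferred from the arithmetic rather than stated on this page.

```r
set.seed(42)

# Hypothetical single-drug signatures: rows are genes, columns are drugs.
genes <- paste0("GENE", 1:8)
drugs <- c("drugA", "drugB", "drugC", "drugD")
drug_es <- matrix(rnorm(32), nrow = 8, dimnames = list(genes, drugs))

# Hypothetical query signature over the same genes.
query <- setNames(rnorm(8), genes)

# All unordered two-drug pairs (a 2 x 6 character matrix).
pairs <- combn(drugs, 2)

# Assumed combination signature: mean of the two single-drug signatures.
combo_es <- sapply(seq_len(ncol(pairs)), function(i) {
  rowMeans(drug_es[, pairs[, i]])
})
colnames(combo_es) <- paste(pairs[1, ], pairs[2, ], sep = " + ")

# Cosine similarity of each predicted combination with the query.
cosine_sim <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))
sims <- sort(apply(combo_es, 2, cosine_sim, y = query), decreasing = TRUE)

# Number of unordered pairs among 1309 items.
n_pairs <- choose(1309, 2)
```

Restricting the pairs to those containing a given drug, as the include parameter does, reduces the work from choose(n, 2) pairs to n - 1 pairs per included drug.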
To speed up the 'ml' method, the MRO+MKL distribution of R can help substantially.
The combinations of LINCS l1000 signatures (~26 billion) can also be queried using the 'average' method. In order to compare l1000 results to those obtained with cmap, only the same genes should be queried (see example).
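Restricting two signature matrices to the same genes can be sketched with toy data as follows. The matrices `cmap_toy` and `l1000_toy` are hypothetical stand-ins, and `intersect` is used here as a defensive variant for the case where neither gene set is a subset of the other.

```r
# Hypothetical toy matrices: rows are genes, columns are signatures.
cmap_toy <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("A", "B", "C", "D"), c("d1", "d2", "d3"))
)
l1000_toy <- matrix(
  1:4, nrow = 2,
  dimnames = list(c("B", "D"), c("s1", "s2"))
)

# Keep only genes present in both datasets, in the same order.
common    <- intersect(rownames(cmap_toy), rownames(l1000_toy))
cmap_sub  <- cmap_toy[common, , drop = FALSE]
l1000_sub <- l1000_toy[common, , drop = FALSE]
```

Both subsets now share identical row names, so similarity scores computed against either one are based on the same set of genes.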
Vector of cosine similarities between query and drug combination signatures.
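A result of this kind can be ranked to find the combinations predicted to mimic or reverse the query most strongly. The sketch below uses a made-up named vector; whether the actual return value is named by combination or already sorted is an assumption, not something this page states.

```r
# Hypothetical similarity scores, named by drug combination.
sims <- c(
  "drugA + drugB" =  0.42,
  "drugA + drugC" = -0.61,
  "drugB + drugC" =  0.05,
  "drugA + drugD" =  0.77,
  "drugB + drugD" = -0.12
)

# Rank from most positive to most negative.
ranked <- sort(sims, decreasing = TRUE)

top_mimic   <- head(ranked, 2)        # largest positive similarities
top_reverse <- rev(tail(ranked, 2))   # most negative similarities first
```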
library(ccmap)
library(lydata)
library(crossmeta)

# location of data
data_dir <- system.file("extdata", package = "lydata")

# gather GSE names
gse_names <- c("GSE9601", "GSE15069", "GSE50841", "GSE34817", "GSE29689")

# load previous analysis
anals <- load_diff(gse_names, data_dir)

# perform meta-analysis
es <- es_meta(anals)

# get dprimes
dprimes <- get_dprimes(es)

# query combinations of metformin and all other cmap drugs
top_met_combos <- query_combos(dprimes$all$meta, include = 'metformin', ncores = 1)

# previous query but with machine learning method
# top_met_combos <- query_combos(dprimes$all$meta, method = 'ml', include = 'metformin')

# query all cmap drug combinations
# top_combos <- query_combos(dprimes$all$meta)

# query all cmap drug combinations with machine learning method
# top_combos <- query_combos(dprimes$all$meta, method = 'ml')

# query l1000 and cmap using same genes
# library(ccdata)
# data(cmap_es)
# data(l1000_es)
# cmap_es <- cmap_es[row.names(l1000_es), ]
# met_cmap <- query_combos(dprimes$all$meta, cmap_es, include = 'metformin')
# met_l1000 <- query_combos(dprimes$all$meta, l1000_es, include = 'metformin')