biomarkertmle {biotmle}    R Documentation

Biomarker Evaluation with Targeted Minimum Loss-Based Estimation of the ATE

Description

Computes the causal target parameter defined as the average difference between biomarker expression values under treatment and those same values under no treatment (the average treatment effect, ATE), using Targeted Minimum Loss-Based Estimation (TMLE).
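
For orientation, the average treatment effect is commonly written as below, using the notation that appears in the argument descriptions: A for the binary exposure, W for baseline covariates, and Y for the expression of a single biomarker. This is the standard statistical form of the ATE under the usual identification assumptions (no unmeasured confounding, positivity); it is given here as background and is not quoted from the package source.

```latex
\psi_0 = \mathbb{E}_W\bigl[\, \mathbb{E}[Y \mid A = 1, W] - \mathbb{E}[Y \mid A = 0, W] \,\bigr]
```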

Usage

biomarkertmle(se, varInt, normalized = TRUE, ngscounts = FALSE,
  parallel = TRUE, bppar_type = NULL, future_param = NULL,
  family = "gaussian", subj_ids = NULL, g_lib = c("SL.mean",
  "SL.glm", "SL.earth"), Q_lib = c("SL.mean", "SL.glm", "SL.earth",
  "SL.ranger"), ...)

Arguments

se

(SummarizedExperiment) - containing expression or next-generation sequencing data in the "assays" slot and a matrix of phenotype-level data in the "colData" slot.

varInt

(numeric) - indicating the column of the design matrix corresponding to the treatment or outcome of interest (in the colData slot of the SummarizedExperiment argument "se").

normalized

(logical) - whether the data included in the assay slot of the input SummarizedExperiment object has been normalized already. The default is set to TRUE since it is expected that most practitioners would apply normalization methods appropriate to the type of assay being analyzed. If set to FALSE, median normalization is performed for microarray (i.e., non-RNA-seq) data.

ngscounts

(logical) - whether the data are counts generated from a next-generation sequencing (NGS) experiment (e.g., RNA-seq). The default setting assumes continuous expression measures as generated by platforms that are microarray-type (i.e., so-called "targeted" assays).

parallel

(logical) - whether or not to use parallelization in the estimation procedure. Invoking parallelization happens through a combination of calls to future and BiocParallel. If this argument is set to TRUE, future::multiprocess is used, and if FALSE, future::sequential is used, alongside BiocParallel::bplapply. Other options for evaluation through futures may be invoked by setting the argument future_param.

bppar_type

(character) - specifies the type of backend to be used with the parallelization invoked by BiocParallel. Consult the manual page for BiocParallel::BiocParallelParam for possible types and descriptions on their appropriate uses. The default for this argument is NULL, which silently uses BiocParallel::DoparParam.

future_param

(character) - specifies the type of parallelization to be invoked when using futures for evaluation. For a list of the available types, please consult the documentation for future::plan. The default setting (this argument set to NULL) silently invokes future::multiprocess. Be careful if changing this setting.

family

(character) - specification of error family: "binomial" or "gaussian".

subj_ids

(numeric vector) - subject IDs to be passed to the estimation procedure; observations from the same subject should have the exact same numerical identifier. Coerced to class numeric if not provided in the appropriate form.

g_lib

(char vector) - library of learning algorithms to be used in fitting the propensity score E[A | W] (the nuisance parameter denoted "g" in the literature on targeted minimum loss-based estimation).

Q_lib

(char vector) - library of learning algorithms to be used in fitting the outcome regression E[Y | A, W] (the nuisance parameter denoted "Q" in the literature on targeted minimum loss-based estimation).

...

Additional arguments to be passed directly to tmle::tmle in fitting the targeted minimum loss-based estimator of the average treatment effect. Consult the documentation of that function for details.
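
To make the roles of the two nuisance parameters g and Q concrete, the following base R sketch fits simple stand-ins for both on simulated data and forms a plug-in (substitution) estimate of the ATE. It is an illustration only: a single glm replaces the learner libraries named in g_lib and Q_lib, no targeting step is performed (so this is not TMLE), and the simulated data and all variable names are hypothetical.

```r
set.seed(101)
n <- 500
W <- rnorm(n)                        # baseline covariate
A <- rbinom(n, 1, plogis(0.5 * W))   # binary exposure that depends on W
Y <- 1 + 2 * A + W + rnorm(n)        # simulated expression; true ATE is 2
dat <- data.frame(W = W, A = A, Y = Y)

# g: propensity score E[A | W], here a logistic regression
g_fit <- glm(A ~ W, data = dat, family = binomial())
g_hat <- predict(g_fit, type = "response")

# Q: outcome regression E[Y | A, W], here a linear regression
Q_fit <- glm(Y ~ A + W, data = dat, family = gaussian())
Q1 <- predict(Q_fit, newdata = transform(dat, A = 1))  # everyone exposed
Q0 <- predict(Q_fit, newdata = transform(dat, A = 0))  # no one exposed

# plug-in estimate of the ATE: mean difference in predicted outcomes
ate_plugin <- mean(Q1 - Q0)
```

In TMLE, initial fits of this kind are followed by a targeting step that uses the propensity score estimates to update the outcome regression before the ATE is computed.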

Value

An S4 object of class biotmle, which subclasses SummarizedExperiment and adds further slots, among them tmleOut and call. These contain, respectively, the TMLE-based estimates of the relationship between each biomarker and the exposure or outcome variable, and the original call to this function (for user reference).

Examples

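# load the required packages and the example data set
library(dplyr)
library(biotmleData)
data(illuminaData)
library(SummarizedExperiment)
"%ni%" <- Negate("%in%")

# dichotomize age at its median (1 if above the median, 0 otherwise)
colData(illuminaData) <- colData(illuminaData) %>%
  data.frame() %>%
  dplyr::mutate(age = as.numeric(age > median(age))) %>%
  DataFrame()

# column index in colData of the variable of interest, named "benzene"
varInt_index <- which(names(colData(illuminaData)) %in% "benzene")

# fit the estimator on the first two biomarkers only, without
# parallelization and with small learner libraries
biomarkerTMLEout <- biomarkertmle(
  se = illuminaData[1:2, ],
  varInt = varInt_index,
  parallel = FALSE,
  family = "gaussian",
  g_lib = c("SL.mean", "SL.glm"),
  Q_lib = "SL.glm"
)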
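
When subject identifiers are stored as character labels, one way to prepare them for the subj_ids argument is to map each distinct label to an integer code, so that repeated measurements on the same subject share one numeric ID. This is a minimal base R sketch; the labels are hypothetical, and the coercion performed inside the package may differ.

```r
# hypothetical sample labels; "s01" and "s02" each appear twice
sample_labels <- c("s01", "s02", "s01", "s03", "s02")

# factor() assigns one level per distinct label; as.numeric() returns its code
subj_ids <- as.numeric(factor(sample_labels))
```

The resulting vector has one entry per sample and could then be supplied as subj_ids = subj_ids, provided its order matches the columns of the SummarizedExperiment.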

[Package biotmle version 1.8.0 Index]