Help for package priorityelasticnet

Type:

Package

Title:

Comprehensive Analysis of Multi-Omics Data Using an Offset-Based Method

Version:

1.2.1

Description:

Priority-ElasticNet extends the Priority-LASSO method (Klau et al. (2018) <doi:10.1186/s12859-018-2344-6>) by incorporating the ElasticNet penalty, allowing for both L1 and L2 regularization. This approach fits successive ElasticNet models for several blocks of (omics) data with different priorities, using the predicted values from each block as an offset for the subsequent block. It also offers robust options to handle block-wise missingness in multi-omics data, improving the flexibility and applicability of the model in the presence of incomplete datasets.

Note:

Conceptual foundation and significant code structure inherited from the 'prioritylasso' package.

License:

GPL-3

Depends:

R (≥ 3.5.0)

Imports:

survival, glmnet (≥ 2.0-13), utils, checkmate, shiny, tidyr, dplyr, caret, pROC, PRROC, plotrix, ggplot2, magrittr, tibble, broom, cvms, glmSparseNet

Suggests:

ipflasso, rlang, knitr, rmarkdown

VignetteBuilder:

knitr

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.3

NeedsCompilation:

Packaged:

2026-02-11 10:57:42 UTC; eunicecarrasquinha

Author:

Laila Qadir Musib [aut], Helena Mouriño [aut], Eunice Carrasquinha [aut, cre]

Maintainer:

Eunice Carrasquinha <eitrigueirao@ciencias.ulisboa.pt>

Repository:

CRAN

Date/Publication:

2026-02-19 09:30:02 UTC

Simulated Patient Data for Binary Classification

Description

This dataset contains simulated data for a binary classification problem, representing patient data with clinical, proteomics, and RNA variables. The data is organized into three blocks of variables: clinical variables, proteomics variables, and RNA variables. The outcome is a binary variable generated based on a logistic function.

Usage

Pen_Data

Format

A data frame with 406 rows and 325 columns. The dataset includes:

5 clinical variables
174 proteomic variables
145 RNA variables
Outcome variable Pen_out

Source

Simulated dataset for package examples
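The block layout listed under Format translates directly into column index ranges. The following is a minimal sketch in base R that derives those ranges from the block sizes. The names `block_sizes`, `blocks` and `outcome_col` are illustrative, and the column order (clinical, proteomics, RNA, outcome last) is an assumption taken from the weightedThreshold example in this manual.

```r
# Block sizes as listed under Format (the names are illustrative).
block_sizes <- c(clinical = 5, proteomics = 174, rna = 145)

# Convert the sizes into consecutive column-index ranges.
ends   <- cumsum(block_sizes)
starts <- ends - block_sizes + 1
blocks <- Map(seq, starts, ends)

# Column index of the outcome, assumed to be the last column.
outcome_col <- sum(block_sizes) + 1

# First and last column index of each block.
sapply(blocks, range)
```

A list built this way can be passed as the blocks argument of priorityelasticnet.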

Extract coefficients from a priorityelasticnet object

Description

Extract coefficients from a priorityelasticnet object

Usage

## S3 method for class 'priorityelasticnet'
coef(object, ...)

Arguments

object

model of type priorityelasticnet

...

additional arguments, currently not used

Value

List with the coefficients and the intercepts
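A short usage sketch, assuming the package is installed. The simulated data, the two-block layout and the object names are made up for illustration, and only the calls documented in this manual are used. The exact layout of the returned list is not described here, so the sketch only inspects it with str().

```r
# Guarded so the snippet is harmless when the package is not installed.
if (requireNamespace("priorityelasticnet", quietly = TRUE)) {
  library(priorityelasticnet)
  set.seed(1)
  X <- matrix(rnorm(60 * 30), 60, 30)   # simulated predictors
  Y <- rnorm(60)                        # simulated continuous response

  fit <- priorityelasticnet(X, Y, family = "gaussian", type.measure = "mse",
                            blocks = list(bp1 = 1:10, bp2 = 11:30),
                            nfolds = 3)

  cf <- coef(fit)   # list with coefficients and intercepts
  str(cf)
}
```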

priorityelasticnet with several block specifications

Description

Runs priorityelasticnet for a list of block specifications and gives the best results in terms of cv error.

Usage

cvm_priorityelasticnet(
  X,
  Y,
  weights,
  family,
  type.measure,
  blocks.list,
  max.coef.list = NULL,
  block1.penalization = TRUE,
  lambda.type = "lambda.min",
  standardize = TRUE,
  nfolds = 10,
  foldid,
  cvoffset = FALSE,
  cvoffsetnfolds = 10,
  alpha = 1,
  ...
)

Arguments

X

A numeric matrix of predictors.

Y

A response vector. For family = "multinomial", Y should be a factor with more than two levels.

weights

Optional observation weights. Default is NULL.

family

A character string specifying the model type. Options are "gaussian", "binomial", "cox", and "multinomial". Default is "gaussian".

type.measure

Loss function for cross-validation. Options are "mse", "deviance", "class", "auc". Default depends on the family.

blocks.list

list of the format list(list(bp1 = ..., bp2 = ...), list(bp1 = ..., bp2 = ...), ...). For the specification of the entries, see priorityelasticnet.

max.coef.list

list of max.coef vectors. The first entries are omitted if block1.penalization = FALSE. Default is NULL.

block1.penalization

Logical. If FALSE, the first block will not be penalized. Default is TRUE.

lambda.type

Type of lambda to select. Options are "lambda.min" or "lambda.1se". Default is "lambda.min".

standardize

Logical flag for variable standardization, prior to fitting the model. Default is TRUE.

nfolds

Number of folds for cross-validation. Default is 10.

foldid

Optional vector of values between 1 and nfolds identifying what fold each observation is in. Default is NULL.

cvoffset

Logical. If TRUE, a cross-validated offset is used. Default is FALSE.

cvoffsetnfolds

Number of folds for cross-validation of the offset. Default is 10.

alpha

Elastic net mixing parameter. The elastic net penalty is defined as

(1 - \alpha)/2||\beta||_2^2 + \alpha||\beta||_1

Defaults to 1 (lasso penalty).

...

other arguments that can be passed to the function priorityelasticnet.
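To make the penalty in the alpha argument concrete, here is a small base R sketch that evaluates it for a given coefficient vector. The helper `enet_penalty` is illustrative and not part of this package. In glmnet the whole expression is additionally multiplied by the tuning parameter lambda, which is left out here.

```r
# Elastic net penalty for a coefficient vector `beta` and mixing parameter `alpha`:
# (1 - alpha)/2 * ||beta||_2^2 + alpha * ||beta||_1
# `enet_penalty` is an illustrative helper, not a function of this package.
enet_penalty <- function(beta, alpha) {
  stopifnot(alpha >= 0, alpha <= 1)
  (1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))
}

beta <- c(1, -2, 0)
enet_penalty(beta, alpha = 1)    # pure lasso: sum(|beta|) = 3
enet_penalty(beta, alpha = 0)    # pure ridge: sum(beta^2) / 2 = 2.5
enet_penalty(beta, alpha = 0.5)  # mixture: 0.25 * 5 + 0.5 * 3 = 2.75
```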

Value

object of class cvm_priorityelasticnet with the following elements. Any elements that are lists contain the detailed results (such as coefficients or cross-validation errors) corresponding to each penalized block of the final, optimal model.

lambda.ind: list with indices of lambda for lambda.type.
lambda.type: type of lambda which is used for the predictions.
lambda.min: list with values of lambda for lambda.type.
min.cvm: list with the mean cross-validated errors for lambda.type.
nzero: list with numbers of non-zero coefficients for lambda.type.
glmnet.fit: list of fitted glmnet objects.
name: a text string indicating type of measure.
block1unpen: if block1.penalization = FALSE, the results of either the fitted glm or coxph object.
best.blocks: character vector with the indices of the best block specification.
best.blocks.indices: list with the indices of the best block specification ordered by best to worst.
best.max.coef: vector with the maximum number of non-zero coefficients (selected variables) allowed for each penalized block in the model chosen as optimal by cross-validation (see best.blocks).
best.model: complete priorityelasticnet model of the best solution.
coefficients: coefficients according to the results obtained with best.blocks.
call: the function call.

Note

The function description and the first example are based on the R package ipflasso.
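The blocks.list argument is a list of block specifications, each of which is itself a list of column-index vectors in priority order. A minimal sketch of how such a list might be assembled in base R follows. The block sizes and the object names are illustrative, and the coverage check at the end is only a sanity check one might run, not a documented requirement.

```r
# Three blocks of column indices (the sizes are illustrative).
bp <- list(1:4, 5:9, 10:28)

# Two candidate priority orders: each inner list is one block specification.
blocks.list <- list(
  list(bp1 = bp[[1]], bp2 = bp[[2]], bp3 = bp[[3]]),
  list(bp1 = bp[[2]], bp2 = bp[[1]], bp3 = bp[[3]])
)

# Sanity check: does every specification cover columns 1..28 exactly once?
covers_all <- vapply(
  blocks.list,
  function(b) identical(sort(unlist(b, use.names = FALSE)), 1:28),
  logical(1)
)
covers_all
```

cvm_priorityelasticnet would then fit one model per specification and report the one with the best cross-validation error.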

Construct control structures for handling of missing data for priorityelasticnet

Description

Construct control structures for handling of missing data for priorityelasticnet

Usage

missing.control(
  handle.missingdata = c("none", "ignore", "impute.offset"),
  offset.firstblock = c("zero", "intercept"),
  impute.offset.cases = c("complete.cases", "available.cases"),
  nfolds.imputation = 10,
  lambda.imputation = c("lambda.min", "lambda.1se"),
  perc.comp.cases.warning = 0.3,
  threshold.available.cases = 30,
  select.available.cases = c("maximise.blocks", "max")
)

Arguments

handle.missingdata

how blockwise missing data should be treated. The default none does nothing, ignore ignores the observations with missing data for the current block, and impute.offset imputes the offset for the missing values.

offset.firstblock

for handle.missingdata = ignore, determines whether the offset of the first block for missing observations is zero or the intercept of the observed values.

impute.offset.cases

which cases/observations should be used for the imputation model to impute missing offsets. Supported are complete cases (additional constraint is that every observation can only contain one missing block) and all available observations which have an overlap with the current block.

nfolds.imputation

nfolds for the glmnet of the imputation model

lambda.imputation

which lambda-value should be used for predicting the imputed offsets in cv.glmnet

perc.comp.cases.warning

proportion of complete cases at which a warning about too few cases for the imputation model is issued

threshold.available.cases

if the number of available cases for impute.offset.cases = available.cases is below this threshold, priorityelasticnet tries to reduce the number of blocks taken into account for the imputation model to increase the number of observations used for the imputation model.

select.available.cases

determines how the blocks which are used for the imputation model are selected when impute.offset.cases = available.cases. max selects the blocks that maximise the number of observations, maximise.blocks tries to include as many blocks as possible, starting with the blocks with the highest priority.

Value

list with control parameters
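The options above refer to block-wise missingness, where an observation lacks an entire block rather than single values. The base R sketch below shows one way to inspect such a pattern before fitting. The data, the block layout and the object names are invented for illustration, and the code does not reproduce the package's internal checks.

```r
set.seed(42)
X <- matrix(rnorm(6 * 5), nrow = 6, ncol = 5)
blocks <- list(bp1 = 1:2, bp2 = 3:5)

# Make block 2 entirely missing for observations 2 and 5 (block-wise missingness).
X[c(2, 5), blocks$bp2] <- NA

# For each block: TRUE where the whole block is missing for an observation.
block_missing <- sapply(blocks, function(idx) {
  rowSums(is.na(X[, idx, drop = FALSE])) == length(idx)
})

# Flag partially missing blocks (some but not all values NA).
partial <- sapply(blocks, function(idx) {
  n_na <- rowSums(is.na(X[, idx, drop = FALSE]))
  n_na > 0 & n_na < length(idx)
})

block_missing
any(partial)
```

Rows flagged in `partial` would have single missing values within a block, which would need to be handled before the block-wise options apply.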

Predictions from priorityelasticnet

Description

Makes predictions for a priorityelasticnet object. The user can choose between linear predictors and fitted values.

Usage

## S3 method for class 'priorityelasticnet'
predict(
  object,
  newdata = NULL,
  type = c("link", "response"),
  handle.missingtestdata = c("none", "omit.prediction", "set.zero", "impute.block"),
  include.allintercepts = FALSE,
  use.blocks = "all",
  ...
)

Arguments

object

An object of class priorityelasticnet.

newdata

(nnew x p) matrix or data frame with new values.

type

Specifies the type of predictions. link gives the linear predictors for all types of response and response gives the fitted values.

handle.missingtestdata

Specifies how to deal with missing data in the test data; possibilities are none, omit.prediction, set.zero and impute.block

include.allintercepts

should the intercepts from all blocks be included in the prediction? If FALSE, only the intercept from the first block is included (default in the past).

use.blocks

determines which blocks are used for the prediction; the default is all. Otherwise, the numbers of the blocks to use can be given as a vector.

...

Further arguments passed to or from other methods.

Details

handle.missingtestdata specifies how to deal with missing data. The default none cannot handle missing data. omit.prediction does not make a prediction for observations with missing values and returns NA for them. set.zero ignores the missing data when calculating the prediction (the missing value is set to zero). impute.block uses an imputation model to impute the offset of a missing block. This only works if the priorityelasticnet object was fitted with handle.missingdata = "impute.offset". If impute.offset.cases = "complete.cases" was used, every observation can have at most one missing block; for observations with more than one missing block, NA is returned. If impute.offset.cases = "available.cases" was used, the missingness pattern in the test data has to be the same as in the training data; for observations with an unknown missingness pattern, NA is returned.

Value

Predictions that depend on type.
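For a binomial model, the two prediction types are related through the logistic link. The sketch below illustrates that relationship in base R with made-up linear predictors. It assumes the standard logit link used for binomial models in glmnet and does not call the package itself.

```r
# Hypothetical linear predictors, on the scale of type = "link".
lp <- c(-2, 0, 1.5)

# The inverse logit maps them to probabilities, the scale of type = "response".
prob <- plogis(lp)            # same as 1 / (1 + exp(-lp))
prob

# Going back from probabilities to the link scale recovers the input.
all.equal(qlogis(prob), lp)
```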

Examples


pl_bin <- priorityelasticnet(X = matrix(rnorm(50*190),50,190), Y = rbinom(50,1,0.5),
                       family = "binomial", type.measure = "auc",
                       blocks = list(block1=1:13,block2=14:80, block3=81:190),
                       block1.penalization = TRUE, lambda.type = "lambda.min",
                       standardize = FALSE, nfolds = 3, alpha = 1)

newdata_bin <- matrix(rnorm(10*190),10,190)

predict(object = pl_bin, newdata = newdata_bin, type = "response")

Priority Elastic Net for High-Dimensional Data

Description

This function performs penalized regression analysis using the elastic net method, tailored for high-dimensional data with a known group structure. Fits a priority elastic net model for multi-block or multi-omics data. An additional Shiny application for model evaluation and weighted threshold optimization is available through the weightedThreshold function.

Usage

priorityelasticnet(
  X,
  Y,
  weights = NULL,
  family = c("gaussian", "binomial", "cox", "multinomial"),
  alpha = 0.5,
  type.measure,
  blocks,
  max.coef = NULL,
  block1.penalization = TRUE,
  lambda.type = "lambda.min",
  standardize = TRUE,
  nfolds = 10,
  foldid = NULL,
  cvoffset = FALSE,
  cvoffsetnfolds = 10,
  mcontrol = missing.control(),
  scale.y = FALSE,
  return.x = TRUE,
  adaptive = FALSE,
  initial_global_weight = TRUE,
  verbose = FALSE,
  ...
)

Arguments

X

A numeric matrix of predictors.

Y

A response vector. For family = "multinomial", Y should be a factor with more than two levels.

weights

Optional observation weights. Default is NULL.

family

A character string specifying the model type. Options are "gaussian", "binomial", "cox", and "multinomial". Default is "gaussian".

alpha

The elastic net mixing parameter, with 0 \le \alpha \le 1. The penalty is defined as (1-\alpha)/2||\beta||_2^2 + \alpha||\beta||_1. Default is 0.5.

type.measure

Loss function for cross-validation. Options are "mse", "deviance", "class", "auc". Default depends on the family.

blocks

A list where each element is a vector of indices indicating the predictors in that block.

max.coef

A numeric vector specifying the maximum number of non-zero coefficients allowed in each block. Default is NULL, meaning no limit.

block1.penalization

Logical. If FALSE, the first block will not be penalized. Default is TRUE.

lambda.type

Type of lambda to select. Options are "lambda.min" or "lambda.1se". Default is "lambda.min".

standardize

Logical flag for variable standardization, prior to fitting the model. Default is TRUE.

nfolds

Number of folds for cross-validation. Default is 10.

foldid

Optional vector of values between 1 and nfolds identifying what fold each observation is in. Default is NULL.

cvoffset

Logical. If TRUE, a cross-validated offset is used. Default is FALSE.

cvoffsetnfolds

Number of folds for cross-validation of the offset. Default is 10.

mcontrol

Control parameters for handling missing data. Default is missing.control().

scale.y

Logical. If TRUE, the response variable Y is scaled. Default is FALSE.

return.x

Logical. If TRUE, the function returns the input matrix X. Default is TRUE.

adaptive

Logical. If TRUE, the adaptive elastic net is used, where penalties are adjusted based on the importance of the coefficients from an initial model fit. Default is FALSE.

initial_global_weight

Logical. If TRUE (the default), global initial weights will be calculated based on all predictors. If FALSE, initial weights will be calculated separately for each block.

verbose

Logical. If TRUE prints detailed logs of the process. Default is FALSE.

...

Additional arguments to be passed to cv.glmnet.

Value

A list with the following components:

lambda.ind

Indices of the selected lambda values.

lambda.type

Type of lambda used.

lambda.min

Selected lambda values.

min.cvm

Cross-validated performance measure (e.g., mean squared error, deviance, or AUC) for each block, depending on the value of type.measure.

nzero

Number of non-zero coefficients for each block.

glmnet.fit

Fitted glmnet objects for each block.

name

Name of the model.

block1unpen

Fitted model for the unpenalized first block, if applicable.

coefficients

Coefficients of the fitted models.

call

The function call.

X

The input matrix X, if return.x is TRUE.

missing.data

Logical vector indicating missing data.

imputation.models

Imputation models used, if applicable.

blocks.used.for.imputation

Blocks used for imputation, if applicable.

missingness.pattern

Pattern of missing data, if applicable.

y.scale.param

Parameters for scaling Y, if applicable.

blocks

The input blocks.

mcontrol

Control parameters for handling missing data.

family

The model family.

dim.x

Dimensions of the input matrix X.

Note

Ensure that glmnet version >= 2.0.13 is installed. The function does not support single missing values within a block.
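The version requirement in the Note can be checked from within R. The sketch below also shows that the "2.0-13" spelling used in the Imports field and the "2.0.13" spelling used in the Note denote the same version. The variable name `glmnet_ok` is illustrative, and the check is guarded so that it does nothing when glmnet is not installed.

```r
# R treats "-" and "." as equivalent separators in version strings.
same_version <- package_version("2.0-13") == package_version("2.0.13")
same_version

# Guarded check of the installed glmnet version.
if (requireNamespace("glmnet", quietly = TRUE)) {
  glmnet_ok <- utils::packageVersion("glmnet") >= "2.0.13"
  print(glmnet_ok)
}
```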

References

Musib, L., Coletti, R., Lopes, M.B., Mouriño, H. & Carrasquinha, E. (2024). Priority-Elastic net for binary disease outcome prediction based on multi-omics data. BioData Mining, 17(45). https://doi.org/10.1186/s13040-024-00401-0

Klau, S., Jurinovic, V., Hornung, R., Herold, T. & Boulesteix, A.-L. (2018). Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics, 19, 322. https://doi.org/10.1186/s12859-018-2344-6

Zou, H. & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00527.x

Examples



  # Simulation of multinomial data:
  set.seed(123)
  n <- 100
  p <- 50
  k <- 3
  x <- matrix(rnorm(n * p), n, p)
  y <- sample(1:k, n, replace = TRUE)
  y <- factor(y)
  blocks <- list(bp1 = 1:10, bp2 = 11:30, bp3 = 31:50)
  
  # Run priorityelasticnet:
  fit <- priorityelasticnet(x, y, family = "multinomial", alpha = 0.5, 
                     type.measure = "class", blocks = blocks,
                     block1.penalization = TRUE, lambda.type = "lambda.min", 
                     standardize = TRUE, nfolds = 5, 
                     adaptive = FALSE)
                     
   fit$coefficients
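The package description states that the predicted values from each block serve as an offset for the next block. The following base R sketch illustrates that idea with two blocks and ordinary, unpenalized logistic regression. It is a simplified stand-in under that assumption, not the package's implementation, which fits cross-validated elastic net models instead. All object names are illustrative.

```r
set.seed(1)
n  <- 200
x1 <- matrix(rnorm(n * 2), n, 2)   # block 1 (highest priority)
x2 <- matrix(rnorm(n * 3), n, 3)   # block 2
y  <- rbinom(n, 1, plogis(x1[, 1] - 0.5 * x2[, 2]))

# Step 1: fit block 1 alone and keep its linear predictor.
fit1    <- glm(y ~ x1, family = binomial())
offset1 <- fit1$linear.predictors

# Step 2: fit block 2 with the block 1 predictor as a fixed offset.
fit2 <- glm(y ~ x2, family = binomial(), offset = offset1)

# The final linear predictor is the offset plus the block 2 contribution.
final_lp <- fit2$linear.predictors
manual   <- offset1 + drop(cbind(1, x2) %*% coef(fit2))
all.equal(unname(final_lp), unname(manual))
```

Because the offset is held fixed, block 2 can only explain what block 1 left unexplained, which is how the priority order enters the model.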

A Shiny App for Model Evaluation and Weighted Threshold Optimization

Description

This function starts a Shiny application that enables users to interactively adjust the threshold for binary classification and view related metrics, the confusion matrix, ROC curve, and PR curve. The app also includes a feature for calculating the optimal threshold using a weighted version of Youden's J-statistic.

Usage

weightedThreshold(object, ...)

Arguments

object

A fitted object returned by priorityelasticnet with family = "binomial".

...

Additional arguments

Details

To calculate the optimal threshold, a weighted version of Youden's J-statistic (Youden, 1950) is used. The optimal cutoff is the threshold that maximizes the distance from the identity (diagonal) line. The function optimizes the metric (w * sensitivity + (1 - w) * specificity), where 'w' is the weight parameter adjusted using the second slider. After selecting the desired value on the optimal threshold slider, the user must press the "Set" button to update the threshold slider with the calculated optimal value. Metrics will then be automatically recalculated based on the user's selection. This function is adapted from 'Monahov, A. (2021). Model Evaluation with Weighted Threshold Optimization (and the “mewto” R package). Available at SSRN 3805911.'
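The weighted criterion described above can be sketched in a few lines of base R. The helper `weighted_youden`, the threshold grid and the simulated probabilities are illustrative assumptions. The app's own computation may differ in details such as the grid or tie handling.

```r
# `weighted_youden` is an illustrative helper, not part of the package.
weighted_youden <- function(prob, y, w = 0.5,
                            thresholds = seq(0.01, 0.99, by = 0.01)) {
  score <- vapply(thresholds, function(t) {
    pred <- as.integer(prob >= t)
    sens <- sum(pred == 1 & y == 1) / sum(y == 1)   # true positive rate
    spec <- sum(pred == 0 & y == 0) / sum(y == 0)   # true negative rate
    w * sens + (1 - w) * spec
  }, numeric(1))
  best <- which.max(score)                          # first maximum on the grid
  c(threshold = thresholds[best], score = score[best])
}

set.seed(7)
y    <- rbinom(300, 1, 0.4)
prob <- plogis(2 * y - 1 + rnorm(300))   # simulated predicted probabilities

res_equal <- weighted_youden(prob, y, w = 0.5)
res_sens  <- weighted_youden(prob, y, w = 0.8)  # more weight on sensitivity
res_equal
res_sens
```

Putting more weight on sensitivity cannot raise the selected cutoff, since lower thresholds classify more observations as positive.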

Reference: Musib, L., Coletti, R., Lopes, M.B., Mouriño, H. & Carrasquinha, E. (2024). Priority-Elastic net for binary disease outcome prediction based on multi-omics data. BioData Mining, 17(45). https://doi.org/10.1186/s13040-024-00401-0

Value

No return value. This function launches a Shiny application for interactive model evaluation with weighted threshold optimization. It is intended for user interaction rather than returning a computed value. The Shiny app provides an interactive interface to visualize model performance metrics and optimize thresholds for classification models based on user-defined criteria.

Examples



  # Load the simulated Pen_Data dataset:
  data("Pen_Data", package = "priorityelasticnet")
  blocks <- list(block1 = 1:5, block2 = 6:179, block3 = 180:324)
  
  # Run priorityelasticnet:
  fit_bin <- priorityelasticnet(X = as.matrix(Pen_Data[, 1:324]), 
                                Y = Pen_Data[, 325],
                                family = "binomial", 
                                alpha = 0.5, 
                                type.measure = "auc",
                                blocks = blocks,
                                standardize = FALSE)
                     
   weightedThreshold(fit_bin)
