| Type: | Package |
| Title: | Comprehensive Analysis of Multi-Omics Data Using an Offset-Based Method |
| Version: | 1.2.1 |
| Description: | Priority-ElasticNet extends the Priority-LASSO method (Klau et al. (2018) <doi:10.1186/s12859-018-2344-6>) by incorporating the ElasticNet penalty, allowing for both L1 and L2 regularization. This approach fits successive ElasticNet models for several blocks of (omics) data with different priorities, using the predicted values from each block as an offset for the subsequent block. It also offers robust options to handle block-wise missingness in multi-omics data, improving the flexibility and applicability of the model in the presence of incomplete datasets. |
| Note: | conceptual foundation and significant code structure inherited from the 'prioritylasso' package. |
| License: | GPL-3 |
| Depends: | R (≥ 3.5.0) |
| Imports: | survival, glmnet (≥ 2.0-13), utils, checkmate, shiny, tidyr, dplyr, caret, pROC, PRROC, plotrix, ggplot2, magrittr, tibble, broom, cvms, glmSparseNet |
| Suggests: | ipflasso, rlang, knitr, rmarkdown |
| VignetteBuilder: | knitr |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-02-11 10:57:42 UTC; eunicecarrasquinha |
| Author: | Laila Qadir Musib [aut], Helena Mouriño [aut], Eunice Carrasquinha [aut, cre] |
| Maintainer: | Eunice Carrasquinha <eitrigueirao@ciencias.ulisboa.pt> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-19 09:30:02 UTC |
Simulated Patient Data for Binary Classification
Description
This dataset contains simulated data for a binary classification problem, representing patient data with clinical, proteomics, and RNA variables. The data is organized into three blocks of variables: clinical variables, proteomics variables, and RNA variables. The outcome is a binary variable generated based on a logistic function.
Usage
Pen_Data
Format
A data frame with 406 rows and 325 columns:
5 clinical variables
174 proteomic variables
145 RNA variables
1 outcome variable, Pen_out
Source
Simulated dataset for package examples
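The block sizes listed under Format translate directly into the index vectors expected by the blocks argument of priorityelasticnet. The following is a minimal base R sketch, assuming the columns are ordered clinical, proteomics, RNA, and then the outcome (as the package examples suggest); the names block_sizes, starts, ends, and blocks are illustrative only.

```r
# Block sizes as listed in the Format section (assumed column order:
# clinical, proteomics, RNA, then the outcome in column 325).
block_sizes <- c(clinical = 5, proteomics = 174, rna = 145)

# Last and first column index of each block.
ends <- cumsum(block_sizes)
starts <- ends - block_sizes + 1

# One integer index vector per block, named after the block.
blocks <- Map(function(from, to) from:to, starts, ends)

str(blocks)
```

The resulting list covers columns 1 to 324, leaving column 325 for the outcome.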
Extract coefficients from a priorityelasticnet object
Description
Extract coefficients from a priorityelasticnet object
Usage
## S3 method for class 'priorityelasticnet'
coef(object, ...)
Arguments
object |
model of type priorityelasticnet |
... |
additional arguments, currently not used |
Value
List with the coefficients and the intercepts
priorityelasticnet with several block specifications
Description
Runs priorityelasticnet for a list of block specifications and returns the results for the specification with the best cross-validation performance.
Usage
cvm_priorityelasticnet(
X,
Y,
weights,
family,
type.measure,
blocks.list,
max.coef.list = NULL,
block1.penalization = TRUE,
lambda.type = "lambda.min",
standardize = TRUE,
nfolds = 10,
foldid,
cvoffset = FALSE,
cvoffsetnfolds = 10,
alpha = 1,
...
)
Arguments
X |
A numeric matrix of predictors. |
Y |
A response vector. For family = "multinomial", Y should be a factor with more than two levels. |
weights |
Optional observation weights. Default is NULL. |
family |
A character string specifying the model type. Options are "gaussian", "binomial", "cox", and "multinomial". Default is "gaussian". |
type.measure |
Loss function for cross-validation. Options are "mse", "deviance", "class", "auc". Default depends on the family. |
blocks.list |
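A list of block specifications, each in the format of the blocks argument of priorityelasticnet (a list of index vectors, one per block). |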
max.coef.list |
A list of max.coef vectors, one per entry of blocks.list. Default is NULL. |
block1.penalization |
Logical. If FALSE, the first block will not be penalized. Default is TRUE. |
lambda.type |
Type of lambda to select. Options are "lambda.min" or "lambda.1se". Default is "lambda.min". |
standardize |
Logical flag for variable standardization, prior to fitting the model. Default is TRUE. |
nfolds |
Number of folds for cross-validation. Default is 10. |
foldid |
Optional vector of values between 1 and nfolds identifying the fold each observation belongs to. |
cvoffset |
Logical. If TRUE, a cross-validated offset is used. Default is FALSE. |
cvoffsetnfolds |
Number of folds for cross-validation of the offset. Default is 10. |
alpha |
Elastic net mixing parameter. The elastic net penalty is defined as $(1-\alpha)/2\,\|\beta\|_2^2 + \alpha\,\|\beta\|_1$.
Defaults to 1 (lasso penalty). |
... |
other arguments that can be passed to the function |
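The role of alpha can be illustrated numerically. The sketch below assumes the usual glmnet parameterization of the penalty, (1 - alpha)/2 * sum(beta^2) + alpha * sum(abs(beta)); the helper enet_penalty and the coefficient vector beta are hypothetical and not part of the package.

```r
# Elastic net penalty for a coefficient vector (assumed glmnet form).
enet_penalty <- function(beta, alpha = 1) {
  stopifnot(alpha >= 0, alpha <= 1)
  (1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))
}

beta <- c(0.5, -1, 0, 2)

enet_penalty(beta, alpha = 1)    # pure lasso: sum of absolute values
enet_penalty(beta, alpha = 0)    # pure ridge: half the sum of squares
enet_penalty(beta, alpha = 0.5)  # equal mix of both terms
```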
Value
object of class cvm_priorityelasticnet with the following elements. Any elements that are lists contain the detailed results (such as coefficients or cross-validation errors) corresponding to each penalized block of the final, optimal model.
lambda.ind |
list with indices of lambda for lambda.type. |
lambda.type |
type of lambda which is used for the predictions. |
lambda.min |
list with values of lambda for lambda.type. |
min.cvm |
list with the mean cross-validated errors for lambda.type. |
nzero |
list with numbers of non-zero coefficients for lambda.type. |
glmnet.fit |
list of fitted glmnet objects. |
name |
a text string indicating type of measure. |
block1unpen |
if block1.penalization = FALSE, the results of either the fitted glm or coxph object. |
best.blocks |
character vector with the indices of the best block specification. |
best.blocks.indices |
list with the indices of the best block specification ordered by best to worst. |
best.max.coef |
A vector containing the maximum number of non-zero coefficients (selected variables) allowed for each penalized block, corresponding to the model chosen as optimal by cross-validation (best.blocks). |
best.model |
complete priorityelasticnet model of the best solution. |
coefficients |
coefficients according to the results obtained with best.blocks. |
call |
the function call. |
Note
The function description and the first example are based on the R package ipflasso.
Construct control structures for handling of missing data for priorityelasticnet
Description
Construct control structures for handling of missing data for priorityelasticnet
Usage
missing.control(
handle.missingdata = c("none", "ignore", "impute.offset"),
offset.firstblock = c("zero", "intercept"),
impute.offset.cases = c("complete.cases", "available.cases"),
nfolds.imputation = 10,
lambda.imputation = c("lambda.min", "lambda.1se"),
perc.comp.cases.warning = 0.3,
threshold.available.cases = 30,
select.available.cases = c("maximise.blocks", "max")
)
Arguments
handle.missingdata |
how blockwise missing data should be treated. Default is "none"; the other options are "ignore" and "impute.offset". |
offset.firstblock |
determines if the offset of the first block for missing observations is zero or the intercept of the observed values for handle.missingdata = "ignore". |
impute.offset.cases |
which cases/observations should be used for the imputation model to impute missing offsets. Supported are complete cases (additional constraint is that every observation can only contain one missing block) and all available observations which have an overlap with the current block. |
nfolds.imputation |
nfolds for the glmnet of the imputation model |
lambda.imputation |
which lambda-value should be used for predicting the imputed offsets in cv.glmnet |
perc.comp.cases.warning |
proportion of complete cases below which a warning about too few cases for the imputation model is issued |
threshold.available.cases |
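if the number of available cases for impute.offset.cases = "available.cases" is below this threshold, the function tries to reduce the number of blocks used for the imputation model in order to increase the number of observations available to it. |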
select.available.cases |
determines how the blocks which are used for the imputation model are selected when impute.offset.cases = "available.cases". Options are "maximise.blocks" and "max". |
Value
list with control parameters
Predictions from priorityelasticnet
Description
Makes predictions for a priorityelasticnet object. The predictions can be either linear predictors or fitted values.
Usage
## S3 method for class 'priorityelasticnet'
predict(
object,
newdata = NULL,
type = c("link", "response"),
handle.missingtestdata = c("none", "omit.prediction", "set.zero", "impute.block"),
include.allintercepts = FALSE,
use.blocks = "all",
...
)
Arguments
object |
An object of class priorityelasticnet. |
newdata |
(nnew x p) matrix or data frame with new values. |
type |
Specifies the type of predictions. |
handle.missingtestdata |
Specifies how to deal with missing data in the test data; possibilities are "none", "omit.prediction", "set.zero" and "impute.block" (see Details). |
include.allintercepts |
Should the intercepts from all blocks be included in the prediction? If FALSE (the default), only the intercept from the first block is included. |
use.blocks |
determines which blocks are used for the prediction; the default is all. Otherwise, the blocks to use can be given by number in a vector, e.g. c(1, 2). |
... |
Further arguments passed to or from other methods. |
Details
handle.missingtestdata specifies how to deal with missing data.
The default none cannot handle missing data. omit.prediction does not make a prediction for observations with missing values and returns NA. set.zero ignores the missing data for the calculation of the prediction (the missing value is set to zero).
impute.block uses an imputation model to impute the offset of a missing block. This only works if the priorityelasticnet object was fitted with handle.missingdata = "impute.offset".
If impute.offset.cases = "complete.cases" was used, then every observation can have only one missing block. For observations with more than one missing block, NA is returned.
If impute.offset.cases = "available.cases" was used, the missingness pattern in the test data has to be the same as in the train data. For observations with an unknown missingness pattern, NA is returned.
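The difference between omit.prediction and set.zero can be illustrated on a plain linear predictor. This is a base R sketch of the behaviour described above, not the package's internal code; the helper lp_missing, the coefficients beta, and the toy matrix X are made up for illustration.

```r
# Linear predictor under two strategies for missing test data.
lp_missing <- function(X, beta, intercept = 0,
                       how = c("omit.prediction", "set.zero")) {
  how <- match.arg(how)
  incomplete <- !stats::complete.cases(X)  # rows with at least one NA
  X0 <- X
  X0[is.na(X0)] <- 0                       # missing values contribute nothing
  lp <- drop(X0 %*% beta) + intercept
  if (how == "omit.prediction") lp[incomplete] <- NA_real_
  lp
}

X <- rbind(c(1, 2, 3, 4),
           c(1, 2, NA, NA))  # second observation lacks block 2 (columns 3-4)
beta <- c(0.5, -0.25, 1, 2)

lp_missing(X, beta, how = "omit.prediction")  # NA for the incomplete row
lp_missing(X, beta, how = "set.zero")         # block 2 dropped for that row
```

With set.zero, the second observation is predicted from its observed block alone, whereas omit.prediction declines to predict for it.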
Value
Predictions that depend on type.
Examples
set.seed(1)  # make the simulated data reproducible
pl_bin <- priorityelasticnet(X = matrix(rnorm(50 * 190), 50, 190),
                             Y = rbinom(50, 1, 0.5),
                             family = "binomial", type.measure = "auc",
                             blocks = list(block1 = 1:13, block2 = 14:80,
                                           block3 = 81:190),
                             block1.penalization = TRUE,
                             lambda.type = "lambda.min",
                             standardize = FALSE, nfolds = 3, alpha = 1)
# Predicted probabilities for 10 new observations:
newdata_bin <- matrix(rnorm(10 * 190), 10, 190)
predict(object = pl_bin, newdata = newdata_bin, type = "response")
Priority Elastic Net for High-Dimensional Data
Description
Fits a priority elastic net model for multi-block or multi-omics data, that is, penalized elastic net regression tailored to high-dimensional data with a known group (block) structure. A Shiny application for model evaluation and weighted threshold optimization is available through the weightedThreshold function.
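The offset idea behind the priority approach (each block is fitted with the linear predictor of the previous blocks as an offset) can be sketched in base R. The example below deliberately swaps the penalized elastic net fits for plain, unpenalized glm() fits so that it runs without additional packages; it illustrates the chaining only and is not the package's implementation. All object names and the simulated data are assumptions made for the illustration.

```r
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 6), n, 6)
y <- rbinom(n, 1, plogis(X[, 1] - 0.5 * X[, 4]))

# Blocks in priority order: block1 is fitted first.
blocks <- list(block1 = 1:2, block2 = 3:6)

off <- rep(0, n)  # no offset before the first block
fits <- list()
for (b in names(blocks)) {
  Xb <- X[, blocks[[b]], drop = FALSE]
  # Unpenalized stand-in for the elastic net fit of this block.
  fits[[b]] <- glm(y ~ Xb, family = binomial(), offset = off)
  # The linear predictor already contains the incoming offset,
  # so it carries forward everything explained so far.
  off <- fits[[b]]$linear.predictors
}

# Final fitted probabilities after all blocks.
head(plogis(off))
```

Because lower-priority blocks only explain what earlier blocks left unexplained, the order of the blocks matters.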
Usage
priorityelasticnet(
X,
Y,
weights = NULL,
family = c("gaussian", "binomial", "cox", "multinomial"),
alpha = 0.5,
type.measure,
blocks,
max.coef = NULL,
block1.penalization = TRUE,
lambda.type = "lambda.min",
standardize = TRUE,
nfolds = 10,
foldid = NULL,
cvoffset = FALSE,
cvoffsetnfolds = 10,
mcontrol = missing.control(),
scale.y = FALSE,
return.x = TRUE,
adaptive = FALSE,
initial_global_weight = TRUE,
verbose = FALSE,
...
)
Arguments
X |
A numeric matrix of predictors. |
Y |
A response vector. For family = "multinomial", Y should be a factor with more than two levels. |
weights |
Optional observation weights. Default is NULL. |
family |
A character string specifying the model type. Options are "gaussian", "binomial", "cox", and "multinomial". Default is "gaussian". |
alpha |
The elastic net mixing parameter, with 0 ≤ alpha ≤ 1: alpha = 1 is the lasso penalty and alpha = 0 the ridge penalty. Default is 0.5. |
type.measure |
Loss function for cross-validation. Options are "mse", "deviance", "class", "auc". Default depends on the family. |
blocks |
A list where each element is a vector of indices indicating the predictors in that block. |
max.coef |
A numeric vector specifying the maximum number of non-zero coefficients allowed in each block. Default is NULL, meaning no limit. |
block1.penalization |
Logical. If FALSE, the first block will not be penalized. Default is TRUE. |
lambda.type |
Type of lambda to select. Options are "lambda.min" or "lambda.1se". Default is "lambda.min". |
standardize |
Logical flag for variable standardization, prior to fitting the model. Default is TRUE. |
nfolds |
Number of folds for cross-validation. Default is 10. |
foldid |
Optional vector of values between 1 and nfolds identifying the fold each observation belongs to. |
cvoffset |
Logical. If TRUE, a cross-validated offset is used. Default is FALSE. |
cvoffsetnfolds |
Number of folds for cross-validation of the offset. Default is 10. |
mcontrol |
Control parameters for handling missing data. Default is missing.control(). |
scale.y |
Logical. If TRUE, the response variable Y is scaled. Default is FALSE. |
return.x |
Logical. If TRUE, the function returns the input matrix X. Default is TRUE. |
adaptive |
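Logical. If TRUE, an adaptive elastic net is fitted, with penalty weights derived from initial estimates (see initial_global_weight). Default is FALSE. |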
initial_global_weight |
Logical. If TRUE (the default), global initial weights will be calculated based on all predictors. If FALSE, initial weights will be calculated separately for each block. |
verbose |
Logical. If TRUE prints detailed logs of the process. Default is FALSE. |
... |
Additional arguments to be passed to cv.glmnet. |
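To keep cross-validation results comparable across calls, the fold assignment can be fixed through foldid. The following is a minimal base R sketch, assuming foldid follows the usual glmnet convention of one integer fold label per observation; n and nfolds are example values.

```r
set.seed(42)
n <- 100      # number of observations (example value)
nfolds <- 5   # number of folds (example value)

# Balanced fold labels in random order, one per observation.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)
```

The resulting vector can then be supplied as foldid = foldid in place of nfolds.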
Value
A list with the following components:
lambda.ind |
Indices of the selected lambda values. |
lambda.type |
Type of lambda used. |
lambda.min |
Selected lambda values. |
min.cvm |
Cross-validated performance measure (e.g., mean squared error, deviance, or AUC) for each block, depending on the value of type.measure. |
nzero |
Number of non-zero coefficients for each block. |
glmnet.fit |
Fitted glmnet objects. |
name |
Name of the model. |
block1unpen |
Fitted model for the unpenalized first block, if applicable. |
coefficients |
Coefficients of the fitted models. |
call |
The function call. |
X |
The input matrix X, if return.x = TRUE. |
missing.data |
Logical vector indicating missing data. |
imputation.models |
Imputation models used, if applicable. |
blocks.used.for.imputation |
Blocks used for imputation, if applicable. |
missingness.pattern |
Pattern of missing data, if applicable. |
y.scale.param |
Parameters for scaling Y, if applicable. |
blocks |
The input blocks. |
mcontrol |
Control parameters for handling missing data. |
family |
The model family. |
dim.x |
Dimensions of the input matrix X. |
Note
Ensure that glmnet version >= 2.0.13 is installed. The function does not support single missing values within a block.
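Because missing values are only supported block-wise, it can help to inspect the missingness pattern before fitting. The base R sketch below classifies each observation, per block, as complete, entirely missing, or partially missing; only the last case conflicts with the note above. The helper block_missingness and the toy data are hypothetical and not part of the package.

```r
# Classify the missingness of every observation within every block.
block_missingness <- function(X, blocks) {
  sapply(blocks, function(idx) {
    n_na <- rowSums(is.na(X[, idx, drop = FALSE]))
    ifelse(n_na == 0, "complete",
           ifelse(n_na == length(idx), "missing", "partial"))
  })
}

X <- matrix(1, nrow = 3, ncol = 5)
X[2, 4:5] <- NA  # observation 2 lacks block 2 entirely (block-wise missing)
X[3, 1] <- NA    # observation 3 has a single NA inside block 1
blocks <- list(block1 = 1:3, block2 = 4:5)

pattern <- block_missingness(X, blocks)
pattern

# TRUE if some block contains isolated missing values.
any(pattern == "partial")
```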
References
Musib, L., Coletti, R., Lopes, M.B., Mouriño, H. & Carrasquinha, E. (2024). Priority-Elastic net for binary disease outcome prediction based on multi-omics data. BioData Mining, 17(45). https://doi.org/10.1186/s13040-024-00401-0
Klau, S., Jurinovic, V., Hornung, R., Herold, T. & Boulesteix, A.-L. (2018). Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics, 19, 322. https://doi.org/10.1186/s12859-018-2344-6
Zou, H. & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00527.x
Examples
# Simulation of multinomial data:
set.seed(123)
n <- 100
p <- 50
k <- 3
x <- matrix(rnorm(n * p), n, p)
y <- sample(1:k, n, replace = TRUE)
y <- factor(y)
blocks <- list(bp1 = 1:10, bp2 = 11:30, bp3 = 31:50)
# Run priorityelasticnet:
fit <- priorityelasticnet(x, y, family = "multinomial", alpha = 0.5,
                          type.measure = "class", blocks = blocks,
                          block1.penalization = TRUE, lambda.type = "lambda.min",
                          standardize = TRUE, nfolds = 5,
                          adaptive = FALSE)
fit$coefficients
A Shiny App for Model Evaluation and Weighted Threshold Optimization
Description
This function starts a Shiny application that enables users to interactively adjust the threshold for binary classification and view related metrics, the confusion matrix, ROC curve, and PR curve. The app also includes a feature for calculating the optimal threshold using a weighted version of Youden's J-statistic.
Usage
weightedThreshold(object, ...)
Arguments
object |
A result from priorityelasticnet function with binomial model family. |
... |
Additional arguments |
Details
To calculate the optimal threshold, a weighted version of Youden's J-statistic (Youden, 1950) is used. The optimal cutoff is the threshold at which the ROC curve lies farthest from the identity (diagonal) line. The function optimizes the metric w * sensitivity + (1 - w) * specificity, where w is the weight parameter adjusted using the second slider. After selecting the desired value on the optimal threshold slider, the user must press the "Set" button to update the threshold slider with the calculated optimal value. Metrics are then recalculated automatically based on the user's selection. This function is adapted from Monahov, A. (2021). Model Evaluation with Weighted Threshold Optimization (and the "mewto" R package). Available at SSRN 3805911.
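The weighted criterion described above can be sketched in a few lines of base R. This is an illustrative grid search over candidate thresholds, not the code used inside the Shiny app; the helper weighted_threshold, the grid, and the simulated data are assumptions.

```r
# Threshold maximizing w * sensitivity + (1 - w) * specificity.
weighted_threshold <- function(prob, y, w = 0.5,
                               grid = seq(0, 1, by = 0.01)) {
  score <- sapply(grid, function(t) {
    pred <- as.integer(prob >= t)
    sens <- sum(pred == 1 & y == 1) / sum(y == 1)
    spec <- sum(pred == 0 & y == 0) / sum(y == 0)
    w * sens + (1 - w) * spec
  })
  grid[which.max(score)]  # first (lowest) maximizer on the grid
}

set.seed(7)
y <- rbinom(200, 1, 0.4)
prob <- plogis(2 * y - 1 + rnorm(200))  # simulated predicted probabilities

weighted_threshold(prob, y, w = 0.5)  # classical Youden weighting
weighted_threshold(prob, y, w = 0.8)  # more weight on sensitivity
```

With w = 0.5 the criterion is equivalent to maximizing Youden's J, since J = sensitivity + specificity - 1.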
References
Musib, L., Coletti, R., Lopes, M.B., Mouriño, H. & Carrasquinha, E. (2024). Priority-Elastic net for binary disease outcome prediction based on multi-omics data. BioData Mining, 17(45). https://doi.org/10.1186/s13040-024-00401-0
Value
No return value. This function launches a Shiny application for interactive model evaluation with weighted threshold optimization. It is intended for user interaction rather than returning a computed value. The Shiny app provides an interactive interface to visualize model performance metrics and optimize thresholds for classification models based on user-defined criteria.
Examples
# Load the simulated Pen_Data dataset:
data("Pen_Data", package = "priorityelasticnet")
blocks <- list(block1 = 1:5, block2 = 6:179, block3 = 180:324)
# Run priorityelasticnet:
fit_bin <- priorityelasticnet(X = as.matrix(Pen_Data[, 1:324]),
                              Y = Pen_Data[, 325],
                              family = "binomial",
                              alpha = 0.5,
                              type.measure = "auc",
                              blocks = blocks,
                              standardize = FALSE)
# Launch the Shiny app (interactive sessions only):
if (interactive()) weightedThreshold(fit_bin)