plotLoadings {mixOmics} | R Documentation |
This function provides a horizontal bar plot to visualise loading vectors. For discriminant analysis, it provides visualisation of highest or lowest mean/median value of the variables with color code corresponding to the outcome of interest.
## S3 method for class 'mixo_pls'
plotLoadings(object, block, comp = 1, col = NULL, ndisplay = NULL,
  size.name = 0.7, name.var = NULL, name.var.complete = FALSE,
  title = NULL, subtitle, size.title = rel(2), size.subtitle = rel(1.5),
  layout = NULL, border = NA, xlim = NULL, ...)

## S3 method for class 'mint.pls'
plotLoadings(object, study = "global", comp = 1, col = NULL, ndisplay = NULL,
  size.name = 0.7, name.var = NULL, name.var.complete = FALSE,
  title = NULL, subtitle, size.title = rel(1.8), size.subtitle = rel(1.4),
  layout = NULL, border = NA, xlim = NULL, ...)

## S3 method for class 'mixo_plsda'
plotLoadings(object, contrib, method = "mean", block, comp = 1, plot = TRUE,
  show.ties = TRUE, col.ties = "white", ndisplay = NULL, size.name = 0.7,
  size.legend = 0.8, name.var = NULL, name.var.complete = FALSE,
  title = NULL, subtitle, size.title = rel(1.8), size.subtitle = rel(1.4),
  legend = TRUE, legend.color = NULL, legend.title = 'Outcome',
  layout = NULL, border = NA, xlim = NULL, ...)

## S3 method for class 'mint.plsda'
plotLoadings(object, contrib = NULL, method = "mean", study = "global",
  comp = 1, plot = TRUE, show.ties = TRUE, col.ties = "white",
  ndisplay = NULL, size.name = 0.7, size.legend = 0.8, name.var = NULL,
  name.var.complete = FALSE, title = NULL, subtitle,
  size.title = rel(1.8), size.subtitle = rel(1.4), legend = TRUE,
  legend.color = NULL, legend.title = 'Outcome', layout = NULL,
  border = NA, xlim = NULL, ...)
object |
object |
contrib |
a character set to 'max' or 'min' indicating if the color of the bar should correspond to the group with the maximal or minimal expression levels / abundance. |
method |
a character set to 'mean' or 'median' indicating the criterion to assess the contribution. We recommend using median in the case of count or skewed data. |
study |
Indicates which studies are to be plotted. A character vector containing some levels of object$study, or "global". |
block |
A single value indicating which block to consider in a |
comp |
integer value indicating the component of interest from the object. |
col |
color used in the barplot, only for objects from non-discriminant analyses |
plot |
Boolean indicating if the plot should be output. If set to FALSE the user can extract the contribution matrix, see example. Default value is TRUE. |
show.ties |
Boolean. If TRUE then tie groups appear in the color set by col.ties. |
col.ties |
Color corresponding to ties, only used if show.ties = TRUE. |
ndisplay |
integer indicating how many of the most important variables are to be plotted (ranked by decreasing weights in each PLS-component). Useful to lighten a graph. |
size.name |
A numerical value giving the amount by which plotting the variable name text should be magnified or reduced relative to the default. |
size.legend |
A numerical value giving the amount by which plotting the legend text should be magnified or reduced relative to the default. |
name.var |
A character vector indicating the names of the variables. The names of the vector should match the names of the input data, see example. |
name.var.complete |
Boolean. If |
title |
A set of characters to indicate the title of the plot. Default value is NULL. |
subtitle |
subtitle for each plot, only used when several |
size.title |
size of the title |
size.subtitle |
size of the subtitle |
legend |
Boolean indicating if the legend indicating the group outcomes should be added to the plot. Default value is TRUE. |
legend.color |
A color vector of length the number of group outcomes. See examples. |
legend.title |
A set of characters to indicate the title of the legend. Default value is 'Outcome'. |
layout |
Vector of two values (rows,cols) that indicates the layout of the plot. If |
border |
Argument from |
xlim |
Argument from |
... |
not used. |
The contribution of each variable for each component (depending on the object) is represented in a barplot where each bar length corresponds to the loading weight (importance) of the feature. The loading weight can be positive or negative.
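As a rough illustration of the bar plot described above, the following base R sketch draws horizontal bars from a made-up loading vector and keeps only the variables with the largest absolute weights. The vector `loadings`, its values, and the variable names are hypothetical; this is not the internal code of plotLoadings, only a sketch of the idea behind ndisplay and the signed bar lengths.

```r
# Hypothetical loading vector for one component (values are made up)
loadings <- c(geneA = 0.62, geneB = -0.48, geneC = 0.05,
              geneD = -0.31, geneE = 0.12)

# Keep the ndisplay variables with the largest absolute weight
ndisplay <- 3
top <- loadings[order(abs(loadings), decreasing = TRUE)][seq_len(ndisplay)]

# Horizontal bars: bar length is the loading weight, which may be
# positive or negative. rev() is a choice made in this sketch so that
# the largest bar is drawn at the top.
barplot(rev(top), horiz = TRUE, las = 1, border = NA, col = "grey40",
        xlab = "Loading weight")
```

With an actual fitted object, the same selection is requested through the ndisplay argument rather than computed by hand.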
For discriminant analysis, the color corresponds to the group in which the feature is most 'abundant'. Note that this type of graphical output is particularly insightful for microbial count data; in that case, using method = 'median' is advised.
Note also that if the parameter contrib is not provided, the bars are plotted in white.
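The colour rule for discriminant analysis can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy matrix `X`, the factor `Y`, the colour names, and the "tie" label are all hypothetical. It mimics contrib = "max" with method = "median", and colours ties white as in the col.ties default.

```r
# Toy data: 6 samples in 2 groups, 3 variables (values are made up)
X <- matrix(c(1, 2, 3, 10, 11, 12,
              5, 5, 5,  5,  5,  5,
              9, 8, 7,  1,  2,  3),
            nrow = 6,
            dimnames = list(NULL, c("v1", "v2", "v3")))
Y <- factor(rep(c("ctrl", "treated"), each = 3))

# Median of each variable within each group (rows = groups, cols = variables)
group_median <- apply(X, 2, function(v) tapply(v, Y, median))

# Group with the maximal median per variable; equal maxima are flagged as ties
contrib_group <- apply(group_median, 2, function(m) {
  winners <- rownames(group_median)[m == max(m)]
  if (length(winners) > 1) "tie" else winners
})

# Map groups to bar colours; ties get white
legend_color <- c(ctrl = "steelblue", treated = "orange")
bar_color <- ifelse(contrib_group == "tie", "white",
                    legend_color[contrib_group])
```

Replacing `median` with `mean`, or `max` with `min`, gives the other combinations of method and contrib.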
For MINT analysis, study = "global" plots the global loadings while partial loadings are plotted when study is a level of object$study. Since variable selection in MINT is performed at the global level, only the selected variables are plotted for the partial loadings even if the partial loadings are not sparse. See references.
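The restriction of partial loadings to the globally selected variables can be pictured with a small base R sketch. The two vectors below are hypothetical numbers, not output from a MINT model, and the subsetting is only an assumption about how the idea could be expressed.

```r
# Hypothetical global loadings: sparse, two variables were selected
global <- c(g1 = 0.7, g2 = 0, g3 = -0.5, g4 = 0)

# Hypothetical partial loadings for one study: not sparse
partial_study1 <- c(g1 = 0.6, g2 = 0.1, g3 = -0.4, g4 = 0.05)

# Variables selected at the global level (non-zero global loading)
selected <- names(global)[global != 0]

# Only these variables would appear in the partial loading plot
shown <- partial_study1[selected]
```

Here `shown` holds the values for g1 and g3 only, although g2 and g4 have non-zero partial loadings.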
Importantly for multi plots, the legend accounts for one subplot in the layout design.
none
Florian Rohart, Kim-Anh Lê Cao, Benoit Gautier
Rohart F. et al (2016, submitted). MINT: A multivariate integrative approach to identify a reproducible biomarker signature across multiple experiments and platforms.
Eslami, A., Qannari, E. M., Kohler, A., and Bougeard, S. (2013). Multi-group PLS Regression: Application to Epidemiology. In New Perspectives in Partial Least Squares and Related Methods, pages 243-255. Springer.
Singh A., Gautier B., Shannon C., Vacher M., Rohart F., Tebbutt S. and Lê Cao K.A. (2016). DIABLO - multi omics integration for biomarker discovery.
Lê Cao, K.-A., Martin, P.G.P., Robert-Granie, C. and Besse, P. (2009). Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics 10:34.
Tenenhaus, M. (1998). La regression PLS: theorie et pratique. Paris: Editions Technic.
Wold H. (1966). Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. R. (editors), Multivariate Analysis. Academic Press, N.Y., 391-420.
pls, spls, plsda, splsda, mint.pls, mint.spls, mint.plsda, mint.splsda, block.pls, block.spls, block.plsda, block.splsda, mint.block.pls, mint.block.spls, mint.block.plsda, mint.block.splsda
library(mixOmics)

## object of class 'spls'
# --------------------------
data(liver.toxicity)
X = liver.toxicity$gene
Y = liver.toxicity$clinic

toxicity.spls = spls(X, Y, ncomp = 2, keepX = c(50, 50),
                     keepY = c(10, 10))

plotLoadings(toxicity.spls)

# with xlim
xlim = matrix(c(-0.1, 0.3, -0.4, 0.6), nrow = 2, byrow = TRUE)
plotLoadings(toxicity.spls, xlim = xlim)

## Not run: 
## object of class 'splsda'
# --------------------------
data(liver.toxicity)
X = as.matrix(liver.toxicity$gene)
Y = as.factor(liver.toxicity$treatment[, 4])

splsda.liver = splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# contribution on comp 1, based on the median.
# Colors indicate the group in which the median expression is maximal
plotLoadings(splsda.liver, comp = 1, method = 'median')
plotLoadings(splsda.liver, comp = 1, method = 'median', contrib = "max")

# contribution on comp 2, based on median.
# Colors indicate the group in which the median expression is maximal
plotLoadings(splsda.liver, comp = 2, method = 'median', contrib = "max")

# contribution on comp 2, based on median.
# Colors indicate the group in which the median expression is minimal
plotLoadings(splsda.liver, comp = 2, method = 'median', contrib = 'min')

# changing the name to gene names
# if name.var is supplied but names(name.var) is NULL, a warning is
# issued and colnames(X) are assigned as the names of name.var.
# This makes sure the names of the selected variables can be matched
# to the contribution plot.
name.var = liver.toxicity$gene.ID[, 'geneBank']
length(name.var)
plotLoadings(splsda.liver, comp = 2, method = 'median', name.var = name.var,
             title = "Liver data", contrib = "max")

# if names are provided: ok, even when NAs
name.var = liver.toxicity$gene.ID[, 'geneBank']
names(name.var) = rownames(liver.toxicity$gene.ID)
plotLoadings(splsda.liver, comp = 2, method = 'median',
             name.var = name.var, size.name = 0.5, contrib = "max")

# missing names of some genes? complete with the original names
plotLoadings(splsda.liver, comp = 2, method = 'median',
             name.var = name.var, size.name = 0.5,
             name.var.complete = TRUE, contrib = "max")

# look at the contribution (median) for each variable
plot.contrib = plotLoadings(splsda.liver, comp = 2, method = 'median',
                            plot = FALSE, contrib = "max")
head(plot.contrib$contrib)

# change the title of the legend and title name
plotLoadings(splsda.liver, comp = 2, method = 'median', legend.title = 'Time',
             title = 'Contribution plot', contrib = "max")

# no legend
plotLoadings(splsda.liver, comp = 2, method = 'median', legend = FALSE,
             contrib = "max")

# change the color of the legend
plotLoadings(splsda.liver, comp = 2, method = 'median',
             legend.color = c(1:4), contrib = "max")


# object 'splsda multilevel'
# -----------------
data(vac18)
X = vac18$genes
Y = vac18$stimulation
# sample indicates the repeated measurements
sample = vac18$sample
stimul = vac18$stimulation

# multilevel sPLS-DA model
res.1level = splsda(X, Y = stimul, ncomp = 3, multilevel = sample,
                    keepX = c(30, 137, 123))

name.var = vac18$tab.prob.gene[, 'Gene']
names(name.var) = colnames(X)

plotLoadings(res.1level, comp = 2, method = 'median', legend.title = 'Stimu',
             name.var = name.var, size.name = 0.2, contrib = "max")

# too many transcripts? only output the top ones
plotLoadings(res.1level, comp = 2, method = 'median', legend.title = 'Stimu',
             name.var = name.var, size.name = 0.5, ndisplay = 60,
             contrib = "max")


# object 'plsda'
# ----------------

# breast tumors
# ---
data(breast.tumors)
X = breast.tumors$gene.exp
Y = breast.tumors$sample$treatment

plsda.breast = plsda(X, Y, ncomp = 2)

name.var = as.character(breast.tumors$genes$name)
names(name.var) = colnames(X)

# with gene IDs, showing the top 60
plotLoadings(plsda.breast, contrib = 'max', comp = 1, method = 'median',
             ndisplay = 60, name.var = name.var, size.name = 0.6,
             legend.color = color.mixo(1:2))

# liver toxicity
# ---
data(liver.toxicity)
X = liver.toxicity$gene
Y = liver.toxicity$treatment[, 4]

plsda.liver = plsda(X, Y, ncomp = 2)
plotIndiv(plsda.liver, ind.names = Y, ellipse = TRUE)

name.var = liver.toxicity$gene.ID[, 'geneBank']
names(name.var) = rownames(liver.toxicity$gene.ID)

plotLoadings(plsda.liver, contrib = 'max', comp = 1, method = 'median',
             ndisplay = 100, name.var = name.var, size.name = 0.4,
             legend.color = color.mixo(1:4))


# object 'sgccda'
# ----------------
data(nutrimouse)
Y = nutrimouse$diet
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
design = matrix(c(0, 1, 1, 1, 0, 1, 1, 1, 0), ncol = 3, nrow = 3,
                byrow = TRUE)

nutrimouse.sgccda = wrapper.sgccda(X = data, Y = Y, design = design,
                                   keepX = list(gene = c(10, 10),
                                                lipid = c(15, 15)),
                                   ncomp = 2, scheme = "centroid")

plotLoadings(nutrimouse.sgccda, block = 2)
plotLoadings(nutrimouse.sgccda, block = "gene")


# object 'mint.splsda'
# ----------------
data(stemcells)
data = stemcells$gene
type.id = stemcells$celltype
exp = stemcells$study

res = mint.splsda(X = data, Y = type.id, ncomp = 3, keepX = c(10, 5, 15),
                  study = exp)

plotLoadings(res)
plotLoadings(res, contrib = "max")
plotLoadings(res, contrib = "min", study = 1:4, comp = 2)

# combining different plots by setting a layout of 2 rows and 4 columns.
# Note that the legend accounts for a subplot so 4 columns instead of 2.
plotLoadings(res, contrib = "min", study = c(1, 2, 3), comp = 2,
             layout = c(2, 4))
plotLoadings(res, contrib = "min", study = "global", comp = 2)

## End(Not run)