pdmClass {pdmclass} | R Documentation |
This function is used to classify microarray data. Since the underlying model fit is based on penalized discriminant methods, there is no need for a pre-filtering step to reduce the number of genes.
pdmClass(formula = formula(data), method = c("pls", "pcr", "ridge"), data = sys.frame(sys.parent()), weights, theta, dimension = J - 1, eps = .Machine$double.eps, ...)
formula |
A symbolic description of the model to be fit. Details given below. |
method |
One of "pls", "pcr", "ridge", corresponding to partial least squares, principal components regression and ridge regression. |
data |
An optional data.frame that contains the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which pdmClass is called. Note that unlike most microarray analyses, in this case rows are samples and columns are genes. |
weights |
An optional vector of sample weights. Defaults to 1. |
theta |
An optional matrix of class scores, typically with fewer than J - 1 columns. |
dimension |
The dimension of the solution. This will be no greater than J - 1 for partial least squares and ridge regression, and no greater than J for principal components regression. Defaults to J - 1 and J, respectively. |
eps |
A threshold for excluding small discriminant variables. Defaults to .Machine$double.eps. |
... |
Additional parameters to pass to method. |
The formula interface is identical to all other formula calls in R, namely Y ~ X, where Y is a numeric vector of class assignments and X is a matrix or data.frame containing the gene expression values. Note that unlike most microarray analyses, in this instance the columns of X are genes and rows are samples, so most calls will require something similar to Y ~ t(X).
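The orientation requirement above can be sketched in base R with simulated data. This is only an illustration: the matrix expr, the labels in y and the sizes are made up, and the pdmClass call is left commented out so the snippet runs without the package installed.

```r
# Simulated expression matrix in the usual microarray layout:
# rows are genes, columns are samples (all values are made up).
set.seed(1)
n_genes <- 50
n_samples <- 12
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("sample", seq_len(n_samples))))

# One class label per sample; J = 3 classes here.
y <- factor(rep(c("A", "B", "C"), each = 4))

# Transpose so that rows are samples and columns are genes.
x <- t(expr)
stopifnot(nrow(x) == length(y))

# Upper bound on the dimension for "pls" and "ridge" is J - 1.
J <- nlevels(y)
max_dim <- J - 1

# With the package installed, the fit itself would be:
# fit <- pdmclass::pdmClass(y ~ x, method = "pls")
```

The stopifnot check is a cheap guard against passing an untransposed matrix, which would otherwise pair class labels with genes instead of samples.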
An object of class "fda". Use predict to extract discriminant variables, posterior probabilities or predicted class memberships. Other extractor functions are coef and plot.
The object has the following components:
percent.explained |
the percent between-group variance explained by each dimension (relative to the total explained.) |
values |
optimal scaling regression sum-of-squares for each dimension (see reference). The usual discriminant analysis eigenvalues are given by values / (1 - values), which are used to define percent.explained. |
means |
class means in the discriminant space. These are also scaled versions of the final theta's or class scores, and can be used in a subsequent call to fda (this only makes sense if some columns of theta are omitted; see the references). |
theta.mod |
(internal) a class scoring matrix which allows predict to work properly. |
dimension |
dimension of discriminant space. |
prior |
class proportions for the training data. |
fit |
fit object returned by method. |
call |
the call that created this object (allowing it to be update-able) |
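The relationship between the values and percent.explained components can be worked through by hand. The sketch below uses hypothetical numbers rather than output from a real fit, and the names eigen_vals, pct_each and pct_cum are illustrative. Whether the stored percent.explained is per-dimension or cumulative is not stated here, so both forms are shown.

```r
# Hypothetical optimal-scaling sums of squares, one per dimension
# (illustrative numbers only, not output from a real fit).
values <- c(0.90, 0.60, 0.20)

# Usual discriminant analysis eigenvalues.
eigen_vals <- values / (1 - values)

# Share of the explained between-group variance per dimension ...
pct_each <- 100 * eigen_vals / sum(eigen_vals)
# ... and the running total across dimensions.
pct_cum <- cumsum(pct_each)
```

Because the eigenvalue grows without bound as a value approaches 1, a dimension with a sum-of-squares close to 1 dominates the percentages.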
James W. MacDonald and Debashis Ghosh, based on fda in the mda package of Trevor Hastie and Robert Tibshirani, which was ported to R by Kurt Hornik, Brian D. Ripley, and Friedrich Leisch.
http://www.sph.umich.edu/~ghoshd/COMPBIO/POPTSCORE
"Flexible Discriminant Analysis by Optimal Scoring" by Hastie, Tibshirani and Buja, 1994, JASA, 1255-1270.
"Penalized Discriminant Analysis" by Hastie, Buja and Tibshirani, Annals of Statistics, 1995 (in press).
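library(fibroEset)
data(fibroEset)
# Class labels: second column of the phenotype data
y <- as.factor(pData(fibroEset)[, 2])
# Transpose so that rows are samples and columns are genes
x <- t(exprs(fibroEset))
pdmClass(y ~ x)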