nsFilter {genefilter} | R Documentation |
This function removes unwanted probe sets from an ExpressionSet
without using phenotype variables in the filtering process. Hence
the filter is non-specific with respect to the phenotypes in the data.
nsFilter(eset, require.entrez = TRUE, require.symbol = TRUE, require.GOBP = FALSE, require.GOCC = FALSE, require.GOMF = FALSE, remove.dupEntrez = TRUE, var.func = IQR, var.cutoff = 0.5, var.filter = TRUE)
eset |
an ExpressionSet object |
require.entrez |
If TRUE, require that all probe sets
have an Entrez Gene ID annotation. Probe sets without such an
annotation will be filtered out. |
require.symbol |
If TRUE, require that all probe sets
have a gene symbol annotation. Probe sets without such an
annotation will be filtered out. |
require.GOBP |
If TRUE, require that all probe sets have
an annotation to at least one GO ID in the BP ontology. Probe
sets without such an annotation will be filtered out. |
require.GOCC |
If TRUE, require that all probe sets have
an annotation to at least one GO ID in the CC ontology. Probe
sets without such an annotation will be filtered out. |
require.GOMF |
If TRUE, require that all probe sets have
an annotation to at least one GO ID in the MF ontology. Probe
sets without such an annotation will be filtered out. |
remove.dupEntrez |
If TRUE and there are multiple probe
sets mapping to the same Entrez Gene ID, then the probe set with
the largest value of var.func will be retained and the
others removed. |
var.func |
a function that will be used to assess the
variance of a probe set across all samples. This function
should return a numeric vector of length one when given a
numeric vector as input. Probe sets with a var.func
value less than var.cutoff will be removed. The default
is IQR. |
var.cutoff |
a numeric value to use in filtering out probe sets
with small variance across samples. See the var.func
argument and the details section below. |
var.filter |
a logical indicating whether or not to perform
variance based filtering. The default is TRUE. |
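The remove.dupEntrez rule described above can be illustrated with a small base R sketch. This is not the code used inside nsFilter; the probe set names, Entrez Gene IDs, and spread values are invented, and the tie-breaking behaviour (first probe set wins) is an assumption of the sketch only.

```r
# Hypothetical per-probe-set values of var.func (e.g. IQR) and the
# Entrez Gene ID each probe set maps to. All values are made up.
spread <- c(ps1 = 0.8, ps2 = 1.9, ps3 = 0.6, ps4 = 1.2, ps5 = 0.7)
entrez <- c(ps1 = "1001", ps2 = "1001", ps3 = "2002",
            ps4 = "3003", ps5 = "3003")

# Group probe set names by Entrez Gene ID.
groups <- split(names(spread), entrez[names(spread)])

# Within each group keep the probe set with the largest spread.
# which.max() returns the first maximum, so ties go to the first probe set.
best <- vapply(groups,
               function(ids) ids[which.max(spread[ids])],
               character(1))

# Probe sets that survive duplicate removal, in their original order.
kept <- names(spread)[names(spread) %in% best]
print(kept)
```

Here ps1 and ps5 are dropped because another probe set for the same Entrez Gene ID has a larger spread.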
A first step in many microarray analysis procedures is to carry out non-specific filtering. The goal is to remove uninteresting probe sets without regard to the phenotype data and reduce the number of probe sets that will be included in further analysis.
Annotation Based Filtering
Arguments require.entrez, require.symbol, require.GOBP, require.GOCC,
and require.GOMF turn on a filter based on available annotation data.
The annotation package is determined by calling annotation(eset).
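As a rough illustration of the annotation filters, the sketch below combines two "require" flags over a toy lookup table in base R. The data frame, its column names, and the gene symbols are hypothetical; the real function looks the annotations up in the package named by annotation(eset), which this sketch does not attempt.

```r
# Hypothetical annotation table: one row per probe set, NA = no annotation.
annot <- data.frame(
  entrez = c("1001", NA, "2002", "3003"),
  symbol = c("GENEA", "GENEB", NA, "GENEC"),
  row.names = c("ps1", "ps2", "ps3", "ps4"),
  stringsAsFactors = FALSE
)

require.entrez <- TRUE
require.symbol <- TRUE

# Start by keeping everything, then apply each requested requirement.
keep <- rep(TRUE, nrow(annot))
if (require.entrez) keep <- keep & !is.na(annot$entrez)
if (require.symbol) keep <- keep & !is.na(annot$symbol)

kept <- rownames(annot)[keep]
print(kept)
```

Only probe sets that satisfy every requested annotation requirement remain; setting a flag to FALSE skips that requirement.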
Variance Based Filtering
The var.func and var.cutoff arguments control the variance based
filtering. The intention is to remove probe sets with little
variation across samples. The default var.func is IQR, which was
selected because it is robust to outliers. The default var.cutoff is
0.5 and is motivated by the common case where the platform is a
genome-wide expression array and the rule of thumb that in any given
tissue only 40% of genes are expressed.
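The variance filter can be sketched on a plain matrix in base R. This is an illustration under stated assumptions, not the nsFilter implementation: the matrix values are invented, and the cutoff is applied as an absolute threshold on the var.func value, following the description of the var.func argument above.

```r
# Toy expression matrix: 4 probe sets (rows) by 6 samples (columns).
# Names and values are invented for illustration.
x <- rbind(
  ps1 = c(7.0, 7.1, 7.0, 7.2, 7.1, 7.0),   # nearly flat
  ps2 = c(5.0, 6.5, 8.0, 5.5, 7.5, 9.0),   # variable
  ps3 = c(3.0, 3.0, 3.1, 3.0, 12.0, 3.0),  # flat apart from one outlier
  ps4 = c(4.0, 5.0, 6.0, 7.0, 8.0, 9.0)    # variable
)

var.func   <- IQR
var.cutoff <- 0.5

# One spread value per probe set (row).
spread <- apply(x, 1, var.func)

# Remove probe sets whose spread is less than the cutoff.
keep     <- spread >= var.cutoff
filtered <- x[keep, , drop = FALSE]

print(rownames(filtered))
```

Note that ps3 is removed even though one sample is far from the rest: a single outlier barely moves the IQR, whereas sd(x["ps3", ]) would exceed the cutoff. This is the robustness that motivates IQR as the default.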
A list consisting of:
eset |
the filtered ExpressionSet |
filter.log |
a list giving details of how many probe sets were removed for each filtering step performed. |
Seth Falcon
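library("genefilter")          # provides nsFilter
library("Biobase")             # provides sample.ExpressionSet
library("hgu95av2")            # annotation package for the example data
data(sample.ExpressionSet)
ans <- nsFilter(sample.ExpressionSet)
ans$eset                       # the filtered ExpressionSet
ans$filter.log                 # number of probe sets removed at each step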