AEsetnorm quality metrics report


Summary

Array #  Array Name  MA plots  Boxplots  Heatmap
1        459F.CEL
2        459D.CEL
3        459B.CEL
4        515J.CEL
5        515G.CEL
6        459A.CEL
7        459C.CEL
8        515D.CEL
9        515I.CEL
10       515E.CEL
11       515K.CEL
12       515H.CEL
13       515C.CEL
14       515F.CEL
15       515B.CEL
16       515L.CEL
*outlier array

PLEASE NOTE:
All figures below are links to PDF files, which contain images for every array in the report. The PDF files may be several pages long; this HTML report presents only the first page.

Section 1: Individual array quality

maplot.png
Figure 1: MA plots

Figure 1 shows the MA plot for each array. M and A are defined as:
M = log2(I1) - log2(I2)
A = 1/2 (log2(I1) + log2(I2)),
where I1 is the intensity of the array studied and I2 is the intensity of a "pseudo"-array, which has the median values of all the arrays. Typically, we expect the mass of the distribution in an MA plot to be concentrated along the M = 0 axis, and there should be no trend in the mean of M as a function of A. A trend in the lower range of A usually indicates that the arrays have different background intensities; this may be addressed by background correction. A trend in the upper range of A usually indicates saturation of the measurements; in mild cases, this may be addressed by non-linear normalisation (e.g. quantile normalisation).
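
The definitions of M and A above can be illustrated with a minimal Python sketch. It is not the code used to produce this report: the function name ma_values and the toy numbers are hypothetical, and taking the median on the log2 scale (rather than on raw intensities) is an assumption.

```python
import numpy as np

def ma_values(intensities):
    """Return (M, A), each of shape (n_probes, n_arrays).

    intensities: positive raw intensities, shape (n_probes, n_arrays).
    """
    log_i = np.log2(np.asarray(intensities, dtype=float))
    # Pseudo-array: probe-wise median across all arrays.
    # Assumption: the median is taken on the log2 scale.
    pseudo = np.median(log_i, axis=1, keepdims=True)
    m = log_i - pseudo           # M = log2(I1) - log2(I2)
    a = 0.5 * (log_i + pseudo)   # A = 1/2 (log2(I1) + log2(I2))
    return m, a

# Toy data (made-up numbers): 4 probes measured on 3 arrays.
toy = np.array([[64.0, 128.0, 64.0],
                [256.0, 256.0, 512.0],
                [32.0, 32.0, 32.0],
                [1024.0, 2048.0, 1024.0]])
M, A = ma_values(toy)
```

For a well-behaved array, the column of M should scatter around zero with no trend when plotted against the matching column of A.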

Section 2: Array intensity distributions

boxplot.png
Figure 2: Boxplots

Figure 2 presents boxplots of the log2(Intensities). Each box corresponds to one array. It gives a simple summary of the distribution of probe intensities across all arrays. Typically, one expects the boxes to have similar size (IQR) and y position (median). If the distribution of an individual array is very different from the others, this may indicate an experimental problem. After normalisation, the distributions should be similar.
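
The two quantities compared across boxes, the median and the IQR, can be computed per array as in the following sketch. The function name box_summary and the toy values are hypothetical and serve only to show what a shifted array looks like numerically.

```python
import numpy as np

def box_summary(log_intensities):
    """Return (median, IQR) per array for a (n_probes, n_arrays) matrix."""
    x = np.asarray(log_intensities, dtype=float)
    # Quartiles are computed column-wise, i.e. one set per array.
    q1, med, q3 = np.percentile(x, [25, 50, 75], axis=0)
    return med, q3 - q1

# Toy log2 intensities (made-up): the third array is shifted up by 2 units.
base = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
toy_log = np.column_stack([base, base, base + 2.0])
medians, iqrs = box_summary(toy_log)
```

Here the third array has the same IQR as the others but a higher median, which is the kind of vertical offset a boxplot makes visible.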
density.png
Figure 3: Density plots

Figure 3 shows density estimates (smoothed histograms) of the data. Typically, the distributions of the arrays should have similar shapes and ranges. Arrays whose distributions are very different from the others should be considered for possible problems. On raw data, a bimodal distribution can be indicative of an array containing a spatial artefact, and a distribution shifted to the right can be indicative of an array with abnormally high background intensities.

Section 3: Between array comparison

heatmap.png
Figure 4: Heatmap representation of the distance between arrays

Figure 4 shows a false colour heatmap of between-array distances, computed as the mean absolute difference (L1-distance) between the vectors of M-values for each pair of arrays, using every probe without any filtering. The colour scale is chosen to cover the range of L1-distances encountered in the dataset. Arrays for which the sum of the distances to the other arrays differs markedly from that of the rest are detected as outlier arrays. The dendrogram on this plot can also be used to check whether the arrays cluster according to biological meaning.
dxy = mean|Mxi-Myi|
Here, Mxi is the M-value of the i-th probe on the x-th array, without preprocessing. Consider the following decomposition of Mxi: Mxi = zi + βxi + εxi, where zi is the probe effect for probe i (the same across all arrays), εxi are i.i.d. random variables with mean zero, and βxi represents differential expression effects, such that for any array x the majority of values βxi are negligibly small (i.e. close to zero). In this model, all values dxy are (in expectation) the same, namely a fixed multiple of the standard deviation of εxi (2/√π ≈ 1.13 times that standard deviation if the εxi are normally distributed).
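
The distance dxy and the sum-of-distances check can be sketched in Python as follows. This is an illustration, not the package's implementation: the function names are hypothetical, the data are simulated, and the 1.5 x IQR boxplot rule used as the outlier threshold is an assumption, since the text above does not state the cut-off.

```python
import numpy as np

def l1_distance_matrix(m):
    """d[x, y] = mean over probes i of |M[i, x] - M[i, y]|.

    m: M-values, shape (n_probes, n_arrays).
    """
    m = np.asarray(m, dtype=float)
    diff = m[:, :, None] - m[:, None, :]   # shape (n_probes, n_arrays, n_arrays)
    return np.abs(diff).mean(axis=0)

def flag_outliers(d):
    """Flag arrays whose summed distance to the others is unusual.

    Assumption: "much different" is taken to mean outside the
    1.5 x IQR boxplot whiskers of the summed distances.
    """
    s = d.sum(axis=1)
    q1, q3 = np.percentile(s, [25, 75])
    spread = 1.5 * (q3 - q1)
    return (s > q3 + spread) | (s < q1 - spread)

# Simulated M-values: 8 arrays, 2000 probes; array 3 carries extra noise.
rng = np.random.default_rng(0)
m_toy = rng.normal(size=(2000, 8))
m_toy[:, 3] += rng.normal(scale=3.0, size=2000)

d = l1_distance_matrix(m_toy)
flags = flag_outliers(d)
```

The matrix d is symmetric with a zero diagonal, and the noisy array stands out because its distance to every other array is inflated.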
pca.png
Figure 5: Principal Component Analysis

Figure 5 shows a biplot of the first two principal components of the dataset. The colours correspond to the group of interest given. We expect the arrays to cluster according to a relevant experimental factor. The principal components transformation of a data matrix re-expresses the features using linear combinations of the original variables. The first principal component is the linear combination chosen to possess maximal variance; the second is the linear combination orthogonal to the first that possesses maximal variance among all such orthogonal combinations.
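
The principal components transformation described above can be sketched with a singular value decomposition. The function name pca_scores and the simulated two-group dataset are hypothetical, and this is not necessarily how the report's figure was computed.

```python
import numpy as np

def pca_scores(x, n_components=2):
    """PCA of arrays via SVD.

    x: shape (n_arrays, n_probes). Returns (scores, components, explained),
    where explained holds the fraction of variance per component.
    """
    x = np.asarray(x, dtype=float)
    centred = x - x.mean(axis=0)             # centre each probe
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, vt[:n_components], explained[:n_components]

# Simulated data: 8 arrays, 50 probes; arrays 4..7 differ on the first 10 probes.
rng = np.random.default_rng(1)
x_toy = rng.normal(scale=0.2, size=(8, 50))
x_toy[4:, :10] += 3.0

scores, components, explained = pca_scores(x_toy)
```

In this toy case the first column of scores separates the two groups by sign, which is the clustering pattern one hopes to see for a relevant experimental factor.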

Section 4: Variance mean dependence

meansd.png
Figure 6: Standard deviation versus rank of the mean

For each feature, Figure 6 shows the standard deviation of the intensities across arrays on the y-axis versus the rank of their mean on the x-axis. The red dots, connected by lines, show the running median of the standard deviation. After normalisation and transformation to a logarithm(-like) scale, one typically expects the red line to be approximately horizontal, that is, to show no substantial trend. In some cases, a hump can be observed on the right-hand side of the x-axis, which is symptomatic of saturation of the intensities.
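
The quantities plotted in Figure 6 can be approximated with the sketch below. The function name mean_sd_trend, the window width of 101 features and the simulated data are assumptions made for illustration; the report does not state how its running median is windowed.

```python
import numpy as np

def mean_sd_trend(x, window=101):
    """Standard deviation per feature, ordered by rank of the mean,
    plus a running median of that standard deviation.

    x: log-scale intensities, shape (n_probes, n_arrays).
    window: number of neighbouring features per median (assumed value).
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x.mean(axis=1))       # features by rank of mean
    sd = x.std(axis=1, ddof=1)[order]
    half = window // 2
    running = np.array([
        np.median(sd[max(0, i - half): i + half + 1])
        for i in range(sd.size)
    ])
    return sd, running

# Simulated data: 1000 features on 6 arrays with constant noise (sd 0.3).
rng = np.random.default_rng(2)
means = rng.uniform(4.0, 14.0, size=(1000, 1))
x_sim = means + rng.normal(scale=0.3, size=(1000, 6))

sd, running = mean_sd_trend(x_sim)
```

With constant noise, as simulated here, the running median stays roughly flat across ranks; a rise towards the highest ranks would correspond to the hump mentioned above.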

This report has been created with arrayQualityMetrics 2.4.3 under R version 2.10.1 (2009-12-14)


(Page generated on Tue Feb 23 02:29:33 2010 by hwriter )