daglad {GLAD} | R Documentation |
This function detects breakpoints in genomic profiles obtained by array CGH technology and assigns a status (gain, normal or loss) to each clone.
daglad.profileCGH(profileCGH, mediancenter=FALSE, normalrefcenter=FALSE, genomestep=FALSE, smoothfunc="lawsglad", lkern="Exponential", model="Gaussian", qlambda=0.999, bandwidth=10, sigma=NULL, base=FALSE, round=1.5, lambdabreak=8, lambdaclusterGen=40, param=c(d=6), alpha=0.001, msize=5, method="centroid", nmin=1, nmax=8, amplicon=1, deletion=-5, deltaN=0.10, forceGL=c(-0.15,0.15), nbsigma=3, MinBkpWeight=0.35, CheckBkpPos=TRUE, verbose=FALSE, ...)
profileCGH |
Object of class profileCGH |
mediancenter |
If TRUE, LogRatio values are centered on their median. |
genomestep |
If TRUE, a smoothing step over the whole
genome is performed and a "clustering throughout the genome" step identifies
the cluster corresponding to the Normal DNA level. The thresholds used in the daglad
function (deltaN, forceGL, amplicon, deletion) are then
compared to the median of this cluster. |
normalrefcenter |
If TRUE, the LogRatio values are centered
on the median of the cluster identified during the genomestep. |
smoothfunc |
Type of algorithm used to smooth LogRatio by a
piecewise constant function. Choose either aws or
laws; note that the default given in the usage above is "lawsglad". |
lkern |
lkern determines the location kernel to be used
(see laws for details). |
model |
model determines the distribution type of LogRatio
(see laws for details). |
qlambda |
qlambda determines the scale parameter for the
stochastic penalty (see laws for details). |
base |
If TRUE, the position of each clone is its physical position on the chromosome; otherwise the rank position is used. |
sigma |
Value to be passed to either the sigma2 argument
of the aws function or the shape argument of
laws. If NULL, sigma is calculated from
the data. |
bandwidth |
Sets the maximal bandwidth hmax in the
aws or laws function. For
example, if bandwidth=10 then the hmax value is set
to 10*X_N, where X_N is the position of the last clone. |
round |
The smoothing results of the aws
or laws function are rounded according to
the round argument, whose value is passed to the
digits argument of the round function. |
lambdabreak |
Penalty term (λ') used during the "Optimization of the number of breakpoints" step. |
lambdaclusterGen |
Penalty term (λ*) used during the "clustering throughout the genome" step. |
param |
Parameter of kernel used in the penalty term. |
alpha |
Risk alpha used for the "Outlier detection" step. |
msize |
The MAD used for outlier detection is calculated on regions with a cardinality greater than or equal to msize. |
method |
The agglomeration method to be used during the "clustering throughout the genome" step. |
nmin |
Minimum number of clusters allowed during the "clustering throughout the genome" step. |
nmax |
Maximum number of clusters (N*max) allowed during the "clustering throughout the genome" step. |
amplicon |
Levels (and outliers) with a smoothing value (log-ratio value) greater than this threshold are considered amplicons. Note that the data are first centered on the normal reference value computed during the "clustering throughout the genome" step. |
deletion |
Levels (and outliers) with a smoothing value (log-ratio value) lower than this threshold are considered deletions. Note that the data are first centered on the normal reference value computed during the "clustering throughout the genome" step. |
deltaN |
Regions with smoothing values within the interval [-deltaN,+deltaN] are assumed to be normal. |
forceGL |
Levels with a smoothing value greater than
forceGL[2] (lower than forceGL[1]) are considered as gained
(lost). Note that the data are first centered on the normal reference value
computed during the "clustering throughout the genome" step. |
nbsigma |
For each breakpoint, a weight is calculated as a function of the absolute value of the Gap between the smoothing values of the two consecutive regions: Weight = 1 - kernelpen(abs(Gap), param=c(d=nbsigma*Sigma)), where Sigma is the standard deviation of the LogRatio. |
MinBkpWeight |
Breakpoints for which GNLchange==0 and
Weight is less than MinBkpWeight are discarded. |
CheckBkpPos |
If TRUE, the accuracy of the position of each
breakpoint is checked. |
verbose |
If TRUE, some information is printed. |
... |
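The weighting described for nbsigma and MinBkpWeight can be illustrated with a small sketch. It assumes a tricubic penalty kernel as a stand-in for kernelpen: the helper tricubic_pen is hypothetical, and the values of Gap, Sigma and GNLchange are made up for illustration. It shows the shape of the computation only, not GLAD's internal code.

```r
# Hypothetical stand-in for GLAD's kernelpen(): a tricubic kernel that
# equals 1 at x = 0 and drops to 0 once x reaches the width d.
# (Assumption: the real kernelpen() may differ in detail.)
tricubic_pen <- function(x, d) {
  ifelse(x < d, (1 - (x / d)^3)^3, 0)
}

# Illustrative values only (not taken from a real profile).
nbsigma      <- 3
MinBkpWeight <- 0.35
Sigma        <- 0.1                  # standard deviation of the LogRatio
Gap          <- c(0.05, 0.20, 0.45)  # gap between consecutive smoothed levels
GNLchange    <- c(0, 0, 1)           # 0: same status on both sides of the breakpoint

# Weight = 1 - kernelpen(abs(Gap), param = c(d = nbsigma * Sigma))
Weight <- 1 - tricubic_pen(abs(Gap), d = nbsigma * Sigma)

# Breakpoints with no status change and a weight below MinBkpWeight are discarded.
keep <- !(GNLchange == 0 & Weight < MinBkpWeight)
print(data.frame(Gap, Weight, GNLchange, keep))
```

Under these assumptions, a small gap relative to nbsigma*Sigma gives a weight near 0, and a gap of at least nbsigma*Sigma gives a weight of 1, so only small jumps between regions of identical status are removed.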
The function daglad
implements a slightly modified
version of the methodology described in the article: Analysis of array CGH data: from signal
ratio to gain and loss of DNA regions (Hupé et al., Bioinformatics 2004 20(18):3413-3422).
The daglad
function lets the user choose thresholds that help the algorithm identify the status of the genomic regions. The thresholds are given by the following parameters: deltaN, forceGL, amplicon and deletion.
|
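As a rough illustration of how these thresholds partition the smoothing values, the sketch below applies them in sequence. The function classify_level and its status labels are hypothetical and reflect only a simplified reading of the parameter descriptions, not daglad's actual decision rule. Levels that fall between deltaN and the forceGL bounds are labeled "undetermined" here, because the thresholds alone do not settle their status.

```r
# Simplified, hypothetical reading of the thresholds (not GLAD's internal rule).
classify_level <- function(smoothing, NormalRef = 0,
                           amplicon = 1, deletion = -5,
                           deltaN = 0.10, forceGL = c(-0.15, 0.15)) {
  x <- smoothing - NormalRef            # center on the normal reference value
  status <- rep("undetermined", length(x))
  status[abs(x) <= deltaN] <- "normal"  # inside [-deltaN, +deltaN]
  status[x > forceGL[2]]   <- "gain"    # above the upper forceGL bound
  status[x < forceGL[1]]   <- "loss"    # below the lower forceGL bound
  status[x > amplicon]     <- "amplicon"
  status[x < deletion]     <- "deletion"
  status
}

# Made-up smoothing values, one per region.
classify_level(c(0.02, 0.12, 0.4, -0.3, 1.6, -6))
```

With the default thresholds, the interval between 0.10 and 0.15 (and its negative counterpart) is covered by neither deltaN nor forceGL.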
An object of class "profileCGH" with the following attributes: |
profileValues |
a data.frame with the following added information:
|
BkpInfo |
a data.frame summarizing the information for each
breakpoint:
|
NormalRef |
If genomestep=TRUE and
normalrefcenter=FALSE, then NormalRef is the median of the cluster which has been used to set the normal
reference during the "clustering throughout the genome"
step. Otherwise, NormalRef is 0. |
People interested in tools dealing with array CGH analysis can visit our web-page http://bioinfo.curie.fr.
Philippe Hupé, glad@curie.fr.
glad
data(snijders)
gm13330$Clone <- gm13330$BAC
profileCGH <- as.profileCGH(gm13330)

###########################################################
###
###  daglad function
###
###########################################################

res <- daglad(profileCGH, mediancenter=FALSE, normalrefcenter=FALSE, genomestep=FALSE,
              smoothfunc="lawsglad", lkern="Exponential", model="Gaussian",
              qlambda=0.999, bandwidth=10, base=FALSE, round=1.5,
              lambdabreak=8, lambdaclusterGen=40, param=c(d=6), alpha=0.001,
              msize=5, method="centroid", nmin=1, nmax=8,
              amplicon=1, deletion=-5, deltaN=0.10, forceGL=c(-0.15,0.15),
              nbsigma=3, MinBkpWeight=0.35, CheckBkpPos=TRUE)

### Genomic profile on the whole genome
plotProfile(res, unit=3, Bkp=TRUE, labels=FALSE, Smoothing="Smoothing",
            main="Breakpoints detection: DAGLAD analysis")

### Genomic profile for chromosome 1
plotProfile(res, unit=3, Bkp=TRUE, labels=TRUE, Chromosome=1,
            Smoothing="Smoothing", main="Chromosome 1: DAGLAD analysis")

### The standard-deviation of LogRatio are:
res$SigmaC

### The list of breakpoints is:
res$BkpInfo