awsdens {aws} | R Documentation |
This function uses adaptive weights smoothing for density estimation. The estimate is constructed using poisson regression and the structural assumption of a local constant density. Implementation is for 1D, 2D and 3D problems, although it applications may be interesting only in 2D and 3D.
awsdens(y, ngrid = NULL, nempty = NULL, qlambda = NULL, eta = 0.5, lkern = "Triangle", fu = NULL, hinit = 1, hincr = 1.2, hmax = NULL, graph = FALSE, demo = FALSE, symmetric = TRUE)
y |
y contains the observed random variables or vectors.
In case of random vectors y is supposed to be an
observation matrix with columns containing the observed vectors |
ngrid |
ngrid determines the size of the grid used for binning.
If ngrid is a vector its components are supposed to determine
the number of intervals for the corresponding component of y ;
if it is a scalar rep(ngrid,dim(y)[1]) is used.
If is.null(ngrid) a default of 2*n^{1/d} , with
dim(y)=c(d,n) , is chosen, i.e. a grid with 2^d*n bins. |
nempty |
determines the width of a boundary region of the grid containing only empty bins.
nempty defaults to 0.1*ngrid . The grid generated is equispaced with
prod(ngrid) bins and grid-coordinates (nempty+1)[i] and (ngrid+1-nempty)[i]
determined by min(y[i,]) and max(y[i,]) . |
qlambda |
qlambda determines the scale parameter qlambda
for the stochastic penalty. The scaling parameter in the stochastic
penalty lambda is choosen as the qlambda -quantile
of a Chi-square-distribution with number of parameters in the polynomial
model as degrees of freedom. If qlambda=NULL a standard value
depending on model and symmetric is choosen. |
eta |
eta is a memory parameter used to stabilize the procedure.
eta has to be between 0 and 1 , with
eta=.5 being the default. |
lkern |
lkern determines the location kernel to be used. Options
are "Uniform" , "Triangle" , "Quadratic" ,
"Cubic" and "Exponential" . Default is "Triangle" .
The Kernel operates on the squared distance, so "Triangle"
corresponds to the use of an Epanechnikov kernel in kernel smoothing.
"Exponential" requires larger values of hmax and
therefore more iterations to reach comparable results. |
fu |
fu can be used to specify a function to calculate the
values of the true density on the grid. This may be used
for test purposes to calculate Mean Squared Error (MSE) and
Mean Absolute Error (MAE) |
hinit |
hinit Initial bandwidth for the location penalty.
Appropriate value is choosen in case of hinit==NULL |
hincr |
hincr hincr^(1/d) , with d the
dimensionality of the design, is used as a factor to increase the
bandwidth between iterations. Defauts to hincr=1.2 |
hmax |
hmax Maximal bandwidth to be used. Determines the
number of iterations and is used as the stopping rule. |
graph |
graph if TRUE results are displayed after each
iteration step. |
demo |
demo if TRUE after each iteration step results
are displayed and the process waits for user interaction. |
symmetric |
If symmetric==TRUE the stochastic penalty is
symmetrized, i.e. (sij + sji)/lambda is used instead of
sij/lambda . See references for details. |
The density estimate ist constructed using an approach proposed by
Lindsay (1974a,b). 1, 2 or 3-dimensional binning is used to produce
counts, i.e. numbers of observations, per bin, with bins defined by
a regular grid. THe bin-counts are then considered as poisson random
variables with intensity depending on the location within the grid.
Adaptive Weights Smoothing for poisson regression models, i.e. function
laws
with parameter model="Poisson"
, is used
to construct a location dependent intensity estimate which is then
transformed into a density estimate, see Section 6 in Polzehl and Spokoiny
(2002b) for details.
The returned object is a list with components
bin |
Bin counts |
dens |
density values for the bin. Values correspond to the grid-centers
determined by component xgrid |
xgrid |
list with components containing the coordinates of bin-centers.
The dim(y)[1] components of xgrid correspond to the
components of the grid-coordinates. |
call |
actual function call |
Joerg Polzehl, polzehl@wias-berlin.de
### ### univariate density estimation ### f0 <- function(x) 1.5*(x>=0)-(x>=.25)+(x>=.75)-1.5*(x>1) set.seed(1) m <- 1000 x1 <- runif(m) ind <- (runif(m)<f0(x1)/1.5) n <- 200 y <- x1[ind][1:n] tmp0 <- awsdens(y,440,20,hmax=2000) plot(tmp0$xgrid[[1]],tmp0$dens,type="l",lwd=2, ylim =range(0,1.5,tmp0$dens),xlab="x",ylab="Density") lines(tmp0$xgrid[[1]],f0(tmp0$xgrid[[1]]),lty=3,col=2,lwd=2) title("Estimated and true density (n=200)") ### ### bivariate density estimation ### f1 <- function(x,y) 7.5*pmax(x*(1-x^2-y^2),0)*(x>0)*(y>0) set.seed(1) m <- 10000 x1 <- runif(m) x2 <- runif(m) fx12 <- f1(x1,x2) ind <- (runif(m)<f1(x1,x2)/.385/7.5) n <- 2500 y <- rbind(x1[ind],x2[ind])[,1:n] tmp <- awsdens(y,120,10,hmax=5) image(tmp$xgrid[[1]],tmp$xgrid[[2]],tmp$dens,xlab="x1",ylab="x2",col=gray(.5+(255:0)/511),lwd=2) # thats the estimated density on the grid lines(seq(0,1,.01),sqrt(1-seq(0,1,.01)^2),col=2,lty=2,lwd=2) lines(c(0,1),c(0,0),col=2,lty=2,lwd=2) lines(c(0,0),c(0,1),col=2,lty=2,lwd=2) # thats the boundary of the support title("Estimated density (n=2500)") # now add contour lines of the estimated density contour(tmp$xgrid[[1]],tmp$xgrid[[2]],tmp$dens,nlevels=20,add=TRUE,col=1,lty=1,labcex=.1) rm(f0,m,x1,ind,n,y,tmp0,f1,x2,fx12,tmp)