read.SnpSetIllumina {beadarraySNP} | R Documentation |
A SnpSetIllumina object is created from the text files produced by the Illumina GenCall or BeadStudio software.
read.SnpSetIllumina(samplesheet, manifestpath = NULL, reportpath = NULL, rawdatapath = NULL, reportfile = NULL, briefOPAinfo=TRUE, verbose=FALSE)
samplesheet |
a data.frame or filename, contains the sample sheet |
manifestpath |
a character string for the path containing the manifests / OPA definition files, defaults to path of samplesheet |
reportpath |
a character string for the path containing the report files, defaults to path of samplesheet |
rawdatapath |
a character string for the path containing the intensity data files, defaults to path of samplesheet |
reportfile |
a character string for the name of BeadStudio reportfile |
briefOPAinfo |
logical, if TRUE then only the SNP name, Illumina code,
chromosome and basepair position are put into the featureData slot of
the result, else all information from the OPA file is put into the featureData
slot |
verbose |
logical, if TRUE then some extra information is given
during the import |
The text files from Illumina software are imported to a SnpSetIllumina object.
Result files from both GenCall and BeadStudio can be used.
In both cases the sample sheets from the experiments are used to select the
proper data from the report or data files. The following columns from the
sample sheet file are used for this purpose: 'Sample_Name', 'Sentrix_Position',
and 'Pool_ID'. The values in columns 'Sample_Plate', 'Pool_ID', and 'Sentrix_ID'
should be the same for all samples in the file, as is the case for processed
experiments. The contents of the sample sheet are put into the phenoData slot.
The OPA definition file containing SNP annotation should also be available;
these files are provided by Illumina. Columns 'IllCode', 'CHR', and 'MapInfo'
are put into the featureData slot.
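The sample sheet requirements above can be checked before import. The following is a minimal sketch in base R; `check_samplesheet` is a hypothetical helper and not part of beadarraySNP, and the sample values in the toy data.frame are made up for illustration.

```r
# Hypothetical helper (not part of beadarraySNP): verifies that a sample sheet
# data.frame has the columns described above and that the per-experiment
# columns hold a single value.
check_samplesheet <- function(sheet) {
  required <- c("Sample_Name", "Sentrix_Position", "Pool_ID")
  constant <- c("Sample_Plate", "Pool_ID", "Sentrix_ID")
  missing_cols <- setdiff(union(required, constant), names(sheet))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # A column "varies" when it holds more than one distinct value
  varies <- vapply(sheet[constant],
                   function(x) length(unique(x)) > 1,
                   logical(1))
  if (any(varies)) {
    stop("columns with more than one value: ",
         paste(constant[varies], collapse = ", "))
  }
  invisible(TRUE)
}

# Toy sample sheet; all values are invented for this example
sheet <- data.frame(
  Sample_Name      = c("s1", "s2"),
  Sample_Plate     = "plate1",
  Pool_ID          = "OPA1",
  Sentrix_ID       = "1234567",
  Sentrix_Position = c("R001_C001", "R001_C002"),
  stringsAsFactors = FALSE
)
check_samplesheet(sheet)
```

The function stops with an error naming the offending columns, so it can be run on a sheet before it is passed as the samplesheet argument.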
In order to process experiments that were genotyped using the GenCall software,
the arrays should be scanned with the setting <SaveTextFiles>true</SaveTextFiles>
in the Illumina configuration file Settings.XML.
Three types of files need to be present in the same folder: the sample sheet,
.csv files containing signal intensity data, and the report file that contains
the genotype information. For each sample in the sample sheet there should be
a .csv file with the following file mask: [sam_id]_R00[yy]_C00[xx].csv,
where sam_id is the Illumina ID for the SAM, and xx and yy are the column and
row number respectively. From the report files, the file with mask
[Pool_ID]_LocusByDNA[_ExpName].csv is used. 'Pool_ID' is the OPA panel used,
and '_ExpName' is optional.
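To see which files in a folder fit the intensity file mask, the mask can be written as a regular expression. This is only a sketch: `intensity_mask` and `parse_intensity_name` are hypothetical names, the assumption that yy and xx are runs of digits is not stated in this documentation, and the file names are invented.

```r
# Regular expression mirroring the mask [sam_id]_R00[yy]_C00[xx].csv.
# Assumption: yy and xx are runs of digits.
intensity_mask <- "^(.+)_R00([0-9]+)_C00([0-9]+)\\.csv$"

# Hypothetical helper: returns one row per file name that fits the mask.
# Assumes at least one name matches.
parse_intensity_name <- function(filenames) {
  hits <- regmatches(filenames, regexec(intensity_mask, filenames))
  ok <- lengths(hits) == 4          # full match plus three capture groups
  parts <- do.call(rbind, hits[ok])
  data.frame(
    file   = filenames[ok],
    sam_id = parts[, 2],
    row    = as.integer(parts[, 3]),  # yy
    column = as.integer(parts[, 4]),  # xx
    stringsAsFactors = FALSE
  )
}

# Invented file names: one intensity file, one report file, one sample sheet
files <- c("1234567_R001_C002.csv", "OPA1_LocusByDNA.csv", "samplesheet.csv")
parse_intensity_name(files)
```

Only the first name fits the mask; the report file and the sample sheet are skipped.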
To process experiments that were processed with BeadStudio, only two files are
needed: the sample sheet and the Final Report file. The sample sheet must
contain the same columns as for GenCall. The report file should contain the
following columns: 'SNP Name', 'Sample ID', 'GC Score', 'Allele1 - AB',
'Allele2 - AB', 'GT Score', 'X Raw', and 'Y Raw'. 'SNP Name' and 'Sample ID'
are used to form rows and columns in the experimental data, 'GC Score' is put
in the callProbability matrix, 'Allele1 - AB' and 'Allele2 - AB' are combined
into the call matrix, 'GT Score' is added to the featureData slot, 'X Raw' is
put in the R matrix and 'Y Raw' in the G matrix. Other columns in the report
file are added as matrices in the assayData slot.
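The mapping from report columns to matrices can be illustrated with a toy long-format table. This is a sketch of the reshaping idea in base R, not the package's own implementation; `to_matrix` is a hypothetical helper, all values are made up, and pasting the two allele columns together is only one possible way of combining them.

```r
# Toy long-format report: one row per SNP/sample pair, invented values
report <- data.frame(
  "SNP Name"     = c("rs1", "rs2", "rs1", "rs2"),
  "Sample ID"    = c("S1", "S1", "S2", "S2"),
  "GC Score"     = c(0.91, 0.85, 0.78, 0.88),
  "Allele1 - AB" = c("A", "A", "B", "A"),
  "Allele2 - AB" = c("A", "B", "B", "B"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Hypothetical helper: spreads a value column into a SNP x sample matrix
to_matrix <- function(values, snps, samples) {
  rows <- unique(snps)
  cols <- unique(samples)
  m <- matrix(NA, nrow = length(rows), ncol = length(cols),
              dimnames = list(rows, cols))
  # Two-column index matrix: (row position, column position) for each value
  m[cbind(match(snps, rows), match(samples, cols))] <- values
  m
}

snps    <- report[["SNP Name"]]
samples <- report[["Sample ID"]]

# 'GC Score' becomes a SNP x sample matrix, as for callProbability
callProbability <- to_matrix(report[["GC Score"]], snps, samples)

# The two allele columns are combined into one genotype per cell
# (assumption: simple concatenation, e.g. "A" and "B" give "AB")
call <- to_matrix(paste0(report[["Allele1 - AB"]], report[["Allele2 - AB"]]),
                  snps, samples)
```

Rows are named after 'SNP Name' and columns after 'Sample ID', so `callProbability["rs1", "S2"]` retrieves the score for SNP rs1 in sample S2.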
This function returns a SnpSetIllumina object.
Jan Oosting
# read a SnpSetIllumina object using example text files in the data directory
datadir <- system.file("testdata", package="beadarraySNP")
SNPdata <- read.SnpSetIllumina(paste(datadir,"4samples_opa4.csv",sep="/"),datadir)