Advanced User Guide - SangerRead (AB1)
SangerRead is the lowest level in sangeranalyseR, as shown in Figure_1, and corresponds to a single read (one AB1 file) in Sanger sequencing. It extends the sangerseq S4 class from the sangerseqR package and stores the quality trimming and chromatogram input parameters together with their results. This section walks through the detailed sangeranalyseR data analysis steps at the SangerRead level with AB1 file input.

Figure 1. Hierarchy of classes in sangeranalyseR, SangerRead level.
Preparing SangerRead AB1 input
The main input file format for creating a SangerRead instance is AB1. Before starting the analysis, users need to prepare one target AB1 file. The only hard requirement on the filename is that it must have .ab1 as its file extension. The note below gives some suggestions about the filename:
Note
AB1 files should be indexed, for consistency with the file naming regulation for SangerContig and SangerAlignment.
The forward or reverse direction should be specified in the filename.
Figure_2 shows the suggested file naming strategy. The filename should contain four main parts: "Contig name", "Index number", "Direction" and "ab1 file extension".
"Contig name": Achl_RBNII397-13
"Index number": 1
"Direction": F
"ab1 file extension": .ab1

Figure 2. SangerRead filename regulation.
At the SangerRead level, following the file naming regulation is not compulsory because users specify the filename directly in the input (see Creating SangerRead instance from AB1). In SangerContig and SangerAlignment, however, sangeranalyseR groups files automatically, so a systematic file naming strategy is compulsory. For more details, please read Advanced User Guide - SangerContig (AB1) and Advanced User Guide - SangerAlignment (AB1). Figure_3 shows the suggested AB1 file naming regulation.

Figure 3. Suggested AB1 file naming regulation - SangerRead.
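The four parts of the suggested filename can be recovered with a regular expression. The following is a minimal base-R sketch and not part of sangeranalyseR: the helper name parseAb1Filename and the pattern are assumptions made for illustration, and the pattern assumes the layout <contig name>_<index>_<F or R>.ab1, with R taken to mark a reverse read.

```r
# Hypothetical helper (not a sangeranalyseR function): split a filename of the
# assumed form <contig name>_<index>_<direction>.ab1 into its four parts.
parseAb1Filename <- function(fileName) {
  pattern <- "^(.+)_([0-9]+)_([FR])(\\.ab1)$"
  baseName <- basename(fileName)
  # regmatches() returns the full match followed by the four captured groups,
  # or an empty character vector when the name does not match.
  parts <- regmatches(baseName, regexec(pattern, baseName))[[1]]
  if (length(parts) == 0) {
    stop("Filename does not follow the suggested naming regulation: ", baseName)
  }
  list(contigName = parts[2],
       index      = as.integer(parts[3]),
       direction  = parts[4],
       extension  = parts[5])
}

parsed <- parseAb1Filename("Achl_RBNII397-13_1_F.ab1")
str(parsed)
```

A check of this kind can be run over a directory listing before moving on to SangerContig or SangerAlignment, where the grouping depends on the filenames.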
Creating SangerRead instance from AB1
After preparing the SangerRead input AB1 file, the next step is to create the SangerRead S4 instance by running the SangerRead constructor function or the new method. The constructor function is a wrapper for the new method that makes instance creation more intuitive. The inputs include Basic Parameters, Trimming Parameters and Chromatogram Parameters, and most of them have default values. The constructor call below lists the important parameters.
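library(sangeranalyseR)

# Create a SangerRead instance from a single forward-read AB1 file.
sangerReadF <- SangerRead(# Basic parameters
                          inputSource          = "ABIF",
                          readFeature          = "Forward Read",
                          readFileName         = "Achl_RBNII397-13_1_F.ab1",
                          geneticCode          = GENETIC_CODE,
                          # Trimming parameters: M1 is selected, so the M2 values are NULL
                          TrimmingMethod       = "M1",
                          M1TrimmingCutoff     = 0.0001,
                          M2CutoffQualityScore = NULL,
                          M2SlidingWindowSize  = NULL,
                          # Chromatogram parameters
                          baseNumPerRow        = 100,
                          heightPerRow         = 200,
                          signalRatioCutoff    = 0.33,
                          showTrimmed          = TRUE)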
The inputs of the SangerRead constructor function and the new method are the same. For more details about SangerRead inputs and slot definitions, please refer to the sangeranalyseR reference manual. The created SangerRead instance, sangerReadF, is used as the input for the following functions.
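The M1TrimmingCutoff value of 0.0001 is easier to read next to Phred scores: a Phred quality score Q corresponds to an error probability P = 10^(-Q/10), so 0.0001 corresponds to Q = 40. The base-R sketch below shows this conversion and a Mott-style trimming pass in which each base is scored as cutoff - P and the highest-scoring contiguous segment is kept. It assumes that "M1" is a Mott-style method driven by an error-probability cutoff; the function names phredToProb and mottTrim and the quality values are invented for illustration, and this is not the sangeranalyseR implementation. A looser cutoff of 0.01 (Q20) is used so that the toy data gives a clear result.

```r
# Convert Phred quality scores to per-base error probabilities: P = 10^(-Q/10).
phredToProb <- function(q) 10^(-q / 10)

# Hypothetical Mott-style trimming: score each base as cutoff - P and keep the
# contiguous segment with the largest total score (maximum-sum subarray).
mottTrim <- function(q, cutoff) {
  score <- cutoff - phredToProb(q)
  best <- 0
  bestStart <- NA
  bestEnd <- NA
  running <- 0
  runStart <- 1
  for (i in seq_along(score)) {
    running <- running + score[i]
    if (running <= 0) {
      # The current segment is not worth keeping: restart after this base.
      running <- 0
      runStart <- i + 1
    } else if (running > best) {
      best <- running
      bestStart <- runStart
      bestEnd <- i
    }
  }
  c(start = bestStart, end = bestEnd)
}

quality <- c(5, 8, 12, 35, 40, 42, 38, 41, 10, 6)  # made-up Phred scores
trimPos <- mottTrim(quality, cutoff = 0.01)          # 0.01 corresponds to Q20
print(trimPos)
```

With these made-up scores the kept segment runs from base 4 to base 8, which drops the low-quality bases at both ends.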
Visualizing SangerRead trimmed read
Before going to the Writing SangerRead FASTA files (AB1) and Generating SangerRead report (AB1) sections, it is suggested to visualize the trimmed SangerRead. Run the qualityBasePlot function to get the result in Figure_4. It shows the quality score of each base and the trimming start and end points of the sequence.

Figure 4. SangerRead trimmed read visualization.
qualityBasePlot(sangerReadF)
Updating SangerRead quality trimming parameters
In the previous Creating SangerRead instance from AB1 section, the constructor function applied the quality trimming parameters to the read. After creating the SangerRead S4 instance, users can change the trimming parameters by running the updateQualityParam function, which changes the QualityReport instance inside the SangerRead and updates the frameshift amino acid sequences.
# Switch the trimming method from M1 to M2; the M1 cutoff is set to NULL.
newSangerRead <- updateQualityParam(sangerReadF,
                                    TrimmingMethod       = "M2",
                                    M1TrimmingCutoff     = NULL,
                                    M2CutoffQualityScore = 29,
                                    M2SlidingWindowSize  = 15)
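To make the two M2 parameters concrete, the base-R sketch below shows one common form of sliding-window trimming: a window of fixed size moves along the read, and the read is kept from the first window whose mean quality reaches the cutoff to the end of the last such window. This is an illustrative assumption about how a sliding-window method of this kind can work, not a copy of the sangeranalyseR code; the helper slidingWindowTrim and the quality values are invented, and a window of 3 is used instead of 15 to keep the toy data short.

```r
# Hypothetical sliding-window trimmer (illustration only).
# q: Phred scores; cutoff: minimum mean quality per window;
# windowSize: number of bases per window.
slidingWindowTrim <- function(q, cutoff, windowSize) {
  nWindows <- length(q) - windowSize + 1
  if (nWindows < 1) return(c(start = NA, end = NA))
  # Mean quality of every window, indexed by the window's first base.
  windowMean <- vapply(seq_len(nWindows),
                       function(i) mean(q[i:(i + windowSize - 1)]),
                       numeric(1))
  good <- which(windowMean >= cutoff)
  if (length(good) == 0) return(c(start = NA, end = NA))
  # Keep from the first passing window to the end of the last passing window.
  c(start = min(good), end = max(good) + windowSize - 1)
}

quality <- c(10, 12, 30, 35, 40, 38, 36, 33, 15, 8)  # made-up Phred scores
trimPos <- slidingWindowTrim(quality, cutoff = 29, windowSize = 3)
print(trimPos)
```

With these made-up scores the kept segment runs from base 3 to base 8. A larger window smooths over isolated low-quality bases, while a higher cutoff shortens the retained region.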
Writing SangerRead FASTA files (AB1)
Users can write the SangerRead instance to a FASTA file, which contains the trimmed read sequence. Below is the one-line function call that users need to run. It mainly depends on the writeXStringSet function in the Biostrings R package. Users can set the compression level through the writeFasta function.
writeFasta(newSangerRead,
           outputDir         = tempdir(),
           compress          = FALSE,
           compression_level = NA)
Users can download the output FASTA file of this example.
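Because writeFasta is described as relying on writeXStringSet from Biostrings, its compress and compression_level arguments presumably behave as they do there. The sketch below calls writeXStringSet directly on a made-up sequence to show the shape of the output. The record name, the sequence and the file name are assumptions for illustration; the file name that writeFasta itself chooses is not shown here.

```r
library(Biostrings)

# Made-up trimmed read; the element name becomes the FASTA header line.
reads <- DNAStringSet(c(example_read = "ACGTACGTTAGC"))

fastaPath <- file.path(tempdir(), "example_read.fa")
# compress = FALSE writes plain text; compression_level = NA keeps the default.
writeXStringSet(reads, filepath = fastaPath,
                compress = FALSE, compression_level = NA)

# Read the file back as plain text to inspect the two-line FASTA record.
fastaLines <- readLines(fastaPath)
print(fastaLines)
```

Setting compress = TRUE in writeXStringSet produces a gzip-compressed file, in which case compression_level controls the trade-off between file size and speed.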
Generating SangerRead report (AB1)
Last but not least, users can save the SangerRead instance into a report after the analysis. The report is generated in HTML by knitting Rmd files. The results in the report are static.
generateReport(newSangerRead,
               outputDir = tempdir())
SangerRead_Report_ab1.html is the SangerRead HTML report generated in this example. Users can access the 'Basic Information', 'DNA Sequence', 'Amino Acids Sequence', 'Quality Trimming' and 'Chromatogram' sections in this report.