To illustrate the quantitative data and quality control of MS runs, dataProcessPlotsPTM takes the quantitative data from dataSummarizationPTM or dataSummarizationPTM_TMT and draws one of two plots: (1) a profile plot (specify "ProfilePlot" in option type), to identify potential sources of variation for each protein; (2) a quality control plot (specify "QCPlot" in option type), to evaluate systematic bias between MS runs.

dataProcessPlotsPTM(
  data,
  type = "PROFILEPLOT",
  ylimUp = FALSE,
  ylimDown = FALSE,
  x.axis.size = 10,
  y.axis.size = 10,
  text.size = 4,
  text.angle = 90,
  legend.size = 7,
  dot.size.profile = 2,
  ncol.guide = 5,
  width = 10,
  height = 12,
  ptm.title = "All PTMs",
  protein.title = "All Proteins",
  which.PTM = "all",
  which.Protein = NULL,
  originalPlot = TRUE,
  summaryPlot = TRUE,
  address = ""
)

Arguments

data

name of the list with PTM and (optionally) Protein data, which can be the output of the MSstatsPTM dataSummarizationPTM or dataSummarizationPTM_TMT functions.

type

choice of visualization. "ProfilePlot" represents profile plot of log intensities across MS runs. "QCPlot" represents box plots of log intensities across channels and MS runs.

ylimUp

upper limit for y-axis in the log scale. FALSE (default) for Profile Plot and QC Plot uses the rounded-off maximum of log2(intensities) after normalization + 3.

ylimDown

lower limit for y-axis in the log scale. FALSE (default) for Profile Plot and QC Plot uses 0.
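
The default limits described for ylimUp and ylimDown can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the helper name default_ylim is hypothetical, "rounded off" is read here as rounding up with ceiling(), and the input is assumed to already hold log2 intensities after normalization.

```r
# Hypothetical helper: default y-axis limits as described for
# ylimUp = FALSE and ylimDown = FALSE.
default_ylim <- function(log2_intensity) {
  # Assumption: "rounded off" is taken to mean rounding up.
  upper <- ceiling(max(log2_intensity, na.rm = TRUE) + 3)
  lower <- 0
  c(lower = lower, upper = upper)
}

# Toy log2 intensities with one missing value
lims <- default_ylim(c(10.2, 12.7, NA, 11.4))
print(lims)
```

Passing a number to ylimUp or ylimDown replaces the corresponding default.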

x.axis.size

size of x-axis labeling for "Run" and "channel" in Profile Plot and QC Plot. Default is 10.

y.axis.size

size of y-axis labels. Default is 10.

text.size

size of the labels representing each condition at the top of Profile plot and QC plot. Default is 4.

text.angle

angle of the labels representing each condition at the top of Profile plot and QC plot. Default is 90.

legend.size

size of legend above Profile plot. Default is 7.

dot.size.profile

size of dots in Profile plot. Default is 2.

ncol.guide

number of columns for legends at the top of plot. Default is 5.

width

width of the saved pdf file. Default is 10.

height

height of the saved pdf file. Default is 12.

ptm.title

title of overall PTM QC plot

protein.title

title of overall Protein QC plot

which.PTM

PTMs to plot, given either as PTM names or as order numbers of PTMs. Default is "all", which generates a plot for every PTM. For QC plot, "allonly" will generate one QC plot with all PTMs.

which.Protein

List of proteins to plot. All PTMs associated with the listed proteins are plotted. Default is NULL, in which case the selection in which.PTM is used.
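
As an illustration of a selection given by name or by order number, the base R sketch below resolves a which.PTM-style value against a vector of available PTM names. The helper resolve_selection is hypothetical and is not the package's internal implementation; the PTM names are taken from the example output on this page.

```r
# Hypothetical helper: resolve a which.PTM-style selection against the
# available PTM names. Not the package's internal implementation.
resolve_selection <- function(selection, available) {
  if (identical(selection, "all")) {
    return(available)
  }
  if (is.numeric(selection)) {
    # Order numbers index into the available names
    stopifnot(all(selection >= 1 & selection <= length(available)))
    return(available[selection])
  }
  # Otherwise treat the selection as names and check that they exist
  missing_names <- setdiff(selection, available)
  if (length(missing_names) > 0) {
    stop("Unknown PTM(s): ", paste(missing_names, collapse = ", "))
  }
  selection
}

ptms <- c("Q9UHD8_K028", "Q9UHD8_K069", "Q9UHD8_K262")
by_name <- resolve_selection("Q9UHD8_K262", ptms)
by_index <- resolve_selection(c(1, 3), ptms)
```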

originalPlot

TRUE(default) draws original profile plots, without normalization.

summaryPlot

TRUE(default) draws profile plots with protein summarization for each channel and MS run.

address

the name of the folder that will store the results. The default folder is the current working directory. Any other folder must already exist under the current working directory. An output pdf file is automatically created with the default name "ProfilePlot.pdf" or "QCplot.pdf". The address argument can specify where to store the file as well as a prefix for the file name. If address=FALSE, the plot is not saved as a pdf file but is shown in the plot window.
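
The way address acts as both a location and a file-name prefix can be sketched as follows. The helper output_pdf_name is hypothetical, and simple concatenation of address with the default file name is an assumption made for illustration; the package may build the name differently.

```r
# Hypothetical helper: build the output file name from `address`.
# Assumption: address is prepended to the default file name.
output_pdf_name <- function(address, type = c("ProfilePlot", "QCplot")) {
  type <- match.arg(type)
  if (identical(address, FALSE)) {
    return(NULL)  # plot is shown on screen, nothing is written
  }
  paste0(address, type, ".pdf")
}

default_name <- output_pdf_name("", "ProfilePlot")
prefixed_name <- output_pdf_name("results/run1_", "QCplot")
no_file <- output_pdf_name(FALSE)
```

Under this reading, address = "results/run1_" would write into an existing results folder and start the file name with run1_.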

Value

plot or pdf

Examples

head(raw.input$PTM)
#> # A tibble: 6 x 10
#>   ProteinName PeptideSequence Condition BioReplicate Run        Intensity
#>   <chr>       <chr>           <chr>     <chr>        <chr>          <dbl>
#> 1 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH1         CCCP-B1T1   1423906.
#> 2 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH1         CCCP-B1T2    877045.
#> 3 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH2         CCCP-B2T1    384418.
#> 4 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH2         CCCP-B2T2    454858.
#> 5 Q9UHD8_K262 DAGLK*QAPASR    Combo     BCH1         Combo-B1T1  1603377.
#> 6 Q9UHD8_K262 DAGLK*QAPASR    Combo     BCH1         Combo-B1T2   676555.
#> # ... with 4 more variables: PrecursorCharge <chr>, FragmentIon <lgl>,
#> #   ProductCharge <lgl>, IsotopeLabelType <chr>
head(raw.input$PROTEIN)
#> # A tibble: 6 x 10
#>   ProteinName PeptideSequence Condition BioReplicate Run           Intensity
#>   <chr>       <chr>           <chr>     <chr>        <chr>             <dbl>
#> 1 Q9UHD8      STLINTLFK       CCCP      BCH2         CCCP-B2T1       367944.
#> 2 Q9UHD8      STLINTLFK       CCCP      BCH2         CCCP-B2T2       341207.
#> 3 Q9UHD8      STLINTLFK       Combo     BCH2         Combo-B2T1      185843.
#> 4 Q9UHD8      STLINTLFK       Ctrl      BCH2         Ctrl-B2T1       529224.
#> 5 Q9UHD8      STLINTLFK       Ctrl      BCH2         Ctrl-B2T2       483355.
#> 6 Q9UHD8      STLINTLFK       USP30_OE  BCH2         USP30_OE-B2T1   447795.
#> # ... with 4 more variables: PrecursorCharge <chr>, FragmentIon <lgl>,
#> #   ProductCharge <lgl>, IsotopeLabelType <chr>
quant.lf.msstatsptm <- dataSummarizationPTM(raw.input)
#> Starting PTM summarization...
#> INFO [2021-04-30 11:44:09] ** Features with one or two measurements across runs are removed.
#> INFO [2021-04-30 11:44:09] ** Fractionation handled.
#> INFO [2021-04-30 11:44:09] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO [2021-04-30 11:44:09] ** Log2 intensities under cutoff = 13.751 were considered as censored missing values.
#> INFO [2021-04-30 11:44:09] ** Log2 intensities = NA were considered as censored missing values.
#> INFO [2021-04-30 11:44:09] ** Use all features that the dataset originally has.
#> INFO [2021-04-30 11:44:09]
#>  # proteins: 125
#>  # peptides per protein: 1-5
#>  # features per peptide: 1-1
#> INFO [2021-04-30 11:44:09] Five or more proteins have only one feature:
#>  Q9UHD8_K028,
#>  Q9UHD8_K069,
#>  Q9UHD8_K141,
#>  Q9UHQ9_K046,
#>  Q9UHQ9_K062 ...
#> INFO [2021-04-30 11:44:09]
#>                     CCCP Combo Ctrl USP30_OE
#>              # runs    4     4    4        4
#>     # bioreplicates    2     2    2        2
#>  # tech. replicates    2     2    2        2
#> INFO [2021-04-30 11:44:09] Five or more features are completely missing in at least one condition,
#>  VYLK*GVHPK_2_NA_NA,
#>  GVHPK*FPEGGK_2_NA_NA,
#>  MSQYLDSLK*VGDVVEFR_3_NA_NA,
#>  DWAYSK*GFVTADMIR_2_NA_NA,
#>  DWAYSK*GFVTADMIR_3_NA_NA
#> INFO [2021-04-30 11:44:09]
#>  == Start the summarization per subplot...
#> Warning: Ran out of iterations and did not converge
#> Warning: Ran out of iterations and did not converge
#>   |======================================================================| 100%
#> INFO [2021-04-30 11:44:11]  == the summarization per subplot is done.
#> Starting Protein summarization...
#> INFO [2021-04-30 11:44:11] ** Features with one or two measurements across runs are removed.
#> INFO [2021-04-30 11:44:11] ** Fractionation handled.
#> INFO [2021-04-30 11:44:11] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO [2021-04-30 11:44:11] ** Log2 intensities under cutoff = 16.998 were considered as censored missing values.
#> INFO [2021-04-30 11:44:11] ** Log2 intensities = NA were considered as censored missing values.
#> INFO [2021-04-30 11:44:11] ** Use all features that the dataset originally has.
#> INFO [2021-04-30 11:44:11]
#>  # proteins: 26
#>  # peptides per protein: 1-9
#>  # features per peptide: 1-1
#> INFO [2021-04-30 11:44:11] Five or more proteins have only one feature:
#>  Q9UHD8,
#>  Q9UHQ9,
#>  Q9UIF8,
#>  Q9UL25,
#>  Q9UNH7 ...
#> INFO [2021-04-30 11:44:11]
#>                     CCCP Combo Ctrl USP30_OE
#>              # runs    4     4    4        4
#>     # bioreplicates    2     2    2        2
#>  # tech. replicates    2     2    2        2
#> INFO [2021-04-30 11:44:11] Five or more features are completely missing in at least one condition,
#>  TVYSHLFDHVVNR_4_NA_NA,
#>  TDQFPLFLIIMGK_2_NA_NA,
#>  LMKMAR_2_NA_NA,
#>  TVYSHLFDHVVNR_4_NA_NA,
#>  TDQFPLFLIIMGK_2_NA_NA
#> INFO [2021-04-30 11:44:11]
#>  == Start the summarization per subplot...
#> Warning: Ran out of iterations and did not converge
#>   |======================================================================| 100%
#> INFO [2021-04-30 11:44:12]  == the summarization per subplot is done.
# QCPlot
dataProcessPlotsPTM(quant.lf.msstatsptm, type = 'QCPLOT', which.Protein = "allonly", address = FALSE)
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: Item 1 has 0 rows but longest item has 1; filled with NA
#> Warning: Item 3 has 0 rows but longest item has 1; filled with NA
#> Warning: Item 4 has 0 rows but longest item has 1; filled with NA
#> Error in if (unique(datafeature$LABEL) == "L") { datafeature$LABEL <- factor(datafeature$LABEL, labels = c("Endogenous"))}: argument is of length zero
# ProfilePlot
dataProcessPlotsPTM(quant.lf.msstatsptm, type = 'PROFILEPLOT', which.Protein = "Q9UQ80_K376", address = FALSE)
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: Item 1 has 0 rows but longest item has 1; filled with NA
#> Warning: Item 3 has 0 rows but longest item has 1; filled with NA
#> Warning: Item 4 has 0 rows but longest item has 1; filled with NA
#> Error in if (unique(datafeature$LABEL) == "L") { datafeature$LABEL <- factor(datafeature$LABEL, labels = c("Endogenous"))}: argument is of length zero
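
Both plotting calls in this example stop with an error. One possible cause, going by the argument descriptions, is that "allonly" and the PTM site identifier "Q9UQ80_K376" are passed to which.Protein, whereas those values are documented under which.PTM. The sketch below passes them through which.PTM instead. This is an assumption about the intended usage, it has not been checked against the output shown here, and the plotting part only runs when MSstatsPTM is installed.

```r
# Argument sets with the selection moved to which.PTM (assumed intent)
qc_args <- list(type = "QCPlot", which.PTM = "allonly", address = FALSE)
profile_args <- list(type = "ProfilePlot", which.PTM = "Q9UQ80_K376",
                     address = FALSE)

if (requireNamespace("MSstatsPTM", quietly = TRUE)) {
  # raw.input is the example data set used earlier in this example
  data("raw.input", package = "MSstatsPTM")
  quant.lf.msstatsptm <- MSstatsPTM::dataSummarizationPTM(raw.input)

  # One QC plot across all PTMs
  do.call(MSstatsPTM::dataProcessPlotsPTM,
          c(list(data = quant.lf.msstatsptm), qc_args))

  # Profile plot for a single PTM site
  do.call(MSstatsPTM::dataProcessPlotsPTM,
          c(list(data = quant.lf.msstatsptm), profile_args))
}
```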