R/ProgenesistoMSstatsPTMFormat.R
ProgenesistoMSstatsPTMFormat.Rd
Converts non-TMT Progenesis output into the format needed for MSstatsPTM
ProgenesistoMSstatsPTMFormat(
  ptm_input,
  annotation,
  global_protein_input = FALSE,
  fasta_path = FALSE,
  useUniquePeptide = TRUE,
  summaryforMultipleRows = max,
  fewMeasurements = "remove",
  removeOxidationMpeptides = FALSE,
  removeProtein_with1Peptide = FALSE,
  mod.num = "Single"
)
ptm_input | name of the Progenesis output with modified peptides, in wide format. 'Accession', 'Sequence', 'Modification', 'Charge' and one column for each run are required. |
---|---|
annotation | name of the 'annotation.txt' or 'annotation.csv' data, which includes Condition, BioReplicate and Run information. The annotation is matched against the column names of the input that hold the MS runs. |
global_protein_input | name of the Progenesis output with unmodified peptides, in wide format. 'Accession', 'Sequence', 'Modification', 'Charge' and one column for each run are required. FALSE is default. |
fasta_path | string containing path to the corresponding fasta file for the modified peptide dataset. |
useUniquePeptide | TRUE (default) removes peptides that are assigned to more than one protein. Each peptide is assumed to be unique to a single protein. |
summaryforMultipleRows | max (default) or sum. When there are multiple measurements for the same feature in the same run, use the highest intensity or the sum of the intensities. |
fewMeasurements | 'remove'(default) will remove the features that have 1 or 2 measurements across runs. |
removeOxidationMpeptides | TRUE will remove the modified peptides whose modification includes 'Oxidation (M)'. FALSE is default. |
removeProtein_with1Peptide | TRUE will remove the proteins that have only 1 peptide and charge. FALSE is default. |
mod.num | For modified peptide dataset, must be one of |
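The three filtering arguments in the table above (`useUniquePeptide`, `summaryforMultipleRows`, `fewMeasurements`) can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, its values and the intermediate variable names are all hypothetical, and the steps only mirror the behaviour as the argument descriptions state it.

```r
# Toy long-format feature table (hypothetical values, not package data).
features <- data.frame(
  ProteinName     = c("P1", "P1", "P1", "P1", "P1", "P2", "P3", "P2"),
  PeptideSequence = c("AAK", "AAK", "AAK", "AAK", "AAK", "CCK", "CCK", "DDK"),
  Run             = c("R1", "R1", "R2", "R3", "R4", "R1", "R1", "R1"),
  Intensity       = c(100, 250, 300, 400, 500, 50, 60, 70),
  stringsAsFactors = FALSE
)

# useUniquePeptide = TRUE: drop peptides assigned to more than one protein.
# Here "CCK" maps to both P2 and P3, so it is removed.
proteins_per_peptide <- tapply(
  features$ProteinName, features$PeptideSequence,
  function(p) length(unique(p))
)
shared <- names(proteins_per_peptide)[proteins_per_peptide > 1]
unique_only <- features[!features$PeptideSequence %in% shared, ]

# summaryforMultipleRows = max: collapse repeated measurements of the same
# feature in the same run to the highest intensity (FUN = sum would add them).
summarised <- aggregate(
  Intensity ~ ProteinName + PeptideSequence + Run,
  data = unique_only, FUN = max
)

# fewMeasurements = "remove": drop features with only 1 or 2 measurements
# across runs. "DDK" is measured once, so it is removed.
n_meas <- ave(
  summarised$Intensity,
  summarised$ProteinName, summarised$PeptideSequence,
  FUN = length
)
kept <- summarised[n_meas > 2, ]
```

In this toy case only the `AAK` feature of `P1` survives, with one row per run and the duplicated `R1` measurement collapsed to 250. The sketch prints nothing; the output below is from the documented example.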
#> # A tibble: 6 x 10
#>   ProteinName PeptideSequence Condition BioReplicate Run        Intensity
#>   <chr>       <chr>           <chr>     <chr>        <chr>          <dbl>
#> 1 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH1         CCCP-B1T1   1423906.
#> 2 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH1         CCCP-B1T2    877045.
#> 3 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH2         CCCP-B2T1    384418.
#> 4 Q9UHD8_K262 DAGLK*QAPASR    CCCP      BCH2         CCCP-B2T2    454858.
#> 5 Q9UHD8_K262 DAGLK*QAPASR    Combo     BCH1         Combo-B1T1  1603377.
#> 6 Q9UHD8_K262 DAGLK*QAPASR    Combo     BCH1         Combo-B1T2   676555.
#> # ... with 4 more variables: PrecursorCharge <chr>, FragmentIon <lgl>,
#> #   ProductCharge <lgl>, IsotopeLabelType <chr>

#> # A tibble: 6 x 10
#>   ProteinName PeptideSequence Condition BioReplicate Run           Intensity
#>   <chr>       <chr>           <chr>     <chr>        <chr>             <dbl>
#> 1 Q9UHD8      STLINTLFK       CCCP      BCH2         CCCP-B2T1       367944.
#> 2 Q9UHD8      STLINTLFK       CCCP      BCH2         CCCP-B2T2       341207.
#> 3 Q9UHD8      STLINTLFK       Combo     BCH2         Combo-B2T1      185843.
#> 4 Q9UHD8      STLINTLFK       Ctrl      BCH2         Ctrl-B2T1       529224.
#> 5 Q9UHD8      STLINTLFK       Ctrl      BCH2         Ctrl-B2T2       483355.
#> 6 Q9UHD8      STLINTLFK       USP30_OE  BCH2         USP30_OE-B2T1   447795.
#> # ... with 4 more variables: PrecursorCharge <chr>, FragmentIon <lgl>,
#> #   ProductCharge <lgl>, IsotopeLabelType <chr>
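In the output above, the first table's `ProteinName` values look like an accession followed by an underscore and a modification site (`Q9UHD8_K262`), while the second table carries the bare accession (`Q9UHD8`). Assuming that pattern holds (it is inferred from these rows only, not from a documented naming rule), a small base R sketch can split such names into their parts. The variable names and the regular expression are illustrative assumptions.

```r
# Example names copied from the output above: one with a site suffix,
# one bare accession.
protein_names <- c("Q9UHD8_K262", "Q9UHD8")

# Assumed pattern: <accession>_<one-letter residue><position>.
# Names that do not match yield NA in every parsed column.
parts <- regmatches(
  protein_names,
  regexec("^(.+)_([A-Z])([0-9]+)$", protein_names)
)

site_table <- data.frame(
  ProteinName = protein_names,
  Accession   = vapply(parts, function(p) p[2], character(1)),
  Residue     = vapply(parts, function(p) p[3], character(1)),
  Position    = as.integer(vapply(parts, function(p) p[4], character(1))),
  stringsAsFactors = FALSE
)
```

Splitting the name this way makes it straightforward to match each modified site back to the rows of the unmodified table that share its accession.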