keep_abundant {tidybulk} | R Documentation |
keep_abundant() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | and returns a 'tbl' from which lowly abundant transcripts/genes have been filtered out.
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

## S4 method for signature 'spec_tbl_df'
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

## S4 method for signature 'tbl_df'
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

## S4 method for signature 'tidybulk'
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

## S4 method for signature 'SummarizedExperiment'
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

## S4 method for signature 'RangedSummarizedExperiment'
keep_abundant(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  factor_of_interest = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7
)
.data |
A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | |
.sample |
The name of the sample column |
.transcript |
The name of the transcript/gene column |
.abundance |
The name of the transcript/gene abundance column |
factor_of_interest |
The name of the column of the factor of interest. This is used for defining sample groups for the filtering process. It uses the filterByExpr function from edgeR. |
minimum_counts |
A real positive number. It is the count threshold (passed to edgeR::filterByExpr as min.count) that a transcript/gene must reach in enough samples to be kept; transcripts/genes below it are filtered out as lowly abundant. |
minimum_proportion |
A real positive number between 0 and 1. It is the minimum proportion of samples in which a transcript/gene must have a cpm above the threshold in order to be kept. |
At the moment this function uses edgeR (DOI: 10.1093/bioinformatics/btp616)
Underlying method: edgeR::filterByExpr( data, min.count = minimum_counts, group = string_factor_of_interest, min.prop = minimum_proportion )
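To make the roles of minimum_counts and minimum_proportion more concrete, the following base-R sketch approximates the kind of rule that edgeR::filterByExpr applies. It is a simplified illustration, not edgeR's code and not what keep_abundant() runs internally: the function name sketch_keep, the toy count matrix, and the extra arguments min_total_count and large_n (set here to mirror what are assumed to be edgeR's defaults) are all hypothetical.

```r
# Simplified, hypothetical illustration of a filterByExpr-style rule.
# This is NOT edgeR's implementation; use edgeR::filterByExpr for real analyses.
sketch_keep <- function(counts, group, min_count = 10, min_prop = 0.7,
                        min_total_count = 15, large_n = 10) {
  lib_size <- colSums(counts)

  # Required number of samples: the smallest group size, scaled down by
  # min_prop only for the part that exceeds large_n samples.
  min_n <- min(table(group))
  if (min_n > large_n) {
    min_n <- large_n + (min_n - large_n) * min_prop
  }

  # Translate the count threshold into a CPM threshold via the median library size.
  cpm_cutoff <- min_count / median(lib_size) * 1e6

  # Counts per million for every gene (rows) and sample (columns).
  cpm <- t(t(counts) / lib_size) * 1e6

  keep_cpm   <- rowSums(cpm >= cpm_cutoff) >= min_n
  keep_total <- rowSums(counts) >= min_total_count
  keep_cpm & keep_total
}

# Toy data: 3 genes, 4 samples, 2 groups of 2 samples each.
counts <- matrix(
  c(100, 120, 90, 110,   # geneA: abundant everywhere
      0,   1,  0,   2,   # geneB: lowly abundant everywhere
     50,  60,  0,   0),  # geneC: abundant in group "a" only
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)
group <- c("a", "a", "b", "b")

keep <- sketch_keep(counts, group)
print(keep)
```

In this toy example geneC is kept although it is absent from group "b", because it passes the threshold in as many samples as the smallest group contains; this is why supplying factor_of_interest matters for the filtering.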
A 'tbl' restricted to the abundant transcripts/genes, i.e. those that pass the filter (for 'spec_tbl_df', 'tbl_df' and 'tidybulk' inputs).
A 'SummarizedExperiment' object (for 'SummarizedExperiment' and 'RangedSummarizedExperiment' inputs).
keep_abundant( tidybulk::se_mini )