ROSeq

ROSeq - A rank-based approach to modeling gene expression

Author: Krishan Gupta

Introduction

ROSeq is a rank-based approach to modeling gene expression that operates on a filtered and normalized read count matrix. It takes three inputs: the complete filtered and normalized read count matrix, the location of the two sub-populations, and the number of cores to use.

Installation

The R package can be installed from Bioconductor with the following R commands:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install('ROSeq')

Alternatively, it can be installed from GitHub with the following R commands:

library(devtools)
install_github('krishan57gupta/ROSeq')

Vignette tutorial

This vignette uses the Tung dataset, which is bundled with the package, to demonstrate a standard pipeline. It can also be used as a tutorial. Ref: Tung, P.-Y. et al. Batch effects and the effective design of single-cell gene expression studies. Scientific Reports 7, 39921 (2017).

Example

Load the required libraries before running the example.

library(ROSeq)
library(edgeR)
#> Loading required package: limma
library(limma)

Loading the Tung dataset

samples<-list()
samples$count<-ROSeq::L_Tung_single$NA19098_NA19101_count
samples$group<-ROSeq::L_Tung_single$NA19098_NA19101_group
samples$count[1:5,1:5]
#>                 NA19098.r1.A01 NA19098.r1.A02 NA19098.r1.A03 NA19098.r1.A04
#> ENSG00000237683              0              0              0              1
#> ENSG00000187634              0              0              0              0
#> ENSG00000188976              3              6              1              3
#> ENSG00000187961              0              0              0              0
#> ENSG00000187583              0              0              0              0
#>                 NA19098.r1.A05
#> ENSG00000237683              0
#> ENSG00000187634              0
#> ENSG00000188976              4
#> ENSG00000187961              0
#> ENSG00000187583              0

Data preprocessing: cells and genes filtering, then voom transformation after TMM normalization

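# Coerce every column to numeric (note: this drops the gene row names)
samples$count <- apply(samples$count, 2, function(x) as.numeric(x))
# Keep genes with a non-zero count in more than 5 cells
gkeep <- apply(samples$count, 1, function(x) sum(x > 0) > 5)
samples$count <- samples$count[gkeep, ]
# TMM-normalize, then apply the limma voom transformation
samples$count <- limma::voom(ROSeq::TMMnormalization(samples$count))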
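
To make the gene filter concrete, here is a minimal sketch on a made-up count matrix. It uses only base R; the matrix `toy` and the gene and cell names are hypothetical and are not part of the Tung dataset. The sketch assumes the same rule as above, namely that a gene is kept when it has a non-zero count in more than 5 cells.

```r
# Toy count matrix: 3 genes (rows) x 8 cells (columns); values are made up
toy <- matrix(
  c(0, 0, 0, 1, 0, 0, 0, 0,   # geneA: non-zero in 1 cell
    3, 6, 1, 3, 4, 2, 0, 5,   # geneB: non-zero in 7 cells
    1, 0, 2, 0, 1, 1, 3, 1),  # geneC: non-zero in 6 cells
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:8))
)

# Number of cells in which each gene has a non-zero count
detected <- rowSums(toy > 0)

# Keep genes detected in more than 5 cells
gkeep <- detected > 5
filtered <- toy[gkeep, , drop = FALSE]

print(rownames(filtered))
```

Because `toy` is already numeric and is subset directly, its row names survive the filter, so the retained genes can still be identified by name.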

Calling ROSeq

output<-ROSeq(countData=samples$count, condition = samples$group, numCores=1)

The results are returned with the columns pVals, pAdj and log2FC

output[1:5,]
#>            pVals        pAdj      log2FC
#> [1,] 0.296236870 0.578204626 -0.03544089
#> [2,] 0.000262790 0.005962628 -0.07959219
#> [3,] 0.649706402 0.830860755  0.03308186
#> [4,] 0.054904648 0.197242310 -0.08569478
#> [5,] 0.007752211 0.045825432 -0.05584928
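
A common next step is to pick out the rows that pass a significance cutoff. The sketch below is one possible way to do this, not part of the ROSeq API. It rebuilds the five rows printed above by hand as a plain numeric matrix named `res` (a hypothetical stand-in for `output`), and the cutoff `alpha = 0.05` is an assumption chosen for illustration.

```r
# The five result rows shown above, rebuilt by hand for illustration
res <- cbind(
  pVals  = c(0.296236870, 0.000262790, 0.649706402, 0.054904648, 0.007752211),
  pAdj   = c(0.578204626, 0.005962628, 0.830860755, 0.197242310, 0.045825432),
  log2FC = c(-0.03544089, -0.07959219, 0.03308186, -0.08569478, -0.05584928)
)

alpha <- 0.05  # assumed significance cutoff, not prescribed by ROSeq

# Row indices whose adjusted p-value falls below the cutoff
sig_idx <- which(res[, "pAdj"] < alpha)

# Subset and sort by adjusted p-value, smallest first
sig <- res[sig_idx, , drop = FALSE]
sig <- sig[order(sig[, "pAdj"]), , drop = FALSE]

print(sig_idx)
print(sig)
```

Since the result rows are indexed by position rather than by gene name, `sig_idx` refers to rows of the filtered count matrix; saving the gene identifiers before preprocessing is one way to map these indices back to genes.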