process_vcf {supersigs}    R Documentation

Function to transform VCF object into "matrix" format

Description

Transform a VCF object into a data frame of trinucleotide mutations with flanking bases in a wide matrix format. The function assumes that the VCF object contains only one sample and that each row in rowRanges represents an observed mutation in the sample.

Usage

process_vcf(vcf)

Arguments

vcf

a VCF object (from the VariantAnnotation package)

Value

process_vcf returns a data frame of mutations, with one row per mutation.

Examples


# Use example vcf from VariantAnnotation
suppressPackageStartupMessages({library(VariantAnnotation)})
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- VariantAnnotation::readVcf(fl, "hg19") 

# Subset to first sample
vcf <- vcf[, 1]
# Keep only rows where the sample's genotype is not "0|0",
# i.e. rows with a heterozygous or homozygous alt call
positions <- geno(vcf)$GT != "0|0"
vcf <- vcf[positions[, 1], ]
colData(vcf)$age <- 50        # Add patient age to colData (optional)

# Run function
dt <- process_vcf(vcf)
head(dt)
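
# Toy illustration of the genotype filter used above (made-up data, base R
# only; `gt`, `keep` and `kept` are hypothetical names, not package objects).
# This is a sketch of the subsetting logic, not part of process_vcf itself.
gt <- matrix(c("0|0", "0|1", "1|1", "0|0"), ncol = 1,
             dimnames = list(paste0("var", 1:4), "sampleA"))
keep <- gt != "0|0"                    # logical matrix with the same shape as gt
kept <- gt[keep[, 1], , drop = FALSE]  # retain rows whose genotype is not "0|0"
rownames(kept)
# A plain string comparison would also keep unphased ("0/0") or missing (".")
# genotypes, so it may be worth checking how a given VCF encodes them.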


[Package supersigs version 1.0.0 Index]