proteinProperties {yeastExpData}    R Documentation

Properties of Yeast proteins

Description

A data frame detailing 33 properties of the proteins encoded in the yeast genome.

Usage

data(proteinProperties)

Format

A data frame with 6718 observations on the following 33 variables.

yORF
a factor representing yeast ORF names, with levels Q0010, Q0017, etc.
SGDID
a factor representing SGD IDs
molwt
a numeric vector giving Molecular Weight in Daltons
pi
a numeric vector denoting the theoretical isoelectric point (pI), the pH at which the protein carries no net charge
cai
a numeric vector denoting Codon Adaptation Index
length
a numeric vector denoting length of the protein (number of amino acids)
nterm
a factor representing the N-terminal sequence, with levels MAAACIC, MAAAPWY, etc.
cterm
a factor representing the C-terminal sequence, with levels AAAAMLL, AAADKKT, etc.
codonBias
a numeric vector denoting Codon Bias

The next set of columns, named by three-letter amino acid codes, gives the number of times each residue appears in the protein sequence. For example, if the ALA column is 2, the protein contains 2 alanines. For each protein these 20 counts should sum to the value in the length column.

ALA
a numeric vector
ARG
a numeric vector
ASN
a numeric vector
ASP
a numeric vector
CYS
a numeric vector
GLN
a numeric vector
GLU
a numeric vector
GLY
a numeric vector
HIS
a numeric vector
ILE
a numeric vector
LEU
a numeric vector
LYS
a numeric vector
MET
a numeric vector
PHE
a numeric vector
PRO
a numeric vector
SER
a numeric vector
THR
a numeric vector
TRP
a numeric vector
TYR
a numeric vector
VAL
a numeric vector
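
The expectation that the residue counts add up to the length column can be checked directly. The following is a minimal sketch, assuming the column names match those listed above; checkResidueCounts is a hypothetical helper introduced here for illustration and is not part of yeastExpData.

```r
# Three-letter residue columns as listed in the Format section.
aaCols <- c("ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS",
            "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP",
            "TYR", "VAL")

# Hypothetical helper (not part of yeastExpData): returns the rows whose
# residue counts are missing or do not sum to the `length` column.
checkResidueCounts <- function(df, cols = aaCols) {
  total <- rowSums(df[, cols, drop = FALSE])
  mismatch <- is.na(total) | is.na(df$length) | total != df$length
  df[mismatch, , drop = FALSE]
}

# Run against the real data only if the package is installed.
if (requireNamespace("yeastExpData", quietly = TRUE)) {
  data(proteinProperties, package = "yeastExpData")
  bad <- checkResidueCounts(proteinProperties)
  print(nrow(bad))  # number of ORFs where counts and length disagree
}
```

Rows returned by the helper are candidates for closer inspection; a mismatch may reflect missing values rather than an error in the source file.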

The remaining columns are:

fop
FOP score, a numeric vector, denoting Frequency of Optimal Codons
gravy
GRAVY (grand average of hydropathicity) score, a numeric vector denoting the hydropathicity of the protein
aromaticity
Aromaticity score, a numeric vector denoting Frequency of aromatic amino acids: Phe, Tyr, Trp
type
Feature type, a factor with levels ORF|Dubious, ORF|Uncharacterized, ORF|Verified, ORF|Verified|silenced_gene, pseudogene, transposable_element_gene
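
Because aromaticity is described as the frequency of Phe, Tyr and Trp, it can be approximated from the residue-count columns. The sketch below assumes the score is the combined count of those three residues divided by the protein length; SGD's exact calculation is not stated here, so small differences from the stored column are possible. aromaticFraction is a hypothetical helper, not part of yeastExpData.

```r
# Hypothetical helper (not part of yeastExpData): fraction of residues
# that are aromatic, assuming aromaticity = (Phe + Tyr + Trp) / length.
aromaticFraction <- function(df) {
  (df$PHE + df$TYR + df$TRP) / df$length
}

# Compare with the stored score only if the package is installed.
if (requireNamespace("yeastExpData", quietly = TRUE)) {
  data(proteinProperties, package = "yeastExpData")
  est <- aromaticFraction(proteinProperties)
  # Agreement with the stored column is an assumption to be checked.
  print(summary(est - proteinProperties$aromaticity))
}
```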

Details

This data frame was downloaded directly from SGD. It contains 33 characteristics for 6714 open reading frames (ORFs). From the SGD README:

“Contains basic protein information about each ORF in SGD. This file does not include information on deleted or merged ORFs. Note, however, that it includes ORFs of all other classifications (Verified, Uncharacterized, and Dubious).”

For more details see http://www.yeastgenome.org/help/protein_page.html.

Source

ftp://genome-ftp.stanford.edu/pub/yeast/protein_info/protein_properties.tab. This file is updated weekly (Saturday). The version used here was downloaded on 2006-11-03.

Examples

data(proteinProperties)
pairs(proteinProperties[, c("molwt", "pi", "cai", "gravy", "aromaticity")],
      pch = ".", col = as.numeric(proteinProperties$type))
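
As a numeric companion to the scatterplot matrix above, the same properties can be summarised per feature type. This is a minimal sketch using base R's aggregate(); medianByType is a hypothetical helper introduced for illustration and is not part of yeastExpData.

```r
# Hypothetical helper (not part of yeastExpData): median of selected
# numeric properties within each level of the `type` factor.
medianByType <- function(df,
                         vars = c("molwt", "pi", "cai", "gravy", "aromaticity")) {
  aggregate(df[, vars, drop = FALSE],
            by = list(type = df$type),
            FUN = median, na.rm = TRUE)
}

# Run against the real data only if the package is installed.
if (requireNamespace("yeastExpData", quietly = TRUE)) {
  data(proteinProperties, package = "yeastExpData")
  print(medianByType(proteinProperties))
}
```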

[Package yeastExpData version 0.9.9 Index]