Immediate work

Set up link to svn web view of TODO.txt file.


TODO For Alpha Release (0.8?)
-----------------------------

Goals
-----

- Basic support for loading and manipulating genetic data

  Status: 

     - David will try importing his data week of May 15th

     - Scott will try importing his data week of May 22nd
     
- Support for calculating allele frequencies, HWE, and LD.

  Status: 

    - HWE: Inefficient implementation in place.

    - alleleFreq: Done

    - LD: Inefficient biallelic implementation in place.

    - LDMAX C++ code from the GOLD package is in and builds.

       - Greg, David, Nitin will give it a try week of May 15th

    - Insightful's LD calculation code
 
      - David will start work during week of May 22nd,  probably done 2nd
        week of June. 

- Support for FBAT - see the fbat package

  Status: Done. Vignette in place.

  - Greg, David, Nitin will give it a try week of May 15th

- Basic tools for including genetic data in models (e.g., carrier,
  allele.count, homozygote, ...)

  Status: Need to create vignette based on R-News article.

  - Greg to do week of May 22nd.

- Ability to produce a nice marker summary report

  Status: Done. Vignette in place, may need more editing.

  - Greg to do week of May 22nd.

- Access to haplo.score code.

  - Basic Wrappers in place.  Check with Weiliang to see if in SVN.

  - Ross will ensure it gets into GeneticsBase.  Done week of May 15th

- Tool for applying a model to each marker

  Status:

    Greg has code from another project that does this.

    - We should benchmark.

    - Done week of May 22nd.


- A vignette demonstrating how to

    - read data from various formats (genotypes and phenotypes) [GeneticsBase: DONE]

    - produce a nice marker summary report for each marker [GeneticsBase: DONE]

    - perform HWE and LD calculations [?]

    - apply FBAT [FBAT: DONE]

    - add annotation data to the markers

	Look at SNPer package.

    - apply a statistical model to each marker & produce a nice report 

  Status: 

    - Vignette for loading/summary + vignette for fbat

    - Need to add vignette for annotation.

         - Ross will look into using SNPer, starting week of May 15th,
           finished by May 26th.


Specific Tasks 
--------------


- Make list of required and recommended fields for markerInfo,
  sampleInfo, and studyInfo.  

  - Everyone: Look at wiki page

  - For 1.0: put this information into .tex documents in GeneticsBase/inst/doc
 
- as.data.frame:  creates a data frame by taking the sampleInfo data frame 
   and adding  genotype data as columns.


- as.geneSet for "data.frame":  accepts data frame and a vector/list of column 
  indexes/names containing gene information, optional information on how to 
  interpret the gene information, e.g., (..., format='sep', sep='/'), 
  (..., format='sep', sep=''),  (..., format='longstring', codes=c('r','a','h') )
  (count';sep=1, format=


- Create a sample-size / experimental design package
  "GeneticsDesign"

	pull in code from genetics / genutils / ...

	+ Weiliang to create, Ross will put into for May 26th.

	+ Scott + Greg to toss things in
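
Sketch for the as.data.frame and as.geneSet items above.  This is only an
illustration in base R, not package code: the helper name splitGenotypes,
the toy sampleInfo/genotypes data frames, and the marker names rs1/rs2 are
all made up here.  It shows one way the format='sep' options (sep='/' and
sep='') could be interpreted, and how genotype columns could be appended
to sampleInfo.

```r
# Hypothetical helper (not part of GeneticsBase): split genotype strings
# into two allele columns.  sep = "" means one character per allele.
splitGenotypes <- function(x, sep = "/") {
  x <- as.character(x)
  if (nzchar(sep)) {
    parts <- strsplit(x, sep, fixed = TRUE)
  } else {
    parts <- strsplit(x, "")
  }
  # Missing genotypes or missing second alleles become NA
  allele1 <- vapply(parts, function(p) p[1], character(1))
  allele2 <- vapply(parts,
                    function(p) if (length(p) >= 2) p[2] else NA_character_,
                    character(1))
  cbind(allele1 = allele1, allele2 = allele2)
}

# Toy data (assumed layout: one row per sample)
sampleInfo <- data.frame(id = c("s1", "s2", "s3"), sex = c("F", "M", "F"))
genotypes  <- data.frame(rs1 = c("A/T", "T/T", NA),
                         rs2 = c("CC", "CG", "GG"))

# as.data.frame idea: sampleInfo plus genotype data as extra columns
combined <- cbind(sampleInfo, genotypes)

allelesRs1 <- splitGenotypes(combined$rs1, sep = "/")
allelesRs2 <- splitGenotypes(combined$rs2, sep = "")
```

The 'longstring' format mentioned above is not covered by this sketch.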




TODO for Release 1.0
--------------------


<Jason+Nitin>
12.  Document treatment of missing genotype or allele values!
	      NA --> No knowledge of this genotype
	      A/NA or NA/A --> partial knowledge
	      NA/NA mapped to NA
</Jason+Nitin>
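
Sketch of the missing-value convention in item 12, in base R.  The helper
name knowledgeLevel and the labels "complete"/"partial"/"none" are
assumptions made for illustration; only the three rules (NA, A/NA or NA/A,
NA/NA mapped to NA) come from the item above.

```r
# Hypothetical helper: classify how much is known about each genotype
# from its two alleles.
knowledgeLevel <- function(allele1, allele2) {
  nMissing <- is.na(allele1) + is.na(allele2)   # 0, 1, or 2 missing alleles
  c("complete", "partial", "none")[nMissing + 1]
}

allele1 <- c("A", "A", NA, NA)
allele2 <- c("T", NA, "A", NA)

level <- knowledgeLevel(allele1, allele2)

# NA/NA is mapped to a single NA genotype; partial genotypes keep the
# allele that is known.
genotype <- ifelse(is.na(allele1) & is.na(allele2),
                   NA_character_,
                   paste(allele1, allele2, sep = "/"))
```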

- Define actual objects / fields to store annotation information

TODO for Eventually
-------------------

<?>
- Create a comprehensive set of unit tests
</?>


---------------
Completed Tasks
---------------
<?> 
Split useful code out of readGenes.* for use in geneCodeSet
constructor.  
</?>


<Nitin+Greg>
13. Translate the object manipulation functions in the R genetics
    package
</Nitin+Greg>


<Scott>
2.  When creating geneCodeSet, need to check that all values in
    TransTable column of markerInfo slot match a value in names
    of transTables slot!

3.  Write check methods?

9.  We have decided to require row and column names in callCodes and
    errorMetrics slots.  We will need to check consistency with
    markerInfo and sampleInfo slots, and enforce that the dimnames exist.
</Scott>


<All>
Create 1 example data set each.
</All>

<Greg>
- allele list and missing value code specification to readGenes functions.

- code to decodeCallCodes to properly handle missing value codes

- code to alleleNames, and AlleleCodes to properly handle missing value codes

- methods for 'subset' '[' '[<-' '[[' '[[<-' 
</Greg>


<Greg>
Port LD code from 'genetics' to GeneticsBase.
</Greg>


<Greg>
Optimize performance of readGenes functions.  (Use scan instead of read.table).
</Greg>
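
Sketch of the scan-versus-read.table idea, in base R.  The file layout
(whitespace-delimited, one header line, all-character columns) and the
marker names are assumptions for illustration; this is not the readGenes
code, and no timing is claimed here.

```r
# Write a tiny whitespace-delimited genotype file to a temporary location
tmp <- tempfile(fileext = ".txt")
writeLines(c("id rs1 rs2",
             "s1 A/T C/C",
             "s2 T/T C/G"), tmp)

# scan: read the header, then read all columns as character vectors
header <- scan(tmp, what = character(), nlines = 1, quiet = TRUE)
fields <- scan(tmp, what = rep(list(character()), length(header)),
               skip = 1, quiet = TRUE)
names(fields) <- header
viaScan <- as.data.frame(fields, stringsAsFactors = FALSE)

# read.table equivalent, for comparison
viaReadTable <- read.table(tmp, header = TRUE, colClasses = "character",
                           stringsAsFactors = FALSE)
unlink(tmp)
```

Telling scan the column types up front avoids the type guessing that
read.table does by default, which is the usual reason given for the switch.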


<Greg>
Function to read hapmap .ped files
</Greg>

<Greg>
add print method -- fixed show method
</Greg>

<Scott>
1.  Break source file into multiple files?
</Scott>

<Jason+Nitin>
14. Create R helpfile templates --> Jason + Nitin
</Jason+Nitin>

6.  Should callCodes slot be of mode "integer", not "numeric"? 
    Currently, I AM using "integer".  --> KEEP AS INTEGER

<Jason+Scott>
7.  Write function decodeCallCodes.  Call this in, e.g. show method for
    class geneCallSet.   This function should loop over components of
    the transTables slot, NOT over rows of the callCodes slot!  The
    efficiency gain can be VERY large for large objects.


8.  Revise show method for class geneCallSet (see 7. above).
</Jason+Scott>
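
Sketch of the looping strategy in item 7, in base R.  Assumptions made
for illustration only: callCodes is an integer matrix with one row per
marker, markerTable stands in for the TransTable column of markerInfo,
and each translation table is a character vector indexed by call code.
The function name decodeSketch is made up; it is not the package's
decodeCallCodes.

```r
# Decode call codes with one vectorized lookup per translation table,
# instead of one lookup per row of callCodes.
decodeSketch <- function(callCodes, markerTable, transTables) {
  decoded <- matrix(NA_character_,
                    nrow = nrow(callCodes), ncol = ncol(callCodes),
                    dimnames = dimnames(callCodes))
  for (tableName in names(transTables)) {
    rows <- which(markerTable == tableName)   # markers using this table
    if (length(rows) == 0) next
    lookup <- transTables[[tableName]]
    # Index the lookup vector with the whole block of codes at once;
    # NA codes stay NA.
    decoded[rows, ] <- lookup[callCodes[rows, , drop = FALSE]]
  }
  decoded
}

callCodes <- matrix(c(1L, 2L, 3L,
                      2L, NA, 1L,
                      1L, 1L, 2L),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("m1", "m2", "m3"),
                                    c("s1", "s2", "s3")))
markerTable <- c(m1 = "AT", m2 = "CG", m3 = "AT")
transTables <- list(AT = c("A/A", "A/T", "T/T"),
                    CG = c("C/C", "C/G", "G/G"))

decoded <- decodeSketch(callCodes, markerTable, transTables)
```

With many markers sharing a few translation tables, the loop runs a
handful of times rather than once per marker row.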

<Jason+Scott>
10.  Add ploidy slot?
11.  Add phase slot?  How will we encode phase info?

     1. Yes/No for all (logical scalar)
     2. Yes/No for each Marker (logical vector)
     3. phaseObject (TBD): observation by marker by phase
        probabilities + definitions of contigs + probability of contigs
         
</Jason+Scott>

#####
# Old
#####

 

