Updates from March, 2009

  • Jason Stajich 3:52 am on March 1, 2009
    Tags: Fungi, seminars

    Brainstorming ideas 

    After a trip to Cornell to visit the Plant Pathology & Plant-Microbe Biology department, the Pseudomonas syringae groups, and SGN, and a breakfast with Tim Hubbard when he was in Berkeley, I had a few ideas.

    • We need to be able to put the power of annotation in the hands of more people.  Community-assisted annotation at the level of function, links to articles, and general curation should be accessible à la Wikipedia.
    • For genome annotation, though, there is a more specialized need to incorporate data from different sources. One idea is a Git-like repository for genome annotation (in GFF) which can be served up to GBrowse, where edits can be saved to one’s own branch.  (All of this assumes the same reference genome assembly, which is about the level I’m comfortable worrying about, though some of the genome projector type tools would seem to make it easy to lift annotation from one assembly to another.)
    • This would probably necessitate a GenomeAnnotationDiff tool.  This might already be accomplished by tools that the Yandell lab has produced, described in a publication by Eilbeck et al.
    • Gene pages with community annotation tools at SGN are ready to go, and they have VMs to avoid having to install all the software. I even saw a cool on-the-fly QTL calculation.  The challenge I see in our data is always linking the data from one context to another and working out how we make this useful. We will have to try a transformation of some of the different data we have here.
    • The SGN approach is to use aspects of Chado for the schema that deals with ontologies/controlled vocabularies but to also have domain specific databases for annotation and related info rather than the giant “everything is a feature” that is the Chado-way and doesn’t seem to scale.
    • It is about time to try out Hadoop/MapReduce on our big datasets and to start running the automated all-vs-all ortholog prediction scripts on our genomes in earnest. There are just too many times when it seems important to have an updated dataset. This is something to deploy on the new hardware environment this summer.
    • No one has figured out how to interface with NCBI/GenBank/EMBL to deal with the updating of genomes in a sensible way. Basically all the really complicated systems keep the bulk of the data in their own domain-specific databases and at some appointed times feed that data back in, but often this is a huge process and only works where there is a real effort from both NCBI/GenBank/EMBL and the group.  E.g. Ensembl has the CCDS and RefSeq projects that can take the output from Ensembl and feed that back into the system.
      What would a system for comparative reannotation of X fungal genomes be able to do with the data?
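
    To make the Git-like GFF repository and GenomeAnnotationDiff ideas above a bit more concrete, here is a minimal sketch of what a diff between two versions of an annotation on the same assembly might look like. This is only an illustration, not the Yandell lab tools or any existing package: the function names parse_gff and diff_annotations are made up for this example, and it assumes tab-delimited GFF3 where every feature of interest carries an ID attribute.

```python
from typing import Dict, List, Tuple

# (seqid, feature type, start, end, strand) for one annotated feature
Feature = Tuple[str, str, int, int, str]


def parse_gff(text: str) -> Dict[str, Feature]:
    """Map each feature ID to its location (hypothetical helper).

    Assumes 9-column, tab-delimited GFF3 with an ID=... attribute;
    comment lines, malformed lines, and features without an ID are skipped.
    """
    features: Dict[str, Feature] = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols
        attr_map = dict(
            pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
        )
        feature_id = attr_map.get("ID")
        if feature_id is None:
            continue
        features[feature_id] = (seqid, ftype, int(start), int(end), strand)
    return features


def diff_annotations(old_gff: str, new_gff: str) -> Dict[str, List[str]]:
    """Report feature IDs that were added, removed, or moved/retyped."""
    old_f, new_f = parse_gff(old_gff), parse_gff(new_gff)
    shared = set(old_f) & set(new_f)
    return {
        "added": sorted(set(new_f) - set(old_f)),
        "removed": sorted(set(old_f) - set(new_f)),
        "changed": sorted(fid for fid in shared if old_f[fid] != new_f[fid]),
    }


# Two toy versions of an annotation on the same reference assembly.
OLD = (
    "##gff-version 3\n"
    "chr1\tcurated\tgene\t100\t500\t.\t+\t.\tID=gene1\n"
    "chr1\tcurated\tgene\t900\t1200\t.\t-\t.\tID=gene2\n"
)
NEW = (
    "##gff-version 3\n"
    "chr1\tcurated\tgene\t100\t550\t.\t+\t.\tID=gene1\n"
    "chr1\tcurated\tgene\t2000\t2600\t.\t+\t.\tID=gene3\n"
)

result = diff_annotations(OLD, NEW)
print(result)
```

    A real tool would also have to compare attributes, handle parent/child relationships (gene, mRNA, exon), and cope with IDs that change between versions, which is where most of the difficulty sits.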

    On the Plant Path & fungal side of discussions

    • Looking at multiple genotypes of both the host and pathogen seems like a really smart way to start exploring the effects of mutations. With so many more tools now in both systems, this seems like the next logical set of experimental designs.
    • I really need to get some movies made of Bd (Chytrid) zoospores swimming around; they would make for better introductions to talks. I had to settle for showing oomycete zoospores, which are cool but not the same.
    • There need to be new/better tools for population genetics for systems where the populations are clearly not in Hardy-Weinberg equilibrium, such as newly introduced pathogens.
    • Closeup pictures of fungi are really cool, especially through the borescope.
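
    For context on the Hardy-Weinberg point above, this is roughly how one checks a single biallelic locus for departure from equilibrium, using the standard chi-square goodness-of-fit test and the inbreeding coefficient F = 1 - H_obs/H_exp. It is a minimal sketch with made-up genotype counts; the function name hwe_summary is hypothetical, and it is not one of the new tools the bullet is asking for.

```python
from typing import Dict


def hwe_summary(n_AA: int, n_Aa: int, n_aa: int) -> Dict[str, float]:
    """Chi-square test statistic and F for one biallelic locus.

    Expected genotype counts under Hardy-Weinberg are p^2*N, 2pq*N, q^2*N,
    where p is the frequency of allele A estimated from the sample.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    h_exp = 2 * p * q          # expected heterozygosity
    h_obs = n_Aa / n           # observed heterozygosity
    f = 1.0 - h_obs / h_exp if h_exp > 0 else 0.0
    return {"p": p, "chi2": chi2, "F": f}


# Made-up counts: one sample at equilibrium, one with no heterozygotes at all
# (the kind of pattern a clonally spreading population can show).
in_equilibrium = hwe_summary(25, 50, 25)
no_heterozygotes = hwe_summary(50, 0, 50)
print(in_equilibrium)
print(no_heterozygotes)
```

    With one degree of freedom, a chi-square value above about 3.84 suggests departure from equilibrium at the 5% level. The test only tells you the assumption is violated; it does not help with the inference you actually want to do afterwards, which is the gap.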
     
    • Paul Gardner 5:30 pm on March 3, 2009

      This is a really interesting post. I heartily agree that we need more community assisted annotation. I’m getting incredibly frustrated about not being able to fix EMBL annotation directly – it is much too painful contacting the original authors about updates. Also, the published sequences that don’t make it into EMBL are a further source of pain.

      I’m surprised TH didn’t mention DAS – it’s rather similar in spirit to your gff/git idea. Except I like your concept better as I’m damned if I can figure DAS out yet.

    • Jason Stajich 6:30 pm on March 3, 2009

      Actually I was implicitly thinking of DAS as the way the annotation goes from Git (or whatever repository) into the genome browser/Apollo view/etc., with some simple data storage on the user’s side and DAS as the middleware to connect things. One can then have these simple local servers advertised through the DAS registry. This was the sort of thing Tim and I were talking about at breakfast, but I didn’t quite put it down in the text there. Thanks for reminding me!

      I really think we’re not going to be able to map the curation resources to the data deluge without some sort of community annotation work, and I completely echo your frustration with what to do when something needs to be corrected in an EMBL/GenBank database.

  • Jason Stajich 3:26 am on February 7, 2007

    science blogging 

    Most of the science-related blogging has moved to the fungal genomes blog (or whatever better name we come up with).  Tom and I are trying to do a better job of capturing the info from our discussions and the papers we are reading through the site.

     
  • Jason Stajich 7:47 pm on January 19, 2007

    Phycomyces genome now available 

    The JGI has released the Phycomyces blakesleeanus genome. This is the second Zygomycete genome sequence to be released, after Rhizopus oryzae, which was released by the Broad Institute last year. We are now getting a better look at the basal fungal genomes, including the Chytrids and Zygomycetes. Much more on the specifics of Phycomyces biology and history is on this site run by the group organizing the genome analysis.

    I find one of the most interesting things about P. blakesleeanus is its phototropism. We know light sensing is controlled in part by the gene white-collar 1. A homolog of this gene in Neurospora crassa is involved in the circadian rhythm oscillator. Of course many more genes are involved in pathways for light sensing, including some really old proteins like phytochromes.

    There will be a lot of cool analyses to do with this genome beyond phototropism. I am looking forward to seeing what gene families are unique and expanded in this species relative to the other zygomycete. It also looks like it is quite intron rich much like the Basidiomycetes, further supporting the idea that fungi had intron rich ancestors.

     