Report to the Ciliate Research Community

February 27, 2002

Prepared by Eduardo Orias

 

Here is an update, with:

- News of recent activities in pursuit of funding for the Tetrahymena genome-sequencing project;

- A request for contributions of materials for a USDA/NSF grant application;

- News of the availability of additional Tetrahymena EST and genomic sequences;

- Recent or upcoming publications with direct relevance to the genome project.

 

1. NIGMS grant application. From my last report you may recall that, in response to our Concept Paper, the Trans-NIH Non-Mammalian Models Committee (NMMC) recommended that we submit a "Letter of Intent" to the National Institute of General Medical Sciences (NIGMS) to do the whole-genome shotgun (WGS) sequencing phase of the project and to establish the database resources required to make the genomic data accessible to the entire scientific community. On December 18, 2001 we submitted the "Letter of Intent", to which NIGMS responded by agreeing to accept an R01 grant application from us.

 

The grant application was submitted by the February 1, 2002 NIH deadline, as a collaborative effort of the ciliate research community with The Institute for Genomic Research (TIGR) and the Saccharomyces Genome Database. The application has a December 1, 2002 start date and covers a three-year period. Total direct costs are just under $4.7M. TIGR is the lead institution for the project and Jonathan Eisen is the Principal Investigator. I will be a co-PI on the application, with primary responsibilities for coordinating all the components, for providing strong representation of the ciliate community, and for facilitating the contribution of expert advice from the Board of Advisors (see below) and the ciliate community at large.

 

TIGR will carry out the requested 8x whole-genome shotgun sequence coverage of the macronuclear genome, the assembly and the electronic annotation of the sequence. TIGR's role in originating whole-genome shotgun sequencing, together with their experience in the sequencing and assembly of AT-rich genomes and in protist sequence annotation, adds great strength to this collaborative project. If fully funded, the requested depth of coverage will be sufficient to discover virtually every gene in the genome and to generate reliable sequence assemblies and accurate gene models. The gene sequence will be of high enough quality to allow direct use of the advanced tools for post-genomic analysis available in Tetrahymena.

 

The application also establishes a manually curated Tetrahymena Genome Database, in affiliation with the Saccharomyces Genome Database at Stanford University, under the supervision of J. Michael Cherry. (Some of you may remember that Mike Cherry got his Ph.D. degree working with Tetrahymena in Liz Blackburn's lab at UC Berkeley.) This high-quality community database would include at first one -- and later two -- Tetrahymena/ciliate expert curators. The database would provide at least the following informative datasets:

- Manual annotation of subsets of genes, with high priority for those related to current areas of active research in the ciliate community and for those homologous to human genes that are of unknown function and lack homologs in yeast.

- Phenotypes associated with gene knockouts, replacements and other types of mutations.

- Information on specific gene regulation from the literature.

- Post-translational modifications and their experimentally determined roles.

- Linkage & physical maps of MIC and MAC genomes, including STS, RAPD and MAC chromosome breakpoints relative to the MIC.

- Information relevant to the use of mapped genetic markers and DNA polymorphisms (e.g., how to test, diagnostic phenotypes).

- Correlated set of all non-systematic sequences, including the growing number of EST entries.

- Experimental protocols, updating and complementing those in the Methods in Cell Biology volume on Tetrahymena edited by David Asai and Jim Forney.

Once the Tetrahymena database is functioning successfully, we envision that it will further evolve to become a general ciliate database.

 

The National Center for Biotechnology Information (NCBI) is willing to set up a Tetrahymena-specific web page as a component of the "Genomic Biology" division of their website. NCBI will provide web-based graphical viewers for browsing, in an integrated manner, Tetrahymena genomic data at multiple levels. These data include genetic maps of MIC linkage groups and MAC coassortment groups, physical maps, contig maps and transcript maps. In addition, the Tetrahymena genome data will be integrated with PubMed, DNA and protein sequence databases, 3D structure information, and BLAST sequence searching tools. These resources will be linked to Tetrahymena specific resources at TIGR and at the Tetrahymena Genome Database.

 

A Board of Advisors that includes investigators with broad expertise in Tetrahymena, ciliate and alveolate biology was set up to provide expert advice to TIGR and the databases. The members are: Kathy Collins, UC Berkeley; Joe Frankel, University of Iowa; Marty Gorovsky, University of Rochester; Patrick Keeling, University of British Columbia; Laura Landweber, Princeton University; Eduardo Orias, UC Santa Barbara; Ron Pearlman, York University; and Linda Sperling, Centre de Genetique Moleculaire, CNRS, France.

 

The application will follow the normal NIH review process. By summer 2002 we may learn what scientific merit rating the NIH review panel gave our application. The next level of review will be by the Council of the National Institute of General Medical Sciences, which is the Institute that would fund the project. Council review will take place in fall 2002. It could easily be November or December 2002 before we learn the final outcome of our application.

 

2. USDA/NSF grant application. This year's call for proposals from the USDA/NSF Microbial Genome Sequencing initiative has recently been posted (http://www.reeusda.gov/1700/funding/rfamgsp.htm). The maximum funding level provides us with an opportunity to request a 2-3x WGS sequence coverage of the Tetrahymena genome. This could be very important, because NIGMS may not be able to fund the full 8x sequence coverage that we requested in our application. In collaboration with Jonathan Eisen and TIGR, we intend to submit a Letter of Intent by the March 15 deadline and a full application by the May 1, 2002 deadline.
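
To give a rough sense of what different sequencing depths mean, here is a minimal sketch that assumes an idealized model in which reads land uniformly at random, so the number of reads covering any base follows a Poisson distribution with mean equal to the coverage depth. This model is an assumption made for illustration only; it is not taken from the grant applications, and it ignores cloning bias, repeats and the AT-richness of the genome. The function name is hypothetical.

```python
import math


def expected_fraction_covered(depth: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes reads are placed uniformly at random (Poisson model), so the
    probability that a given base is missed entirely is exp(-depth).
    """
    if depth < 0:
        raise ValueError("coverage depth must be non-negative")
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # Depths mentioned in this report: 2-3x (USDA/NSF) and 8x (NIGMS).
    for depth in (2, 3, 8):
        fraction = expected_fraction_covered(depth)
        print(f"{depth}x coverage: about {fraction:.2%} of bases in at least one read")
```

Under these simplifying assumptions, 2x and 3x coverage would be expected to sample roughly 86% and 95% of the bases, and 8x more than 99.9%. Real shotgun projects usually fall somewhat short of the idealized figures, and sampling a base once is not the same as assembling it reliably.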

 

We already have sufficient material to make an excellent case for the expected contributions of the Tetrahymena sequence to the investigation of fundamental biological problems for the NSF-relevant component of the application. Clearly it would help our application to make an excellent case for the Tetrahymena sequence contribution also in the area of agriculture. For example, how can the Tetrahymena sequence:

- Inform the molecular and cell biology of important food organisms (various classes of vertebrates, crustaceans, mollusks, crop plants, etc.);

- Lead to technology for protecting food organisms from disease, while avoiding the massive use of pesticides and antibiotics;

- Lead to better environmental monitoring and protection;

- Anything else?

Some of the Tetrahymena sequence's contributions to agriculture were already mentioned in the concept paper (which can be downloaded from http://www.lifesci.ucsb.edu/~genome/ftp) and are ready to be included in the application. If you can think of additional ones, please send me a concise referenced statement as soon as you can, and no later than April 1, 2002. There will be no reminders; when the time comes, I'll use what I have.

 

3. NHGRI white paper. Following up on another recommendation of the NMMC, we submitted a "white paper" to the National Human Genome Research Institute (NHGRI) by the February 10, 2002 deadline. This was in response to a call for genomes to be sequenced, in between other contracted projects, in one of the three largest genome sequencing centers supported by NHGRI (http://www.nhgri.nih.gov/NEWS/org_request_release.html). Our white paper was submitted in consultation with the Whitehead Institute Center for Genome Research, which is one of those three centers. The genomes will be ranked for sequencing priority by an NHGRI panel. The white paper, which had a 10-page limit, is attached as a Word document, and can be downloaded from http://www.lifesci.ucsb.edu/~genome/ftp. In the white paper we requested 10x WGS sequence coverage and as completely finished a sequence as possible. We may know the results of the ranking by June 2002. If, as we hope, we get a high ranking, the actual sequencing would depend on when sequencing centers have gaps between projects.

 

4. Additional Tetrahymena sequence availability.

a) 8,936 additional Tetrahymena EST sequences were recently submitted to GenBank. They are from the Chilcoat & Turkewitz full-length cDNA library from exponentially growing cells, sequenced by Integrated Genomics Inc. with Tetrahymena community funds. Automated BLAST analyses of these sequences will soon join those from earlier ESTs at Mike Reith's website (http://www.cbr.nrc.ca/reith/tetra/tetra.html).

 

b) In preparation for the NIGMS application, TIGR prepared sequencing libraries using purified macronuclear DNA from SB210, an inbred strain B derivative of T. thermophila. They cloned randomly sheared DNA, size-fractionated into various size ranges, into their proprietary plasmid vector, pHOS2. Libraries with inserts smaller than 6 kb were highly stable. Inserts from several stable libraries were sequenced. Many sequence reads gave statistically significant matches to proteins in public databases. Jonathan Eisen will shortly make the sequences and BLAST results publicly available at the TIGR website (http://www.tigr.org).

 

5. Recent or upcoming publications with direct relevance to the genome project.

Chilcoat ND, Elde NC & Turkewitz AP (2001) An antisense approach to phenotype-based gene cloning in Tetrahymena. Proceedings of the National Academy of Sciences of the United States of America, 98:8709-13.

Turkewitz AP, Orias E & Kapler G (2002) Functional Genomics: The coming of age for Tetrahymena thermophila. Trends in Genetics, 18:35-40.

Shang Y, Song X, Bowen J, Corstanje R, Gao Y, Gaertig J & Gorovsky MA (2002) A robust inducible-repressible promoter greatly facilitates gene knockouts, conditional expression, and overexpression of homologous and heterologous genes in Tetrahymena thermophila. Proc. Nat. Acad. Sci., in press.

Fillingham J, Chilcoat N, Turkewitz A, Orias E, Reith M & Pearlman R (2002) Analysis of expressed sequence tags (ESTs) in the ciliated protozoan Tetrahymena thermophila. J. Euk. Microbiol., in press.

 

I continue to be indebted and grateful to many of you for the statements that you submitted last summer for the concept paper. They enabled us to make a very strong case for sequencing the Tetrahymena genome with high priority, and greatly facilitated the writing of both the NIGMS grant application and the NHGRI white paper. From the time our Letter of Intent was accepted by NIGMS, just before the holidays, to the submission of the NHGRI white paper, we had to proceed on a very tight schedule to formalize contractual arrangements and write up the various documents. To deal better with the short deadlines, I asked a subset of members of the Steering Committee to function as a quick-response team in advising me and reviewing document drafts. Kathy Collins, Marty Gorovsky, Larry Klobutcher and Ron Pearlman kindly agreed to serve and provided me with extraordinary help and support, often on extremely short notice, for which I am very grateful as well.