Report to the Ciliate Research Community
February 27, 2002
Prepared by Eduardo Orias
Here is an update, with:
- News of recent activities in pursuit of funding for the Tetrahymena
genome-sequencing project;
- A request for contributions of materials for a USDA/NSF
grant application;
- News of the availability of additional Tetrahymena
EST and genomic sequences;
- Recent or upcoming publications with direct relevance to the genome project.
1. NIGMS grant application. From my
last report you may recall that, in response to our Concept Paper, the Trans-NIH
Non-Mammalian Models Committee (NMMC) recommended that we submit a "Letter of
Intent" to the National Institute of General Medical Sciences (NIGMS) to do the
whole-genome shotgun (WGS) sequencing phase of the project and to establish the
database resources required to make the genomic data accessible to the entire
scientific community. On December 18, 2001 we submitted the "Letter of
Intent", to which NIGMS responded by agreeing to accept an R01 grant
application from us.
The grant application was submitted by the February 1, 2002
NIH deadline, as a collaborative effort of the ciliate research community with
The Institute for Genomic Research (TIGR) and the Saccharomyces Genome Database.
The application has a December 1, 2002 start date and covers a three-year
period. Total direct costs are just under $4.7M. TIGR is the lead institution
for the project and Jonathan Eisen is the Principal Investigator. I will be a
co-PI on the application with primary responsibilities for coordinating all the
components, for providing strong representation of the ciliate community, and
for facilitating the contribution of expert advice from the Board of Advisors
(see below) and the ciliate community at large.
TIGR will carry out the requested 8x whole-genome shotgun
sequence coverage of the macronuclear genome, the assembly and the electronic
annotation of the sequence. TIGR's role in originating whole-genome shotgun
sequencing and their experience with the sequencing and assembly of AT-rich
genomes and with protist sequence annotation add great strength to this
collaborative project. If fully funded, the requested depth of coverage will be
sufficient to discover virtually every gene in the genome and to generate
reliable sequence assemblies and accurate gene models. The gene sequence will be of high enough quality
that the advanced tools for post-genomic analysis available in Tetrahymena can be applied to it directly.
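To give a rough sense of what depth of coverage buys, the sketch below uses the idealized Lander-Waterman (Poisson) model, in which reads land uniformly at random and the expected fraction of bases missed by every read at coverage c is e^(-c). The model and the function name `fraction_covered` are assumptions introduced here for illustration; they are not taken from the grant application, and real shotgun libraries deviate from the model because of cloning bias and repeated sequences.

```python
import math


def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes the idealized Poisson (Lander-Waterman) model: reads are
    placed uniformly at random, so the chance that a given base is
    missed by every read is exp(-coverage).
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # Coverage depths mentioned in this report: 2-3x, 8x and 10x.
    for depth in (2, 3, 8, 10):
        missed = 1.0 - fraction_covered(depth)
        print(f"{depth}x coverage: {missed:.4%} of bases expected to be unsampled")
```

Under this idealized model, 8x coverage leaves roughly 0.03% of bases unsampled, whereas 2x coverage leaves roughly 13.5%.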
The application also establishes a manually curated Tetrahymena
Genome Database, in affiliation with the Saccharomyces Genome Database at
Stanford University, under the supervision of J. Michael Cherry. (Some of you
may remember that Mike Cherry got his Ph.D. degree working with Tetrahymena
in Liz Blackburn's lab at UC Berkeley.) This high-quality community database
would include at first one -- and later two -- Tetrahymena/ciliate
expert curators. The database would provide at least the following informative
datasets:
- Manual annotation of subsets of genes, with high priority
for those related to current areas of active research in the ciliate community
and for homologs of human genes that have unknown function and lack homologs in
yeast.
- Phenotypes associated with gene knockouts, replacements
and other types of mutations.
- Information on specific gene regulation from the
literature.
- Post-translational modifications and their experimentally
determined roles.
- Linkage & physical maps of MIC and MAC genomes,
including STS, RAPD and MAC chromosome breakpoints relative to the MIC.
- Information relevant to the use of mapped genetic markers
and DNA polymorphisms (e.g., how to test, diagnostic phenotypes).
- Correlated set of all non-systematic sequences, including
the growing number of EST entries.
- Experimental protocols, updating and complementing those
in the Methods in Cell Biology volume
on Tetrahymena edited by David Asai and Jim Forney.
Once the Tetrahymena
database is functioning successfully, we envision that it will further evolve
to become a general ciliate database.
The National Center for Biotechnology Information (NCBI) is
willing to set up a Tetrahymena-specific web page as a component of the
"Genomic Biology" division of their website. NCBI will provide web-based
graphical viewers for browsing, in an integrated manner, Tetrahymena genomic data at multiple levels. These data include
genetic maps of MIC linkage groups and MAC coassortment groups, physical maps,
contig maps and transcript maps. In addition, the Tetrahymena genome data will be integrated with PubMed, DNA and
protein sequence databases, 3D structure information, and BLAST sequence
searching tools. These resources will be linked to Tetrahymena specific resources at TIGR and at the Tetrahymena Genome Database.
A Board of Advisors that includes investigators with broad
expertise in Tetrahymena, ciliate and alveolate biology was set up to
provide expert advice to TIGR and the databases. The members are: Kathy
Collins, UC Berkeley; Joe Frankel, University of Iowa; Marty Gorovsky,
University of Rochester; Patrick Keeling, University of British Columbia; Laura
Landweber, Princeton University; Eduardo Orias, UC Santa Barbara; Ron Pearlman,
York University; and Linda Sperling, Centre de Genetique Moleculaire, CNRS,
France.
The application will follow the normal NIH review process.
By summer 2002 we may learn what scientific merit rating the NIH review panel
gave our application. The next level of review will be by the Council of the
National Institute of General Medical Sciences, which is the Institute that
would fund the project. Council review will take place in fall 2002. It
could easily be November or December 2002 before we learn the final outcome of
our application.
2. USDA/NSF grant application. This
year's call for proposals from the USDA/NSF Microbial Genome Sequencing
initiative has recently been posted (http://www.reeusda.gov/1700/funding/rfamgsp.htm).
The maximum funding level provides us with an opportunity to request 2-3x WGS
sequence coverage of the Tetrahymena genome. This could be very
important, because NIGMS may not be able to fund the full 8x sequence coverage
that we requested in our application. In collaboration with Jonathan Eisen and
TIGR, we intend to submit a Letter of Intent by the March 15 deadline and a
full application by the May 1, 2002 deadline.
We already have sufficient
material to make an excellent case for the expected contributions of the Tetrahymena
sequence to the investigation of fundamental biological problems for the
NSF-relevant component of the application. Our application would clearly be
stronger if we could make an equally good case for the contribution of the
Tetrahymena sequence in the area of agriculture. For example, how can the Tetrahymena
sequence:
- Inform the molecular and cell
biology of important food organisms (various classes of vertebrates,
crustaceans, mollusks, crop plants, etc.);
- Lead to technology for
protecting food organisms from disease, while avoiding the massive use of
pesticides and antibiotics;
- Lead to better environmental
monitoring and protection;
- Anything else?
Some of Tetrahymena's
sequence contributions to agriculture were already mentioned in the concept
paper (which can be downloaded from http://www.lifesci.ucsb.edu/~genome/ftp)
and are ready to be included in the application. If you can think of additional
ones, please send me a concise referenced statement as soon as you can, and no
later than April 1, 2002. There will be no reminders; when the time comes, I'll
use what I have.
3. NHGRI white paper. Following up on another
recommendation of the NMMC, we submitted a "white paper" to the
National Human Genome Research Institute (NHGRI) by the February 10, 2002
deadline. This was a call for genomes to be sequenced, in between other
contracted projects, in one of the three largest genome sequencing centers
supported by NHGRI (http://www.nhgri.nih.gov/NEWS/org_request_release.html).
Our white paper was submitted in consultation with the Whitehead Institute
Center for Genome Research, which is one of those three centers. The genomes will
be ranked for sequencing priority by an NHGRI panel. The white paper, which had
a 10-page limit, is attached as a Word document, and can be downloaded from http://www.lifesci.ucsb.edu/~genome/ftp.
In the white paper we requested 10x WGS sequence coverage and as completely
finished a sequence as possible. We may know the results of the ranking by June
2002. If, as we hope, we get a high ranking, the actual sequencing would depend
on when sequencing centers have gaps between projects.
4. Additional Tetrahymena sequence availability.
a) 8,936 additional Tetrahymena EST sequences were
recently submitted to GenBank. They are from the Chilcoat & Turkewitz full-length
cDNA library from exponentially growing cells, sequenced by Integrated Genomics
Inc., with Tetrahymena community funds. Automatic BLAST analyses of
these sequences will soon join those from earlier ESTs at Mike Reith's website
(http://www.cbr.nrc.ca/reith/tetra/tetra.html).
b) In preparation for the NIGMS application, TIGR prepared
sequencing libraries using purified macronuclear DNA from SB210, an inbred
strain B derivative of T. thermophila. They cloned randomly sheared
size-fractionated DNA of various sizes into their proprietary plasmid vector,
pHOS2. Libraries with inserts smaller than 6 kb were highly stable. Inserts
from several stable libraries were sequenced. Many sequence reads gave
statistically significant matches to proteins in public databases. Jonathan
Eisen will shortly make the sequences and BLAST results publicly available at
the TIGR website (http://www.tigr.org).
5. Recent or upcoming publications with direct relevance to the
genome project.
Chilcoat ND, Elde NC & Turkewitz AP (2001) An antisense
approach to phenotype-based gene cloning in Tetrahymena. Proceedings of
the National Academy of Sciences of the United States of America, 98:8709-13.
Turkewitz AP, Orias E & Kapler G (2002) Functional Genomics:
The coming of age for Tetrahymena thermophila. Trends in Genetics,
18:35-40.
Shang Y, Song X, Bowen J, Corstanje R, Gao Y, Gaertig J & Gorovsky MA (2002) A robust
inducible-repressible promoter greatly facilitates gene knockouts, conditional
expression, and overexpression of homologous and heterologous genes in Tetrahymena
thermophila. Proceedings of the National Academy of Sciences of the United States of America,
in press.
Fillingham J, Chilcoat N, Turkewitz A, Orias E, Reith M & Pearlman R (2002) Analysis of
expressed sequence tags (ESTs) in the ciliated protozoan Tetrahymena
thermophila. Journal of Eukaryotic Microbiology, in press.
I continue to be indebted and grateful to many of you for
the statements that you submitted last summer for the concept paper. They
enabled us to make a very strong case for sequencing the Tetrahymena genome
with high priority, and greatly facilitated the writing of both the NIGMS grant
application and the NHGRI white paper. From the time our Letter of Intent was accepted
by NIGMS, just before the holidays, to the submission of the NHGRI white paper,
we had to proceed on a very tight schedule to formalize contractual
arrangements and write up the various documents. In order to better deal with
the short deadlines, I asked a subset of members of the Steering Committee to
function as a quick-response team in advising me and reviewing document drafts.
Kathy Collins, Marty Gorovsky, Larry Klobutcher and Ron Pearlman kindly agreed
to serve and provided me with extraordinary help and support, often on
extremely short notice, for which I am very grateful as well.