A. 2010 PROJECT SUMMARY
Dr. Joseph R. Ecker, The Salk Institute for Biological Studies, Principal Investigator (P.I.).
2. Summary of the Proposed Project (Phase 2: 2003-2005)
The proposed study pertains to the development of research tools and resources for the plant biology community. Analysis of the DNA sequence of the 125-Mb Arabidopsis genome using various computational approaches revealed the presence of approximately 25,000 genes. In order for this vast amount of information to be useful, a more precise understanding of each of the transcriptional units is required. To fulfill this need, the Salk, Stanford, and PGEC (SSP) Consortium initiated an ORFeome project to obtain clones for each protein-coding open-reading-frame (ORF) in Arabidopsis. The project goals were to experimentally verify the annotation and construct full-length ORF clones for 10,500 of the Arabidopsis genes. This study resulted in the experimental verification of 30% of the annotated genes and revealed that the current genome annotation is only 68% correct. In addition, the project produced a set of 10,500 publicly available ORF clones for plant functional and proteomics studies. The immediate near-term goal of this two-year 2010 project is to experimentally verify the annotation of an additional 3,330 genes by full-length cDNA sequencing and to construct and sequence 3,333 additional ORF clones for new transcriptional units. We will focus on genes that are known to be transcribed, so-called "annotated expressed genes", but for which full-length ORF clones are not available. ORFs will be constructed using RT-PCR and subcloned into a universal recombination vector system. Transcription unit sequence information will be deposited in GenBank and displayed on our web-accessible database (http://signal.salk.edu). Clones produced by the project will be deposited with the Arabidopsis Biological Resource Center at Ohio State University.
The resources developed by this project will be widely available to a large number of researchers and will provide the basis for a wide variety of projects that rely upon whole genome information. An important feature of this proposal is that all of the ORF clones and sequences will be made freely available to the research community as soon as they are produced. The beneficiaries of this program will be the entire plant biology community as both the sequence information and the clones will provide plant biologists with essential reagents necessary to elucidate the functions of the Arabidopsis proteome. Another important feature of the project is the training of undergraduate students in bioinformatics and genomic methodologies. The completion of the proposed research will provide an important resource to the plant biology community, enabling a variety of future research endeavors such as whole genome protein arrays and the rapid creation of plants that ectopically express any cDNA that is regulated using any promoter of choice. The long-term impact of these enabling tools and technologies on agriculture is expected to be profound, providing fundamental knowledge for the construction of new plant species with superior agronomic traits.
3. Summary of the Proposed Project (Phase 3: 2005-2007)
This research program pertains to the development of tools and resources for the plant biology community. Computational studies of the genome sequence of the reference plant Arabidopsis thaliana have identified approximately 30,000 genes. In order to carry out a variety of functional genomics and proteomics applications, it is essential to identify all of the genes and determine their transcription unit structures. This project will utilize a newly developed single-chip whole-genome tiling array to experimentally map the transcription units in the Arabidopsis genome. The transcription unit mapping information will be used to amplify and clone, in a recombination-based vector, 6,000 full-length (FL) cDNA and open-reading-frame (ORF) clones. The DNA sequence of each clone will be determined to high accuracy, and this information can be used to improve the genome annotation. The construction of an error-free ORF clone for each protein-coding gene will enable a variety of functional genomics and proteomics studies. All cDNA/ORF clones will be deposited with the Arabidopsis Biological Resource Center and the clone sequences will be deposited in GenBank. DNA sequence and tiling array hybridization data will also be displayed on the SIGnAL project web site: http://signal.salk.edu.
Broader Impacts: The beneficiaries of this research program include the entire plant biology community. The transcription unit DNA sequence information and cDNA/ORF clones produced by this project will provide investigators with essential information necessary to elucidate the functions of the Arabidopsis proteome. The collection of Arabidopsis ORF clones will enable the construction of whole genome protein arrays, the development of protein-protein interaction maps and the ability to rapidly create plants that ectopically express any ORF using any promoter of choice. The long-term impact of these enabling tools and technologies on agriculture is expected to be profound, providing fundamental knowledge for the construction of plants with superior agronomic traits. Importantly, all of the ORF clones, array data and DNA sequences will be made freely available to the research community. Finally, an important feature of the program is the training of high school and undergraduate students in bioinformatics and functional genomic methodologies.
© SIGnAL 2001-2017