History
FANTOM is the abbreviation of the "Functional Annotation of the Mouse", or the "Functional Annotation of the Mammalians", as we have also produced data to analyze the human transcriptome. Over the last few years, our research objectives have been pursued with the support of the FANTOM International Consortium, organized and led by RIKEN (The Institute of Physical and Chemical Research), and, more recently, with the support of the Genome Network Project.
In the previous FANTOM-1 (Kawai et al., Nature, 2001 Feb 08; 409:685-690) and FANTOM-2 (Okazaki et al., Nature, 2002 Dec 05; 420:563-573; Genome Research special issue June 2003) Projects, we focused on full-length cDNA construction, sequencing and annotation, as discussed in the FANTOM-1 and FANTOM-2 scientific meetings. Through FANTOM-1 and FANTOM-2, 21,076 and 39,694 cDNA clones, respectively, were sequenced and annotated, for a total of 60,770 cDNA clones. The cDNA clone ID information has been publicly available for years on the FANTOM-2 server (http://fantom2.gsc.riken.jp/db/ ; ftp://fantom.gsc.riken.jp/fantomdb/2.1.1/).
In preparing the FANTOM-3 Project, we modified our strategy in order to obtain data that reveal the dynamic regulation of the transcriptome. In particular, we have prepared a novel dataset aiming at:
We have prepared novel cDNAs, for a total of 103,000 full-length cDNAs. These cDNAs were annotated in an annotation teleconference called MATRICS-RELOADED, similar to the MATRICS annotation teleconference that took place early in 2002 for the FANTOM-2 project. The MATRICS-RELOADED teleconference involved more than 100 scientists from all over the world. The functional annotations of the full-length cDNAs are available on the FANTOM-3 server (http://fantom3.gsc.riken.jp/db/ ; ftp://fantom.gsc.riken.jp/fantomdb/3.0/).
As novel cDNAs are useful but not sufficient to provide functional annotation of the transcriptome, we prepared three completely new types of data. The first is the cap-analysis gene expression (CAGE) libraries (Shiraki et al., PNAS, 2003 Dec 23;100:15776-15781). CAGE technology allows high-throughput gene expression analysis and the profiling of transcriptional start sites, including promoter usage analysis. The two additional datasets are the Gene Identification Signature (GIS) (Ng et al., Nature Methods, 2005 Feb;2:105-111) and the Genome Signature Cloning (GSC) (unpublished). These two technologies capture tag signatures of both the 5' and 3' ends of transcripts, and therefore allow high-throughput identification of mRNA variants. GSC differs from GIS in that it uses subtracted libraries, which allow detection of rare transcripts.
After the preparation of the datasets, completed at the end of June 2004, we met twice with a substantial part of the consortium members. Around the time of the Japanese festival called "Tanabata", at which people make wishes to the stars, the FANTOM-3 pre-meeting (Tanabata Meeting) took place from July 4th-8th, 2004 at RIKEN Yokohama Institute in Yokohama, Japan, where we had the first look at the data and decided the division of tasks among working groups. Later, after two months of additional analysis, we held the final meeting, called the FANTOM-3 Harvest Meeting, which took place from September 10th-14th, 2004 at RIKEN Main Campus in Wako-city, Japan. We named it the "Harvest Meeting" because it coincided with the rice harvest in Japan, in the hope that the meeting would be as fruitful as the harvest of this season after a long summer of calculation.
During the Harvest Meeting, we discussed the data and the vision of the transcriptome revealed by the novel dataset in an interactive format. The meeting was intentionally organized without a fixed program, in order to maximize discussion and brainstorming sessions, and it concluded with a final summary of the main findings, which were then prepared for publication.
The work of the consortium was recognized by the publication of two papers in the special RNA issue of Science of September 2nd, 2005. The articles are "The transcriptional landscape of the mammalian genome" (Carninci et al., Science, 2005 Sep 02; 309:1559-1563) and "Antisense transcription in the mammalian transcriptome" (Katayama et al., Science, 2005 Sep 02; 309:1564-1566). They provide a novel view of the complexity of the transcriptome, the concepts of gene and independent transcript, and the cross-regulation of sense-antisense RNA networks.
We have also prepared new databases and genome/transcriptome viewers that take into account the complexity and dynamics of the transcriptome. In particular, two databases (the CAGE basic viewer and the CAGE analysis viewer) allow storing and analyzing the CAGE data. These databases are available without restrictions, and we hope that they will benefit the whole scientific community.
See also the press release (main, extended) for a quick summary of the FANTOM-3 findings.