= comprehensive 5'/3'-end boundary set

This directory contains the following data:
 * comprehensive 5'-end clusters (= TCs)
 * comprehensive 3'-end clusters
 * comprehensive 5'/3'-end pairs

== files

=== 5'-end clusters

 * end5_clusters.txt
   Comprehensive 5'-end clusters.
   This file is based on tss>TC>tc.txt on the file exchange server.
   In addition, a reliability value is assigned to each cluster
   ('Reliable' if there are two or more pieces of evidence, 'Unknown' if there is only one).

   5'-ends are defined with the following transcript sequences:
    * CAGE tags
    * 5'end of GIS ditags (< 2.5 Mbp)
    * 5'end of GSC ditags (< 2.5 Mbp)
    * RIKEN 5'ESTs
    * RIKEN mRNA (FANTOM3 103k full-length cDNA set)

   Tab-delimited text:
    1) TC_id
    2) chr_no
    3) strand (+/-)
    4) genome region start
    5) genome region end (start < end)
    6) reliability (1=reliable/0=unknown)
    7) the number of CAGE tags
    8) the number of GIS ditags
    9) the number of GSC ditags
    10) the number of RIKEN 5'ESTs
    11) the number of RIKEN mRNAs (FANTOM3 103k set)
    12) the number of long-SAGE (not used)
    13) the number of dbtss (not used)
    14) representative position ID (CAGE:ctss_id)
    15) representative genome position
    16) the number of tags in this representative genome position

 * end5_clusters.gff
   GFF file of the above

=== 3'-end clusters

 * end3_clusters.txt
   Comprehensive 3'-end clusters
   ('Reliable' if there are two or more pieces of evidence, 'Unknown' if there is only one).

   3'-ends are defined with the following transcript sequences:
    * 3'end of GIS ditags (< 2.5 Mbp)
    * 3'end of GSC ditags (< 2.5 Mbp)
    * public (non-RIKEN) mRNAs
    * RIKEN mRNAs (FANTOM3 103k full-length cDNA set)
    * RIKEN 3'ESTs
    * public (non-RIKEN) 3'ESTs

   Tab-delimited text:
    1) 3'end_id
    2) chr_no
    3) strand (+/-)
    4) genome region start
    5) genome region end (start < end)
    6) reliability (1=reliable/0=unknown)
    7) the number of GIS ditags
    8) the number of GSC ditags
    9) the number of RIKEN mRNAs (FANTOM3 103k set)
    10) the number of public (non-RIKEN) mRNAs
    11) the number of RIKEN 3'ESTs
    12) the number of public (non-RIKEN) 3'ESTs
    13) representative position ID (CAGE:ctss_id)
    14) representative genome position
    15) the number of tags in this representative genome position

 * end3_clusters.gff
   GFF file of the above

=== 5'/3'-end pairs

 * pair53_clusters.txt
   Comprehensive 5'/3'-end pair clusters

   Tab-delimited text:
    1) 5'/3'-end pair ID
    2) chr_no
    3) strand (+/-)
    4) genome region start
    5) genome region end (start < end)
    6) reliability (1=reliable/0=unknown)
    7) the number of RIKEN mRNAs (FANTOM3 103k set)
    8) the number of public (non-RIKEN) mRNAs
    9) the number of RIKEN 5'/3'EST pairs
    10) the number of GIS ditags
    11) the number of GSC ditags

 * pair53_clusters.gff
   GFF file of the above 5'/3'-end pair clusters

== contributors

 * 5'-end clusters (TC)
   Shintaro Katayama and Kenji Nakano
 * 3'-end clusters
   Mark Crowe, Christine Wells and Sean Grimmond
 * 5'/3'-EST pairs
   Martin Frith
 * 5'/3'-end pairs
   Takeya Kasukawa

== note

 * "Reliability" is different from "completeness".
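
== example

The 16-column layout of end5_clusters.txt listed above can be read with a few lines of Python. This is only a minimal sketch, not part of the distributed data set. The names End5Cluster and parse_end5_clusters are hypothetical. The sketch assumes that the file has no header row, that lines starting with "#" (if any) are comments, that the count and coordinate columns hold plain integers, and that the reliability column holds the literal strings "1" or "0".

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class End5Cluster(NamedTuple):
    """One row of end5_clusters.txt (hypothetical container, not an official type)."""
    tc_id: str
    chrom: str
    strand: str
    start: int
    end: int
    reliable: bool
    cage_tags: int
    gis_ditags: int
    gsc_ditags: int
    riken_5ests: int
    riken_mrnas: int
    rep_position_id: str
    rep_position: int
    rep_tag_count: int


def parse_end5_clusters(lines: Iterable[str]) -> Iterator[End5Cluster]:
    """Yield one End5Cluster per tab-delimited data line.

    Assumption: no header row; blank lines and lines starting with '#' are skipped.
    Columns 12 (long-SAGE) and 13 (dbtss) are documented as unused and are ignored.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 16:
            raise ValueError(f"expected 16 columns, got {len(row)}: {row!r}")
        yield End5Cluster(
            tc_id=row[0],
            chrom=row[1],
            strand=row[2],
            start=int(row[3]),
            end=int(row[4]),
            reliable=(row[5] == "1"),  # 1=reliable / 0=unknown
            cage_tags=int(row[6]),
            gis_ditags=int(row[7]),
            gsc_ditags=int(row[8]),
            riken_5ests=int(row[9]),
            riken_mrnas=int(row[10]),
            rep_position_id=row[13],
            rep_position=int(row[14]),
            rep_tag_count=int(row[15]),
        )


def reliable_only(clusters: Iterable[End5Cluster]) -> Iterator[End5Cluster]:
    """Keep only clusters flagged as reliable (column 6 equal to 1)."""
    return (c for c in clusters if c.reliable)
```

To use it, open the file in text mode and pass the handle directly, for example reliable_only(parse_end5_clusters(open("end5_clusters.txt", newline=""))). The same approach should carry over to end3_clusters.txt and pair53_clusters.txt once the field list is adjusted to their column layouts.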