Description of Tag Cluster annotation:

A tag cluster (TC) is defined by overlapping tags on the same strand. Below are a few pointers addressing common misunderstandings about promoters and this type of data:

1: Many tag clusters correspond to 5' ends of known mRNAs, but many known genes have multiple alternative promoters. The CAGE data also indicate many novel promoters. Thus, there is no 1:1 correspondence between TCs and known genes.

2: The count of tags in a TC is roughly proportional to expression level. However, the total tag count differs between libraries, so, for instance, the number of liver tags cannot be compared to the number of heart tags without normalization. This also means that TCs with few tags likely correspond to low-expressed transcripts.

3: Remaining annotation for each TC is found within the standard CAGE databases: see the FAQ http://fantom.gsc.riken.jp/FAQ.html

4: The start positions of the tags in a cluster form a distribution, which can be classified if the tag count is high enough. The "class" column below indicates this, where
BR = Broad TSS distribution
SP = Single Peak tag distribution
MU = Multimodal tag distribution
PB = Broad distribution with a sharp peak
Only TCs with 100 tags or more were classified this way. Other TCs are unclassified ("-").

5: We predicted TATA sites in proximity to the TC using the Bucher model, and checked for overlap with CpG islands (from the UCSC database). If a TATA site and/or a CpG island is present, a "1" instead of "-" appears in the corresponding column.
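
As an illustration of the normalization mentioned in point 2 and the 100-tag threshold in point 4, here is a minimal Python sketch. It assumes a simple tags-per-million scaling, which is one common choice and not necessarily the normalization used for this data set. The function names and all counts are hypothetical.

```python
# Threshold from point 4: only TCs with 100 tags or more were classified.
MIN_TAGS_FOR_CLASS = 100


def tags_per_million(tc_tag_count, library_total_tags):
    """Scale a raw TC tag count by library size (assumed tags-per-million scheme)."""
    if library_total_tags <= 0:
        raise ValueError("library_total_tags must be positive")
    return tc_tag_count * 1_000_000 / library_total_tags


def is_classifiable(tc_tag_count):
    """Return True if the TC has enough tags to receive a shape class."""
    return tc_tag_count >= MIN_TAGS_FOR_CLASS


# Hypothetical example: the same raw count in two libraries of different size.
liver_tpm = tags_per_million(50, 200_000)    # smaller library
heart_tpm = tags_per_million(50, 1_000_000)  # larger library
print(liver_tpm, heart_tpm)
print(is_classifiable(50), is_classifiable(100))
```

In this made-up case the raw counts are identical, but the normalized values differ fivefold, which is why raw liver and heart counts should not be compared directly.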