Repository: http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE
Date: September 18th 2014
Contact: sarah.djebali@crg.eu

This is a repository for the (Gencode TSS + Fantom5 CAGE peak) clusters produced using three different sets of Fantom5 CAGE peaks as input:
- 1,048,124 permissive CAGE peaks
- 184,827 robust CAGE peaks
- 217,572 strict CAGE peaks

Please see:
- http://fantom.gsc.riken.jp/data/ or http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/Inputs/ to get the CAGE peaks
- (Fantom consortium, Nature, 2014) or http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/Documents/ for a description of how those CAGE peaks were obtained by the Fantom5 consortium.

Method:
The TSSs of the Gencode transcripts (produced by the script https://github.com/sdjebali/MakeGencodeTSS) for which the CDS start was found (not low confidence, http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/Inputs/gencode.v19.TSS.notlow.gff) were extended by 50 bp on each side and merged with each other in a strand-specific manner. The resulting Gencode TSS clusters were then merged, again in a strand-specific manner, with the CAGE peaks provided by the Fantom5 consortium (3 CAGE peak sets, see above; files produced using the script https://github.com/sdjebali/MakeGencCAGETSS). Each of the 3 final TSS cluster files includes, for each TSS cluster, the coordinates of the Gencode TSS clusters it is made of, as well as the coordinates of the original CAGE peaks it is made of.

The three (Gencode TSS + Fantom5 CAGE peak) cluster files are provided at http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/GencTSS_FANTOM5CAGE_Clusters/ as gff version 2 files and include information about:
- the coordinates of the cluster
- the annotation class of the cluster (CAGEOnly | GencOnly | GencCAGE)
- the list of the original Gencode TSS clusters composing the cluster
- the list of gene ids of those Gencode TSS clusters
- the list of the original CAGE peaks composing the cluster.

Here is an example:
chr1 rikcrg tss 564392 564492 . + . class: GencCAGE list_genctssclus: chr1_564392_564492_+, list_genctssclus: ENSG00000225972.1, list_cageclus: chr1_564452_564463_+,
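
The method described above can be illustrated with a minimal Python sketch. It is not the MakeGencCAGETSS script and makes several assumptions that this README does not state: "merged" is taken to mean that intervals on the same chromosome and strand overlapping by at least 1 bp are joined, coordinates are 1-based and inclusive as in gff, and the function names (extend, merge_stranded, classify, cluster) are hypothetical. The TSS position 564442 used in the example is only inferred as the midpoint of the Gencode TSS cluster chr1_564392_564492_+ shown above.

```python
from collections import defaultdict

FLANK = 50  # bp added on each side of a Gencode TSS, as stated in the Method


def extend(tss_list, flank=FLANK):
    """Turn (chrom, pos, strand) TSSs into (chrom, start, end, strand) intervals."""
    return [(c, max(1, p - flank), p + flank, s) for c, p, s in tss_list]


def merge_stranded(intervals):
    """Merge overlapping intervals sharing chromosome and strand.

    Input items are (chrom, start, end, strand, source); output items are
    (chrom, start, end, strand, members), members being the merged inputs.
    """
    groups = defaultdict(list)
    for iv in intervals:
        groups[(iv[0], iv[3])].append(iv)
    clusters = []
    for (chrom, strand), ivs in sorted(groups.items()):
        ivs.sort(key=lambda iv: (iv[1], iv[2]))
        start, end, members = ivs[0][1], ivs[0][2], [ivs[0]]
        for iv in ivs[1:]:
            if iv[1] <= end:  # assumption: an overlap of >= 1 bp triggers a merge
                end = max(end, iv[2])
                members.append(iv)
            else:
                clusters.append((chrom, start, end, strand, members))
                start, end, members = iv[1], iv[2], [iv]
        clusters.append((chrom, start, end, strand, members))
    return clusters


def classify(members):
    """Return the annotation class from the sources of a cluster's members."""
    sources = {m[4] for m in members}
    if sources == {"genc"}:
        return "GencOnly"
    if sources == {"cage"}:
        return "CAGEOnly"
    return "GencCAGE"


def cluster(tss_list, cage_peaks):
    """Extend and merge Gencode TSSs, then merge the result with CAGE peaks."""
    genc = [iv + ("genc",) for iv in extend(tss_list)]
    genc_clusters = [(c, s, e, st, "genc") for c, s, e, st, _ in merge_stranded(genc)]
    cage = [peak + ("cage",) for peak in cage_peaks]
    merged = merge_stranded(genc_clusters + cage)
    return [(c, s, e, st, classify(m)) for c, s, e, st, m in merged]


# Inferred TSS position and the CAGE peak from the example line above
print(cluster([("chr1", 564442, "+")], [("chr1", 564452, 564463, "+")]))
# prints [('chr1', 564392, 564492, '+', 'GencCAGE')]
```

Under these assumptions the sketch reproduces the coordinates and class of the example cluster: the CAGE peak lies inside the extended TSS interval, so the merged cluster keeps the Gencode boundaries and is labelled GencCAGE.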