This track contains merged transcript models obtained in the GENCODE Capture Long-Seq (CLS) project, phase 1.
These merged transcript models were obtained by merging aligned PacBio reads that have compatible intron/exon structures, using compmerge with the non-anchored procedure. This approach merges all transcripts with compatible intron chains into a single "container", regardless of their end support, similar to tools such as Cuffmerge.
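To illustrate the idea of end-agnostic merging, here is a minimal Python sketch. It is not compmerge's implementation: it makes the simplifying assumption that "compatible" means an identical intron chain on the same chromosome and strand, whereas the real tool may use a broader definition. The function name `merge_by_intron_chain`, the tuple layout, and the BED-like 0-based half-open coordinates are all hypothetical choices made for this example.

```python
from collections import defaultdict


def merge_by_intron_chain(reads):
    """Merge spliced reads sharing an intron chain, ignoring their ends.

    reads: iterable of (chrom, strand, exons), where exons is a sorted
    list of (start, end) tuples. Simplification: only identical intron
    chains are merged; mono-exonic reads are skipped.
    """
    containers = defaultdict(list)
    for chrom, strand, exons in reads:
        if len(exons) < 2:
            continue  # no introns, so no intron chain to compare
        # An intron runs from the end of one exon to the start of the next.
        introns = tuple(
            (exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)
        )
        containers[(chrom, strand, introns)].append(exons)

    merged = []
    for (chrom, strand, introns), members in containers.items():
        # The container spans the outermost ends seen among its members.
        start = min(m[0][0] for m in members)
        end = max(m[-1][1] for m in members)
        bounds = [start] + [c for intron in introns for c in intron] + [end]
        model = [(bounds[i], bounds[i + 1]) for i in range(0, len(bounds), 2)]
        merged.append((chrom, strand, model, len(members)))
    return merged
```

In this sketch, two reads that differ only in where their first and last exons begin or end collapse into one model whose outer boundaries are the most extreme ones observed.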
All merged transcript models are derived from aligned PacBio reads with the following properties:
See Lagarde et al. for more details.
HiSeq-supported sets contain merged transcript models that are either mono-exonic (in which case HiSeq support is not applicable) or spliced. For spliced models, the entire intron chain is supported by captured HiSeq data, in the form of at least one spliced HiSeq read with exactly the same coordinates and strand.
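The filter described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's actual pipeline: it interprets "entire intron chain is supported" as every intron of the model matching a junction seen in at least one spliced HiSeq read, and it assumes the junctions have already been extracted into a set. The names `is_hiseq_supported` and `hiseq_junctions` are hypothetical.

```python
def intron_chain(exons):
    """Return introns as (start, end) pairs from a sorted exon list."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def is_hiseq_supported(chrom, strand, exons, hiseq_junctions):
    """Decide whether a merged model belongs in the HiSeq-supported set.

    hiseq_junctions: set of (chrom, strand, start, end) tuples, one per
    junction observed in at least one spliced HiSeq read (assumed input).
    """
    chain = intron_chain(exons)
    if not chain:
        # Mono-exonic models are kept: HiSeq support is not applicable.
        return True
    # Every intron must match a junction with identical coordinates and strand.
    return all((chrom, strand, s, e) in hiseq_junctions for s, e in chain)
```

A model is rejected as soon as one of its introns lacks an exact match, so a shift of a single base at a splice site, or a match on the opposite strand, is enough to exclude it.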
This data was produced as part of the GENCODE project, with funding from the National Human Genome Research Institute (NHGRI).
For inquiries, please contact:
Julien Lagarde (CRG, Barcelona, Spain, julienlag AT gmail.com)
Rory Johnson (University of Bern, Switzerland, rory.johnson AT dkf.unibe.ch)
Roderic Guigo (CRG, Barcelona, Spain, roderic.guigo AT crg.cat)
Supplementary data can be accessed through the CLS portal.
High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing (CLS). Julien Lagarde, Barbara Uszczynska-Ratajczak, Silvia Carbonell, Carrie Davis, Thomas R Gingeras, Adam Frankish, Jennifer Harrow, Roderic Guigo, Rory Johnson. doi: https://doi.org/10.1101/105064.