Description

This track contains merged transcript models obtained in the GENCODE Capture Long-Seq (CLS) project, phase 1.

This set of merged transcript models was obtained by merging aligned PacBio reads that have compatible intron/exon structures, using compmerge with the non-anchored procedure. This approach merges all transcripts with compatible intron chains into a single "container", regardless of the support for their ends, in a manner similar to tools such as Cuffmerge.
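As a rough illustration of merging that ignores end support, the sketch below groups spliced reads that share an identical intron chain and extends the merged model to the outermost read ends. It is a simplification and not the compmerge implementation: compmerge merges compatible chains, whereas this sketch only handles identical ones. The read layout (a dict with "chrom", "strand" and "exons" keys, half-open exon intervals) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def intron_chain(exons):
    """Return the intron chain as a tuple of (start, end) pairs.

    `exons` is a list of half-open (start, end) intervals sorted by start.
    """
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def merge_identical_chains(reads):
    """Group spliced reads by (chrom, strand, intron chain), ignoring their ends.

    Hypothetical helper: each read is a dict with "chrom", "strand", "exons".
    Returns one merged model per group, spanning the outermost read ends.
    """
    groups = defaultdict(list)
    for read in reads:
        chain = intron_chain(read["exons"])
        if chain:  # monoexonic reads have no intron chain and are skipped here
            groups[(read["chrom"], read["strand"], chain)].append(read)

    merged = []
    for (chrom, strand, chain), members in groups.items():
        start = min(r["exons"][0][0] for r in members)
        end = max(r["exons"][-1][1] for r in members)
        # Rebuild exons from the shared intron chain and the outermost ends.
        bounds = [start] + [coord for intron in chain for coord in intron] + [end]
        exons = [(bounds[i], bounds[i + 1]) for i in range(0, len(bounds), 2)]
        merged.append(
            {"chrom": chrom, "strand": strand, "exons": exons, "n_reads": len(members)}
        )
    return merged


reads = [
    {"chrom": "chr1", "strand": "+", "exons": [(100, 200), (300, 400)]},
    {"chrom": "chr1", "strand": "+", "exons": [(150, 200), (300, 450)]},
    {"chrom": "chr1", "strand": "+", "exons": [(100, 200), (350, 400)]},
]
models = merge_identical_chains(reads)
```

Here the first two reads share one intron and differ only at their ends, so they collapse into a single model, while the third read has a different intron and stays separate.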

All merged transcript models are derived from aligned PacBio reads with the following properties:

If spliced, all their introns must be canonical (GT|GC / AG).
If monoexonic, they must bear a detectable polyA tail.
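The two read-level criteria above can be expressed as a small predicate. This is a minimal sketch, not the project's code: the function name and input layout are hypothetical, and it assumes the donor and acceptor dinucleotides have already been extracted on the transcript strand and that polyA tail detection was done upstream.

```python
CANONICAL_DONORS = {"GT", "GC"}
CANONICAL_ACCEPTOR = "AG"


def passes_read_filter(intron_dinucleotides, has_polya_tail):
    """Return True if a read satisfies the criteria listed above.

    intron_dinucleotides: list of (donor, acceptor) 2-mers, one pair per
        intron, read on the transcript strand (assumed input layout).
    has_polya_tail: whether a polyA tail was detected (assumed precomputed).
    """
    if intron_dinucleotides:
        # Spliced read: every intron must be GT/AG or GC/AG.
        return all(
            donor.upper() in CANONICAL_DONORS and acceptor.upper() == CANONICAL_ACCEPTOR
            for donor, acceptor in intron_dinucleotides
        )
    # Monoexonic read: kept only if a polyA tail was detected.
    return has_polya_tail
```

For example, a read with introns GT/AG and GC/AG passes, a read containing an AT/AC intron does not, and a monoexonic read passes only when `has_polya_tail` is true.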

See Lagarde et al. for more details.

HiSeq Support

HiSeq-supported sets contain merged transcript models that are either monoexonic (in which case HiSeq support is not applicable) or spliced. If spliced, their entire intron chain is supported by captured HiSeq data, in the form of at least one spliced HiSeq read with exactly the same coordinates and strand.
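The membership rule can be sketched as a lookup against a set of splice junctions. This is an illustrative sketch only: it assumes that support is evaluated intron by intron, and that the junctions seen in spliced HiSeq reads have been collected beforehand into a set of (chrom, strand, start, end) tuples. The function and variable names are hypothetical.

```python
def in_hiseq_supported_set(chrom, strand, introns, hiseq_junctions):
    """Return True if a merged model belongs to the HiSeq-supported set.

    introns: list of (start, end) intron coordinates of the model.
    hiseq_junctions: set of (chrom, strand, start, end) tuples taken from
        spliced HiSeq reads (assumed to be precomputed).
    """
    if not introns:
        # Monoexonic models are included; HiSeq support does not apply to them.
        return True
    # Spliced models: every intron needs an exact coordinate and strand match.
    return all((chrom, strand, start, end) in hiseq_junctions for start, end in introns)


hiseq_junctions = {("chr1", "+", 200, 300), ("chr1", "+", 400, 500)}
fully_supported = in_hiseq_supported_set(
    "chr1", "+", [(200, 300), (400, 500)], hiseq_junctions
)
partly_supported = in_hiseq_supported_set(
    "chr1", "+", [(200, 300), (400, 510)], hiseq_junctions
)
```

A model with a single unsupported intron, or with matching coordinates on the opposite strand, is excluded from the set.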

Display Conventions and Configuration

Red: Transcript models with novel intron chains (with respect to GENCODE 20/M3) according to comptr.
Black: Transcript models with known intron chains (with respect to GENCODE 20/M3) according to comptr, or monoexonic.

Credits

This data was produced as part of the GENCODE project, with funding from the National Human Genome Research Institute (NHGRI).

For inquiries, please contact:
Julien Lagarde (CRG, Barcelona, Spain, julienlag AT gmail.com)
Rory Johnson (University of Bern, Switzerland, rory.johnson AT dkf.unibe.ch)
Roderic Guigo (CRG, Barcelona, Spain, roderic.guigo AT crg.cat)

References

Supplementary data can be accessed through the CLS portal.

High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing (CLS). Julien Lagarde, Barbara Uszczynska-Ratajczak, Silvia Carbonell, Carrie Davis, Thomas R Gingeras, Adam Frankish, Jennifer Harrow, Roderic Guigo, Rory Johnson. doi: https://doi.org/10.1101/105064.