Description
This track contains read alignments, polyA sites and merged transcript models obtained in the GENCODE Capture Long-Seq (CLS) project, phase 1.
The merged transcript models were built by merging aligned PacBio reads with compatible intron/exon structures using compmerge, following one of two distinct procedures, anchored and non-anchored:
- The anchored approach prevents transcripts whose end(s) are supported by CAGE or polyA data from being merged into a longer, compatible transcript (a "container"). This preserves all supported transcript ends in the output, including "internal" sites.
- Conversely, the non-anchored method merges all transcripts with compatible intron chains into a single container, regardless of their end support, much as Cuffmerge does.
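The difference between the two procedures can be illustrated with a toy sketch. It is not compmerge's implementation: the Tx record, the compatible, can_absorb and merge functions, the coordinates, and the greedy longest-first, containment-only strategy are all assumptions made for illustration. Strand, chromosome, and the extension of containers by partially overlapping transcripts are ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Intron = Tuple[int, int]  # (start, end) of an intron, toy coordinates


@dataclass(frozen=True)
class Tx:
    """Hypothetical transcript record (not a compmerge data structure)."""
    name: str
    start: int
    end: int
    introns: Tuple[Intron, ...] = ()
    start_supported: bool = False  # e.g. 5' end backed by CAGE
    end_supported: bool = False    # e.g. 3' end backed by a polyA site


def compatible(inner: Tx, outer: Tx) -> bool:
    """True if inner lies within outer and shares its introns over that span."""
    if inner.start < outer.start or inner.end > outer.end:
        return False
    overlapping = tuple(
        i for i in outer.introns if i[0] < inner.end and i[1] > inner.start
    )
    return overlapping == inner.introns


def can_absorb(inner: Tx, outer: Tx, anchored: bool) -> bool:
    """Decide whether inner may be merged into the container outer."""
    if not compatible(inner, outer):
        return False
    if anchored:
        # Assumption: a supported end must not be replaced by a longer one.
        if inner.start_supported and inner.start != outer.start:
            return False
        if inner.end_supported and inner.end != outer.end:
            return False
    return True


def merge(transcripts: List[Tx], anchored: bool) -> Dict[str, List[str]]:
    """Greedy merge, longest first; returns container name -> member names."""
    containers: List[Tuple[Tx, List[str]]] = []
    ordered = sorted(transcripts, key=lambda t: t.end - t.start, reverse=True)
    for tx in ordered:
        for container, members in containers:
            if can_absorb(tx, container, anchored):
                members.append(tx.name)
                break
        else:
            # No existing container can take this transcript: start a new one.
            containers.append((tx, [tx.name]))
    return {c.name: m for c, m in containers}


# Toy input: one long transcript and two shorter, compatible ones.
example = [
    Tx("long", 100, 1000, ((200, 300), (500, 600))),
    Tx("short_supported", 350, 1000, ((500, 600),), start_supported=True),
    Tx("short_plain", 400, 900, ((500, 600),)),
]
non_anchored = merge(example, anchored=False)
anchored = merge(example, anchored=True)
```

In this toy input, the non-anchored call folds all three transcripts into the container "long", whereas the anchored call keeps "short_supported" as a separate model because its supported 5' end would otherwise be lost.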
All merged transcript models are derived from aligned PacBio reads with the following properties:
- If spliced, all their introns must be canonical (GT/AG or GC/AG).
- If monoexonic, they must bear a detectable polyA tail.
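The two read-level criteria above can be expressed as a small filter. This is a minimal sketch, not the CLS pipeline's code: the Read record, its field names, and the assumption that splice-site dinucleotides and polyA-tail detection have already been computed upstream are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Canonical (donor, acceptor) dinucleotide pairs: GT/AG and GC/AG.
CANONICAL_MOTIFS = {("GT", "AG"), ("GC", "AG")}


@dataclass
class Read:
    """Hypothetical aligned read summary (field names are illustrative)."""
    name: str
    # One (donor, acceptor) pair per intron, read on the transcribed strand.
    intron_motifs: List[Tuple[str, str]] = field(default_factory=list)
    has_polya_tail: bool = False


def passes_filter(read: Read) -> bool:
    """Apply the spliced / monoexonic criteria described above."""
    if read.intron_motifs:
        # Spliced read: every intron must carry a canonical motif.
        return all(
            (donor.upper(), acceptor.upper()) in CANONICAL_MOTIFS
            for donor, acceptor in read.intron_motifs
        )
    # Monoexonic read: kept only if a polyA tail was detected.
    return read.has_polya_tail


reads = [
    Read("spliced_ok", [("GT", "AG"), ("GC", "AG")]),
    Read("spliced_noncanonical", [("GT", "AG"), ("AT", "AC")]),
    Read("mono_with_tail", has_polya_tail=True),
    Read("mono_no_tail"),
]
kept = [r.name for r in reads if passes_filter(r)]
```

Here `kept` contains only "spliced_ok" and "mono_with_tail"; a single non-canonical intron is enough to discard a spliced read.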
See Lagarde et al. for more details.