Hidden treasures in unspliced EST data

Jan Engelhardt, Peter F. Stadler


Th. Biosci. 131: 49-57 (2012)


Several classes of exclusively, or at least predominantly, unspliced non-coding RNAs have been described in recent years, including totally and partially intronic transcripts and long intergenic RNAs. Functionally, they appear to be involved in regulating gene expression, at least in part by associating with chromatin. Here we systematically analyze the distribution of unspliced ESTs in the human genome. Most appear in clusters that overlap, or lie in the close vicinity of, annotated RefSeq genes. Partially intronic (PIN) unspliced ESTs show complex patterns of overlap with the intron/exon structure of the RefSeq gene. Distinctive patterns of CAGE tags indicate that a large class of unspliced EST clusters forms long extensions of 3’UTRs, at least several hundred of which probably also occur as independent uaRNAs.


Preliminary version: Proceedings of the 6th International Symposium on Health Informatics and Bioinformatics (HIBIT), Izmir, Turkey 3-5 May 2011.