Bioinformatics Preprint 07-001
Title:
Computational RNomics of Drosophilids
Author(s):
Dominic Rose,
Jörg Hackermüller,
Stefan Washietl,
Sven Findeiß,
Kristin Reiche,
Jana Hertel,
Peter F. Stadler,
Sonja J. Prohaska
Submitted
Abstract:
Recent experimental and computational studies have provided overwhelming
evidence for a plethora of diverse transcripts that are unrelated to
protein-coding genes. One subclass consists of those RNAs that require
distinctive secondary structure motifs to exert their biological
function and hence exhibit distinctive patterns of sequence conservation
characteristic of positive selection on RNA secondary structure.
The deep-sequencing of 12 Drosophilid species by the BDGP provides an
ideal data set for comparative computational approaches to determine
those genomic loci that code for evolutionarily conserved RNA motifs.
This class of loci includes the majority of the known small ncRNAs as well
as structured RNA motifs in mRNAs. We report here on a genome-wide survey
using RNAz.
We obtain 16,000 high-quality predictions among which we recover the
majority of the known ncRNAs. Taking a pessimistically estimated false
discovery rate of 40% into account, this implies that at least some ten
thousand loci in the Drosophila genome show the hallmarks of
stabilizing selection acting on RNA structure, and hence are most likely
functional at the RNA level. A subset of RNAz predictions
overlapping with TRF1 and BRF binding sites [Isogai et al., EMBO J.,
Epub Dec 2006], which are plausible candidates for pol-III transcripts,
has been studied in more detail. Among these sequences we identify
several "clusters" of ncRNA candidates with striking structural
similarities.
The statistical evaluation of the RNAz predictions in comparison
with a similar analysis of vertebrate genomes [Washietl et al.,
Nat. Biotech. 23: 1383-1390 (2005)] shows that
qualitatively similar fractions of structured RNAs are found in introns,
UTRs, and intergenic regions. The intergenic RNA structures, however, are
concentrated much more closely around known protein-coding loci,
suggesting that flies have a significantly smaller complement of
independent structured ncRNAs compared to mammals.
Keywords:
Drosophilid, comparative genomics, ncRNAs, RNA secondary structure
Last modified: 2006-08-09 15:54:23 xtof