Bioinformatics Preprint 05-005
Mammalian Genomes Contain Thousands of Non-Coding RNAs with Conserved Secondary Structure
Stefan Washietl, Ivo L. Hofacker, Peter F. Stadler
In contrast to the fairly reliable and complete annotation of the protein-coding genes in the human genome, comparable information is lacking for non-coding RNAs. We present a comparative screen of vertebrate genomes for structural non-coding RNAs, which evaluates sequence conservation, secondary structure conservation, and thermodynamic stability of putative RNA structures. We predict more than 30,000 structured RNA elements in the human genome, almost 1000 of which are conserved across all vertebrates. Roughly a third are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions, and about half are located far away from any known gene. Only a small fraction of these sequences have been described previously. EST data demonstrate, however, that the majority of them are at least transcribed. The widespread conservation of secondary structure points to a large number of functional ncRNAs in the human genome, which we estimate to be comparable to the number of protein-coding genes.
noncoding RNAs, mammals, RNomics
Last modified: 2004-03-28 19:56:33 studla