Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome

Stefan Washietl, Ivo L. Hofacker, Melanie Lukasser, Alexander Hüttenhofer, Peter F. Stadler


PREPRINT 05-005


Nature Biotech. 23: 1383-1390 (2005)


In contrast to the fairly reliable and complete annotation of the protein-coding genes in the human genome, comparable information is lacking for non-coding RNAs (ncRNAs). We present a comparative screen of vertebrate genomes for structural ncRNAs, which evaluates sequence conservation, secondary structure conservation, and thermodynamic stability of putative RNA structures. We predict more than 30,000 structured RNA elements in the human genome, almost 1,000 of which are conserved across all vertebrates. Roughly a third are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions, and about half are located far from any known gene. Only a small fraction of these sequences has been described previously. EST data demonstrate, however, that the majority of them are at least transcribed. The widespread conservation of secondary structure points to a large number of functional ncRNAs in the human genome, which we estimate to be comparable to the number of protein-coding genes.


noncoding RNAs, mammals, RNomics