SparseMFEFold - Time- and Space-efficient Sparsified Minimum Free Energy Folding of RNAs

Implementation of the method presented in the manuscript

Sparse RNA folding revisited: space-efficient minimum free energy structure prediction

by Sebastian Will and Hosna Jabbari

This work was presented at WABI 2015. An extended abstract was published in the conference proceedings.

Sebastian Will and Hosna Jabbari. Sparse RNA folding revisited: space-efficient minimum free energy prediction. In Mihai Pop and Hélène Touzet, editors, Proceedings of the 15th Workshop on Algorithms in Bioinformatics (WABI 2015), volume 9289 of LNCS, pages 257–270. Springer Berlin Heidelberg, 2015. (doi:10.1007/978-3-662-48221-6_19)

Software written by Sebastian Will

New Benchmark Results from the submitted manuscript are given below.

Download

SparseMFEFold is free software under GNU GPL 3. The SparseMFEFold repository is hosted on GitHub.

Installation

Please install from source. The package compiles and installs with the usual Autotools sequence: ./configure; make; make install.

Dependencies: RNA library of the Vienna RNA package 2.x

Help

Usage: sparsemfefold [options] [sequence]

Read RNA sequence from stdin or cmdline; predict minimum
free energy and optimum structure using the time- and
space-efficient MFE RNA folding algorithm of Will and
Jabbari, 2015. The results are equivalent to RNAfold -d0,
but the computation takes less time (for long sequences) and
much less space.

  -h, --help             Print help and exit
  -V, --version          Print version and exit
  -v, --verbose          Turn on verbose output
  -m, --mark-candidates  Represent candidate base pairs 
                         by square brackets

The input sequence is read from standard input, unless it is
given on the command line.

Examples

Call with sequence from stdin in default mode
(compatible with RNAfold -d0)

$ echo UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA \
       | SparseMFEFold      
UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA
.(((((..(..(((.(((((((...))))))).)))...)..))))).... (-6.00)
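
For scripted use, the two-line output of the default mode can be parsed directly. The following Python sketch is not part of the package. It assumes that SparseMFEFold is installed and on the PATH, and that the output keeps the format of the example (a sequence line, then the structure followed by the energy in parentheses). The helper names parse_fold_output and fold are hypothetical.

```python
import re
import subprocess

# Matches a line such as ".((...)). (-6.00)": structure, then energy in parentheses.
# Square brackets are accepted so that output produced with -m parses as well.
RESULT_LINE = re.compile(r"^([.()\[\]]+)\s+\(\s*(-?\d+(?:\.\d+)?)\s*\)\s*$")


def parse_fold_output(text):
    """Return (sequence, structure, energy) from the tool's standard output."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # The structure line directly follows the echoed sequence line.
    for previous, line in zip(lines, lines[1:]):
        match = RESULT_LINE.match(line)
        if match:
            return previous, match.group(1), float(match.group(2))
    raise ValueError("no structure line found in output")


def fold(sequence, executable="SparseMFEFold"):
    """Fold one sequence given on the command line (assumes the tool is installed)."""
    result = subprocess.run(
        [executable, sequence], capture_output=True, text=True, check=True
    )
    return parse_fold_output(result.stdout)
```

Calling fold with the example sequence should then return the sequence, the dot-bracket string, and the energy as a float, provided the installed binary prints the same format as shown in this section.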

Call with sequence on command line, mark candidates, report trace arrow and candidate counts (verbose)

$ SparseMFEFold -m -v \
       UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA 
UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA
.[[(((..(..[((.[[[([[[...]]])]]].))]...)..)))]].... (-6.00)

TA cnt:165
TA max:167
TA av:167
TA rm:6

Can num:109
Can cap:118
TAs num:165
TAs cap:169
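
The structure line printed with -m can be split back into an ordinary dot-bracket string and the list of candidate base pairs. The Python sketch below is an illustration rather than code from the package. It assumes that square and round brackets are nested jointly within one structure, as in the example output, and that a pair opened with '[' is closed with ']'. The function name split_marked_structure is hypothetical.

```python
def split_marked_structure(marked):
    """Return (plain dot-bracket string, all base pairs, candidate base pairs).

    Positions are 1-based. Assumes '[' ... ']' mark candidate base pairs and
    '(' ... ')' the remaining base pairs of the same nested structure.
    """
    closing_for = {"(": ")", "[": "]"}
    stack = []  # (position, opening symbol) of the pairs that are still open
    pairs, candidates = [], []
    for pos, symbol in enumerate(marked, start=1):
        if symbol in closing_for:
            stack.append((pos, symbol))
        elif symbol in ")]":
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at position {pos}")
            start, opener = stack.pop()
            if closing_for[opener] != symbol:
                raise ValueError(f"mismatched brackets at positions {start} and {pos}")
            pairs.append((start, pos))
            if opener == "[":
                candidates.append((start, pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unclosed bracket at position {stack[-1][0]}")
    # Replacing the candidate markers yields the default-mode notation.
    plain = marked.replace("[", "(").replace("]", ")")
    return plain, sorted(pairs), sorted(candidates)
```

Applied to the marked structure of the example, the plain string is identical to the structure printed in default mode, which is a quick consistency check between the two output modes.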

Benchmark Results

We provide raw benchmark results from the submitted manuscript. We ran RNAfold, SparseMFEFold, and a version of SparseMFEFold without garbage collection (GC) on the RNA sequences from the RNA STRAND database. The SparseMFEFold runs are repeated on the shuffled sequences.

File                                 Description
rna_strand.fasta                     RNA sequences from the RNA STRAND database
rna_strand-shuffled.fasta            Di-nucleotide shuffled RNA sequences from the RNA STRAND database
rna_strand-rnafold.tab               Tabulated results of RNAfold on the RNA STRAND sequences
rna_strand-sparse.tab                Tabulated results of SparseMFEFold on the RNA STRAND sequences
rna_strand-nogc-sparse.tab           Tabulated results of SparseMFEFold w/o GC on the RNA STRAND sequences
rna_strand.sparse                    Runs of SparseMFEFold on the RNA STRAND sequences
rna_strand.rnafold                   Runs of RNAfold on the RNA STRAND sequences
rna_strand.nogc-sparse               Runs of SparseMFEFold w/o GC on the RNA STRAND sequences
rna_strand-shuffled-rnafold.tab      Tabulated results of RNAfold on the shuffled RNA STRAND sequences
rna_strand-shuffled-sparse.tab       Tabulated results of SparseMFEFold on the shuffled RNA STRAND sequences
rna_strand-shuffled-nogc-sparse.tab  Tabulated results of SparseMFEFold w/o GC on the shuffled RNA STRAND sequences
rna_strand-shuffled.sparse           Runs of SparseMFEFold on the shuffled RNA STRAND sequences
rna_strand-shuffled.rnafold          Runs of RNAfold on the shuffled RNA STRAND sequences
rna_strand-shuffled.nogc-sparse      Runs of SparseMFEFold w/o GC on the shuffled RNA STRAND sequences
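
To fold the benchmark sequences one at a time, the FASTA files first have to be split into individual records. The Python sketch below is a minimal reader, not the script used for the manuscript. It assumes standard FASTA conventions (header lines starting with '>', sequences possibly wrapped over several lines), and the function name read_fasta is hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # sequence data, possibly one of several lines
    if name is not None:
        yield name, "".join(chunks)
```

An open file handle for rna_strand.fasta can be passed directly as the lines argument; each yielded sequence can then be given to SparseMFEFold on the command line or via standard input.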