SparseMFEFold - Time- and Space-efficient Sparsified Minimum Free Energy Folding of RNAs
Implementation of the method presented in the manuscript
Sparse RNA folding revisited: space-efficient minimum free energy structure prediction
by Sebastian Will and Hosna Jabbari
This work was presented at WABI 2015. An extended abstract was published in the conference proceedings.
Sebastian Will and Hosna Jabbari. Sparse RNA folding revisited: space-efficient minimum free energy prediction. In Mihai Pop and Hélène Touzet, editors, Proceedings of the 15th Workshop on Algorithms in Bioinformatics (WABI 2015), volume 9289 of LNCS, pages 257–270. Springer Berlin Heidelberg, 2015. (doi:10.1007/978-3-662-48221-6_19)
Software written by Sebastian Will
New Benchmark Results from the submitted manuscript are given below.
Download
SparseMFEFold is free software under the GNU GPL 3. The source is hosted on GitHub.
Please install from source. The package compiles and installs with the usual autotools sequence: ./configure; make; make install.
Dependencies: RNA library of the Vienna RNA package 2.x
Usage: sparsemfefold [options] [sequence]

Read RNA sequence from stdin or cmdline; predict minimum free energy and optimum structure using the time- and space-efficient MFE RNA folding algorithm of Will and Jabbari, 2015. The results are equivalent to RNAfold -d0, but the computation takes less time (for long sequences) and much less space.

  -h, --help             Print help and exit
  -V, --version          Print version and exit
  -v, --verbose          Turn on verbose output
  -m, --mark-candidates  Represent candidate base pairs by square brackets

The input sequence is read from standard input, unless it is given on the command line.
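The command-line interface can also be driven from a script. The following is a minimal Python sketch, not part of the package. It assumes the executable is on the PATH under the name SparseMFEFold (as in the example calls on this page); the helper names build_command and run_sparsemfefold are hypothetical. Only the options listed in the usage text (-m, -v) are used.

```python
import subprocess


def build_command(sequence=None, mark_candidates=False, verbose=False,
                  executable="SparseMFEFold"):
    """Assemble the argument list for one call.

    The executable name is an assumption; adjust it to your installation.
    Only the documented options -m and -v are emitted.
    """
    cmd = [executable]
    if mark_candidates:
        cmd.append("-m")
    if verbose:
        cmd.append("-v")
    if sequence is not None:
        # Without a sequence argument the program reads from standard input.
        cmd.append(sequence)
    return cmd


def run_sparsemfefold(sequence, use_stdin=False, **options):
    """Run the program once and return its standard output as text."""
    if use_stdin:
        cmd = build_command(None, **options)
        stdin_text = sequence + "\n"
    else:
        cmd = build_command(sequence, **options)
        stdin_text = None
    result = subprocess.run(cmd, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For example, run_sparsemfefold(seq, mark_candidates=True, verbose=True) would correspond to calling the program with -m -v and the sequence on the command line.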
Call with sequence from stdin in default mode
(compatible with RNAfold -d0)
$ echo UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA \
  | SparseMFEFold
UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA
.(((((..(..(((.(((((((...))))))).)))...)..))))).... (-6.00)
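The output consists of the sequence followed by the structure and the energy in parentheses. As a rough sketch of how a script might pick these apart (the function name parse_fold_output is hypothetical, and the two-line layout is assumed from the example call on this page rather than from a format specification):

```python
import re

# Structure in dot-bracket notation (square brackets allowed for -m output),
# followed by the energy in parentheses, e.g. "....((..)) (-6.00)".
_RESULT_LINE = re.compile(r"^([.()\[\]]+)\s+\(\s*([-+]?\d+(?:\.\d+)?)\s*\)\s*$")


def parse_fold_output(text):
    """Return (sequence, structure, energy) from the program output.

    Assumes the first non-empty line is the sequence and the second is
    'structure (energy)'; any further lines (verbose counts) are ignored.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a sequence line and a structure line")
    sequence = lines[0]
    match = _RESULT_LINE.match(lines[1])
    if match is None:
        raise ValueError("unrecognized structure line: %r" % lines[1])
    structure = match.group(1)
    energy = float(match.group(2))
    if len(structure) != len(sequence):
        raise ValueError("sequence and structure differ in length")
    return sequence, structure, energy
```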
Call with sequence on command line, mark candidates, report trace arrow and candidate counts (verbose)
$ SparseMFEFold -m -v \
  UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA
UAACUUAGGGGUUAAAGUUGCAGAUUGUGGCUCUGAAAACACGGGUUCGAA
.[[(((..(..[((.[[[([[[...]]])]]].))]...)..)))]].... (-6.00)
TA cnt:165
TA max:167
TA av:167
TA rm:6
Can num:109
Can cap:118
TAs num:165
TAs cap:169
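With -m, candidate base pairs appear as square brackets in the structure string. The sketch below shows one way to turn such a string into a list of base pairs with a flag for candidates. It is an illustration, not code from the package: the function names are hypothetical, and it assumes that round and square brackets nest within one common structure and that both ends of a pair use the same bracket type, as they do in the example output above.

```python
def parse_marked_structure(structure):
    """Return sorted (i, j, is_candidate) tuples with 1-based positions.

    Square brackets mark candidate base pairs, round brackets all others.
    A single stack is used for both bracket types (assumed common nesting).
    """
    closing_for = {"(": ")", "[": "]"}
    stack = []
    pairs = []
    for pos, char in enumerate(structure, start=1):
        if char in closing_for:
            stack.append((pos, char))
        elif char in ")]":
            if not stack:
                raise ValueError("unbalanced closing bracket at %d" % pos)
            open_pos, open_char = stack.pop()
            if closing_for[open_char] != char:
                raise ValueError("bracket types differ for pair (%d, %d)"
                                 % (open_pos, pos))
            pairs.append((open_pos, pos, char == "]"))
        elif char != ".":
            raise ValueError("unexpected character %r at %d" % (char, pos))
    if stack:
        raise ValueError("unbalanced opening bracket at %d" % stack[-1][0])
    return sorted(pairs)


def unmark(structure):
    """Convert a marked structure to plain dot-bracket notation."""
    return structure.replace("[", "(").replace("]", ")")
```

Applied to the marked structure above, unmark gives the same dot-bracket string as the default-mode call on the same sequence.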
Benchmark Results
We provide raw benchmark results from the submitted manuscript. We ran RNAfold, SparseMFEFold, and a version of SparseMFEFold without garbage collection on the RNA sequences from the RNA STRAND database. The SparseMFEFold runs were repeated on the shuffled sequences.
|rna_strand.fasta||RNA sequences from the RNA STRAND database|
|rna_strand-shuffled.fasta||Dinucleotide-shuffled RNA sequences from the RNA STRAND database|
|rna_strand-rnafold.tab||Tabulated results of RNAfold on the RNA STRAND sequences|
|rna_strand-sparse.tab||Tabulated results of SparseMFEFold on the RNA STRAND sequences|
|rna_strand-nogc-sparse.tab||Tabulated results of SparseMFEFold without garbage collection on the RNA STRAND sequences|
|rna_strand.sparse||Runs of SparseMFEFold on the RNA STRAND sequences|
|rna_strand.rnafold||Runs of RNAfold on the RNA STRAND sequences|
|rna_strand.nogc-sparse||Runs of SparseMFEFold without garbage collection on the RNA STRAND sequences|
|rna_strand-shuffled-rnafold.tab||Tabulated results of RNAfold on the shuffled RNA STRAND sequences|
|rna_strand-shuffled-sparse.tab||Tabulated results of SparseMFEFold on the shuffled RNA STRAND sequences|
|rna_strand-shuffled-nogc-sparse.tab||Tabulated results of SparseMFEFold without garbage collection on the shuffled RNA STRAND sequences|
|rna_strand-shuffled.sparse||Runs of SparseMFEFold on the shuffled RNA STRAND sequences|
|rna_strand-shuffled.rnafold||Runs of RNAfold on the shuffled RNA STRAND sequences|
|rna_strand-shuffled.nogc-sparse||Runs of SparseMFEFold without garbage collection on the shuffled RNA STRAND sequences|