GenussFold: RNA pseudoknot structure prediction

GenussFold is a (simple) example grammar that shows how to write a pseudoknot-folding grammar in ADPfusion. Pseudoknotted RNA structures require the use of multi context-free grammars. The principle idea is to interleave grammatical rules to achieve the effect of interleaving structural features.

We have a preprint online ‘(???)’.

README

(included from project) Christian Höner zu Siederdissen - GenussFold

GenussFold: RNA Pseudoknot Folding

generalized Algebraic Dynamic Programming Homepage

The implementation makes use of the gADP technique and provides a larger example on how to implement algorithms that require interleaved, split syntactic variables.

Formal background can be found in this paper:

Maik Riechert, Christian Höner zu Siederdissen, and Peter F. Stadler
Algebraic dynamic programming for multiple context-free languages
2015, submitted
preprint

As an example, consider palindromic brackets ((())). Given two types of brackets, these can be interleaved: ((( [[[ ))) ]]]. Such interleaved, long-range dependencies have been observed in human languages and, in particular, in RNA bioinformatics.

RNA structures may form so-called pseudoknots, where the RNA structure does not yield a planar structure (the canonical secondary structure) anymore, but rather forms graphs with crossing edges. Using the idea of interleaved brackets and given an input sequence AAA CCC UUU GGG (with artificial white space to make this more clear), a pseudoknotted structure may be formed:

AAA CCC UUU GGG  
[[[ ((( ]]] )))

A formal grammar that parses such a structure requires the ability to denote that a sub-structure has a “hole”. We can write such a grammar as follows:

S     -> U V U V  
<U,U> -> [ε,ε]  
<U,U> -> [S,-] [a,-] <U,U> [-,S] [-,u]  
<V,V> -> [ε,ε]  
<V,V> -> [S,-] [c,-] <V,V> [-,S] [-,g]

The PKN grammar in GenussFold (for genus-1 structures, but much more pleasurable to write) offers the required features:

state that a syntactic variable is split between two regions <U,U>
state that this split system is linearized and different symbols can be interleaved: U V U V
in addition, we allow syntactic variables of lower dimension (like S) to be used in dimensional stacks of symbols ([S,-]).

This system allows writing monotone multiple context-free grammars with good performance – we are reasonably close to C in running time performance. Reasonable means around a factor of 2 slower.

Performance comparison

C-code for running time performance comparison is available in the GenussFold github repository. The direct URL is: https://github.com/choener/GenussFold/blob/master/C/genussfold.c

Contact

Christian Hoener zu Siederdissen
Leipzig University, Leipzig, Germany
choener@bioinf.uni-leipzig.de
http://www.bioinf.uni-leipzig.de/~choener/


Binaries	Stable Sources	git	Bugtracker	Build Status

Installation from sources via `stack`:

This software can be compiled and installed using stack in a small number of steps. Note that some packages only provide a library and no executable.

Prepare a directory for the software:
mkdir ~/haskell/ cd ~/haskell
(If you choose another name, replace accordingly below)
Install stack itself. Binaries are available here. Copy the binaries into the ~/haskell directory.
Download the GenussFold software:
~/haskell/stack unpack GenussFold ls
This unpacks the newest version of GenussFold, the version itself being referred to as -VERSION, which you should see via ls.
Compile:
cd ~/haskell/GenussFold-VERSION
~/haskell/stack build
Execute (you can find the executable names, if any, in the cabal as executable build targets):
~/haskell/stack exec EXECUTABLE
Optionally, install any executables into ~/haskell/bin/ . This simplifies actually running the programs later on. mkdir -p ~/haskell/bin ~/haskell/stack install --local-bin-path ~/haskell/bin

Home

Publications

Software

Teaching

Students/Theses

Generalized ADP

Jobs

GenussFold: RNA pseudoknot structure prediction

README

Home

Publications

Software

Teaching

Students/Theses

Generalized ADP

Jobs

GenussFold: RNA Pseudoknot Folding

Performance comparison

Contact

Installation from sources via `stack`:

GenussFold: RNA pseudoknot structure prediction

README

GenussFold: RNA Pseudoknot Folding

Performance comparison

Contact

Installation from sources via stack:

Installation from sources via `stack`: