Christian Höner zu Siederdissen - Haskell Packages, Build States, Summaries of Libraries

Haskell Packages

Generalized ADPfusion:

Package	Status
ADPfusion
AlignmentAlgorithms
FormalGrammars
GrammarProducts
OrderedBits
PrimitiveArray

Applications in Bioinformatics:

Package	Status
CMcompare
Dna Protein Alignment
RNAdesign
RNAwolf
SimulateGeneDuplications

Applications in Computational Linguistics

Package	Status
WordAlignment

High-level Libraries for Bioinformatics and Linguistics

Package	Status
Phylogenetics
RNAdraw
RNAmodules

Low-level Bioinformatics Libraries:

Package	Status
BiobaseBlast
BiobaseInfernal
BiobaseFR3D
BiobaseMAF
BiobaseNCBI
BiobaseNewick
BiobaseTurner
BiobaseTypes
BiobaseXNA
ViennaRNA-bindings

Low-level Linguistics Libraries:

Package	Status
NaturalLanguageAlphabets

Low-level Libraries:

Package	Status
bimaps
fgl-extras-decompositions
SuffixStructures

General biohaskell information

Information regarding bioinformatics and Haskell can be found on the biohaskell wiki. There are pointers toward the mailing list as well.

Ketil Malde and I are working on splitting up our bioinformatics libraries into smaller ones. Each library is supposed to do only one thing. This goes against the trend of having all-encompassing bioinformatics libraries but allows us to push new versions easily – at least that is the plan.

I write software for RNA secondary structure, whole-genome ncRNA search, miRNAs, and related problems.

Individual library and program descriptions

Note that with the library split, the loaders will generally use the enumerator / iteratee packages. Furthermore, I’m strongly considering switching to iteratee completely because of its parallel-composition options.

All packages described here either already provide both library functions and a program, or soon will.

Biobase

This library is currently being dismantled. Individual data sources go into separate BiobaseZZZ libraries, one per data source. I’ll probably keep Biobase itself around for commonly used functionality.

BiobaseDotP

Parsers for Vienna dot-bracket-like formats. This includes parsers for two-line RNAfold output, RNAstrand dot-bracket notation, and the RNAwolf extended RNA secondary structure notation.
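To illustrate what parsing dot-bracket notation involves, here is a minimal sketch using only base. It is not the BiobaseDotP API: the names parseDotBracket and parseTwoLine are hypothetical, only plain '(' , ')' and '.' are handled, and the two-line format is assumed to be a sequence line followed by a structure line whose first word is the structure (as in RNAfold output, where the energy follows in parentheses).

```haskell
import Data.List (sort)

-- | Parse a dot-bracket string into 0-based base pairs (i, j) with i < j.
-- Hypothetical helper, not part of BiobaseDotP.
-- Returns Left with a message on unbalanced brackets or unknown symbols.
parseDotBracket :: String -> Either String [(Int, Int)]
parseDotBracket = go [] [] . zip [0 ..]
  where
    -- first argument: stack of positions of still-open '('
    -- second argument: pairs collected so far
    go [] acc [] = Right (sort acc)
    go (i : _) _ [] = Left ("unmatched '(' at position " ++ show i)
    go stack acc ((k, c) : rest) = case c of
      '.' -> go stack acc rest
      '(' -> go (k : stack) acc rest
      ')' -> case stack of
        [] -> Left ("unmatched ')' at position " ++ show k)
        (i : stack') -> go stack' ((i, k) : acc) rest
      _ -> Left ("unexpected character " ++ show c ++ " at position " ++ show k)

-- | Parse two-line output: a sequence line, then a structure line.
-- Anything after the first word of the second line (e.g. an energy) is ignored.
parseTwoLine :: String -> Either String (String, [(Int, Int)])
parseTwoLine txt = case lines txt of
  (sq : l2 : _)
    | (st : _) <- words l2 -> do
        ps <- parseDotBracket st
        if length sq == length st
          then Right (sq, ps)
          else Left "sequence and structure differ in length"
  _ -> Left "expected a sequence line and a structure line"
```

In GHCi, parseDotBracket "((..))" evaluates to Right [(0,5),(1,4)]. The extended RNAwolf notation carries more information per pair and would need a richer result type than a list of index pairs.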

BiobaseFR3D

Provides importers for FR3D resource files. Of particular interest are basepairs files which describe canonical and non-canonical (non-Watson-Crick) base pairings in RNA secondary structure.

BiobaseInfernal

Loads different Infernal file formats. Currently understands taxonomy files and verbose hits. Other parsers are still based on parsec3 and will be (re-)integrated soon.

BiobaseMAF

Provides a loader for MAF files. Based on the iteratee library by Oleg Kiselyov and John Lato.

BiobaseTrainingData

Parameter training for RNA secondary structure prediction tools requires data to train on. Since a number of different formats are available, and handling them all in the training tools is a pain, we have this library and its programs. MkTrainingData converts each of the supported input formats into one common training data format. This format is Haskell-readable (and only partially human-readable) line by line. Generating additional training data is therefore easy, as one can just cat together different training files.
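The idea of a line-by-line, Haskell-readable format can be sketched with derived Show and Read instances. This is an assumption about how such a format could look, not the actual BiobaseTrainingData format: the record TrainingEntry and its fields are hypothetical.

```haskell
import Text.Read (readMaybe)

-- | Hypothetical training record; the real library's fields may differ.
data TrainingEntry = TrainingEntry
  { entrySequence  :: String  -- primary structure
  , entryStructure :: String  -- secondary structure annotation
  , entrySource    :: String  -- where the entry came from
  } deriving (Show, Read, Eq)

-- | One entry per line. 'show' escapes newlines inside strings,
-- so every entry stays on a single line.
encodeEntries :: [TrainingEntry] -> String
encodeEntries = unlines . map show

-- | Read entries back line by line, skipping empty lines.
-- A line that does not parse is reported instead of aborting the whole file.
decodeEntries :: String -> [Either String TrainingEntry]
decodeEntries = map parseLine . filter (not . null) . lines
  where
    parseLine l = maybe (Left ("unreadable line: " ++ l)) Right (readMaybe l)
```

Because each line is a complete value on its own, decoding the concatenation of two encoded files yields the entries of the first followed by the entries of the second, which is what makes cat sufficient for merging training sets.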

BiobaseTurner

A data structure for Mathews / Turner RNA and DNA energy parameters. This library currently provides only an importer, not export functions. There are two reasons: (i) we currently have no use case that needs more than import facilities, and (ii) the file structure is geared towards humans, not machines. If you need to be able to export, send a mail.

BiobaseXNA

Provides representations and functions for RNA primary and secondary structure.

CMCompare

main page

(no library yet, only an executable)

A program to compare two Infernal covariance models. Useful to determine whether a newly designed structural multiple alignment in CM form has high discriminatory power. If it does not, the model will produce a lot of false positives.

MC-Fold-DP

main page

A polynomial-time variant of the MC-Fold RNA folding program by Parisien and Major. Part of our ongoing effort to provide asymptotically fast prediction tools for extended RNA secondary structures.

RNAFold

Haskell version of the ViennaRNA RNAfold program. This is only the library. The program can be found in RNAFoldProgs, but this will change soon.

RNAwolf

main page

The algorithm implemented here provides extended RNA secondary structure prediction. Each predicted nucleotide pairing carries an annotation describing which of the three nucleotide edges is engaged in the pairing. In addition, each nucleotide may be engaged in more than one pairing.
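A small data-structure sketch may make the extended notion of a pairing more concrete. It is not taken from the RNAwolf sources: the types Edge and ExtPair and both functions are hypothetical, the edge names follow the usual Watson-Crick / Hoogsteen / sugar edge terminology, and the rule that a nucleotide uses each of its edges at most once is an assumption made for this illustration.

```haskell
import Data.List (nub)

-- | The three edges a nucleotide can pair with (hypothetical type).
data Edge = WatsonCrick | Hoogsteen | Sugar
  deriving (Show, Eq, Ord, Enum, Bounded)

-- | A pairing between two positions, annotated with the edge each side uses.
data ExtPair = ExtPair
  { leftPos   :: Int
  , rightPos  :: Int
  , leftEdge  :: Edge
  , rightEdge :: Edge
  } deriving (Show, Eq)

-- | All pairings position k takes part in, as (partner, edge used by k).
-- The result can have more than one element, unlike in classical
-- secondary structures where each nucleotide has at most one partner.
pairingsOf :: Int -> [ExtPair] -> [(Int, Edge)]
pairingsOf k ps =
  [ (rightPos p, leftEdge p) | p <- ps, leftPos p == k ]
    ++ [ (leftPos p, rightEdge p) | p <- ps, rightPos p == k ]

-- | Assumed consistency rule: no nucleotide uses the same edge twice.
edgesConsistent :: [ExtPair] -> Bool
edgesConsistent ps = all noDuplicateEdge positions
  where
    positions = nub (concat [ [leftPos p, rightPos p] | p <- ps ])
    noDuplicateEdge k =
      let es = map snd (pairingsOf k ps)
      in length es == length (nub es)
```

For example, with position 0 paired to position 10 via its Watson-Crick edge and to position 5 via its Hoogsteen edge, pairingsOf 0 returns both partners and edgesConsistent holds; if both pairings used the Watson-Crick edge of position 0, the check would fail.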

HsTools

Some helper functions. This library needs a clean-up.
