The origin and evolution of metabolism is an interesting
field of research with many unsolved
questions. Simulation approaches have proven
to be helpful in explaining properties observed
in real-world metabolic reaction networks.
We propose here a more complex and intuitive
graph-based model combined with an
artificial chemistry. Instead of differential
equations, enzymes are represented as graph
rewriting rules and reaction rates are derived
from energy calculations of the involved

Metabolic pathway analysis is the calculation and analysis of the pathway distribution of a steady-state
metabolic network to gain insights into its structure, functionality and properties. The calculation
starts with the formation of the stoichiometric matrix representation of the network and delivers
the extreme pathways, which span the entire steady-state flux space, as the final result.
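
The steady-state condition behind this calculation can be illustrated with a small sketch. It assumes a hypothetical four-reaction toy network (rows are internal metabolites, columns are reactions) and computes a plain null-space basis of the stoichiometric matrix S, that is, flux vectors v with S v = 0. This is not the extreme-pathway algorithm itself: extreme pathways additionally require non-negative fluxes and conic independence, which this sketch does not enforce.

```python
from fractions import Fraction


def is_steady_state(S, v):
    """True if flux vector v leaves every internal metabolite balanced (S v = 0)."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)


def null_space(S):
    """Basis of {v : S v = 0}, found by Gauss-Jordan elimination over rationals."""
    rows = [[Fraction(x) for x in row] for row in S]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free_cols = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free_cols:
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][fc]
        basis.append(v)
    return basis


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->, R4: A ->
S = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]
basis = null_space(S)
```

For this toy network the two basis vectors happen to be non-negative and correspond to the routes R1-R2-R3 and R1-R4; in general a null-space basis need not have that property.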
Minimal knockout sets are sets of reactions that need to be removed in order to disable the function
of a certain target reaction; this means that there may not be any extreme pathway containing this

The emergence and evolution of properties on the network level is another
intriguing question. Here we apply our metabolic pathway analysis
tool to our simulation for the evolution
of metabolisms. In this process the tool is used to define a realistic selection
criterion, optimal metabolic yield, for the
metabolisms, leading to networks with properties
as observed in their real-world counterparts
(e.g. scale-free degree distribution). Further, we use it to
analyze less well understood properties of the
resulting networks, such as robustness or modularity,
to make predictions about their origin
and development. It can also be used as a standalone
tool for the enumeration of elementary
modes and the subsequent computation of minimal
cut-sets, in a very memory-efficient way.
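
The relation between elementary modes and minimal cut-sets can be sketched as a minimal hitting set problem: a cut-set must remove at least one reaction from every mode that contains the target reaction. The example below is a brute-force illustration under that assumption, with hypothetical reaction names and modes; it makes no attempt at the memory efficiency mentioned above and is only practical for small networks.

```python
from itertools import combinations


def minimal_cut_sets(modes, target, max_size=None):
    """Enumerate minimal reaction sets that hit every mode containing `target`.

    `modes` is an iterable of sets of reaction names (hypothetical input format).
    """
    target_modes = [frozenset(m) for m in modes if target in m]
    if not target_modes:
        return []  # the target is already inactive, nothing to cut
    candidates = sorted(set().union(*target_modes))
    limit = len(candidates) if max_size is None else min(max_size, len(candidates))
    cut_sets = []
    # Growing the size step by step guarantees that every set kept is minimal,
    # as long as supersets of already found cut-sets are skipped.
    for size in range(1, limit + 1):
        for combo in combinations(candidates, size):
            cut = frozenset(combo)
            if any(found <= cut for found in cut_sets):
                continue
            if all(cut & mode for mode in target_modes):
                cut_sets.append(cut)
    return cut_sets


# Hypothetical elementary modes of a toy network.
modes = [
    {"R1", "R2", "R5"},
    {"R3", "R4", "R5"},
    {"R1", "R6"},
]
cuts = minimal_cut_sets(modes, target="R5")
```

Here the target itself is the only cut-set of size one; the remaining cut-sets pair one reaction from each of the two modes that carry R5.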

It has been known for many years that neutral mutations have a considerable influence on evolution in
molecular systems. The folding of RNA sequences into secondary structures, with its many-to-one property,
represents a mapping entailing considerable redundancy. Various extensive studies concerning RNA folding
in the context of neutral theory yielded insights into properties of the structure space and the mapping
itself. We intend to get a better understanding of some of these properties and especially of the evolution of
RNA molecules as well as their effect on the evolution of the entire molecular system.
We introduce a novel genotype-phenotype mapping based on the relation between an RNA sequence and its
secondary structure for use in evolutionary studies. The inspiration for this particular mapping emerged
from the modeling of RNA enzymes within our simulation framework for the evolution of metabolic reaction
networks. The genome contains a number of RNA genes which then give rise to RNA enzymes acting on metabolites and
thus shaping the metabolic network. The use of our mapping not only allows for a more realistic
study of the evolution of the entire system, but also enables us to observe the behavior of the enzymes themselves
and therefore possibly gain some insights about the evolution of catalytic molecules in general.
Besides using the mapping in several simulation runs which yielded realistic metabolic networks and
connectivities, we performed several statistical tests commonly used in neutral theory, such as the number
of visited phenotypes and the average discovery rate during a random neutral walk. We compared the mapping with
results of approaches using cellular automata, random Boolean networks and other mappings based on
RNA folding. It exceeds all non-RNA mappings in extent and connectivity of the underlying neutral network.
Further, it has a significantly higher evolvability and innovation rate than the other mappings. Especially interesting is
the highly innovative starting phase in RNA-based mappings.
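
The neutral-walk statistics named above (number of visited phenotypes, average discovery rate) can be sketched as follows. The genotype-phenotype map used here, `toy_phenotype`, is a hypothetical stand-in with a many-to-one property; it is not the RNA-based mapping of this work, and a real study would substitute an RNA folding routine for it.

```python
import random

ALPHABET = "ACGU"


def toy_phenotype(seq, block=4):
    """Hypothetical many-to-one map: one flag per block, set if the block is GC-rich."""
    return tuple(
        sum(ch in "GC" for ch in seq[i:i + block]) * 2 >= block
        for i in range(0, len(seq), block)
    )


def one_mutant_neighbours(seq):
    """Yield every sequence that differs from `seq` by a single point mutation."""
    for pos, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:pos] + new + seq[pos + 1:]


def neutral_walk(start, steps, phenotype=toy_phenotype, seed=0):
    """Random neutral walk; returns the final genotype and the cumulative
    number of distinct non-target phenotypes seen in the one-mutant
    neighbourhood after each step."""
    rng = random.Random(seed)
    target = phenotype(start)
    current = start
    seen = {target}
    discovered = []
    for _ in range(steps):
        neutral = []
        for neighbour in one_mutant_neighbours(current):
            ph = phenotype(neighbour)
            if ph == target:
                neutral.append(neighbour)  # stays on the neutral network
            else:
                seen.add(ph)               # phenotype reachable by one mutation
        discovered.append(len(seen) - 1)
        if neutral:
            current = rng.choice(neutral)
    return current, discovered


final, discovered = neutral_walk("GCGCAUAUGGCCAAUU", steps=50)
average_discovery_rate = discovered[-1] / len(discovered)
```

Plotting `discovered` against the step index gives the kind of discovery curve from which an early innovative phase would be visible, if the mapping has one.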

Predicting the function of proteins is an important problem in computational
biology, and many approaches have been introduced to solve it.
However, no single approach addresses all aspects of protein function
prediction. The most promising clue for inferring function seems to be protein
packing motifs: patterns of only a few amino acids which are spatially close
together and possess similar biochemical and geometrical properties. Our
approach finds these packing motifs by mining closed frequent subgraphs
from a set of protein graphs. The key to finding good motifs is the use of
optimal labels for nodes and edges of the protein graphs; we therefore introduce a
novel way to assign these labels based on credible biological characteristics
of such motifs. The use of feature selection allows the approach to be extended
with discriminative features other than our motifs. Possible additional
features could come from other function prediction approaches. By assigning
feature vectors to protein graphs we have the possibility to use our approach
to develop a protein index for structural protein databases, in particular the
PDB database, since we extract all information for our approach solely from

A packing motif is a pattern representing a cluster of residues that are close in
space, with no restriction on their sequence distance or order, and that occurs multiple
times in a set of proteins. Because of this definition, such motifs can hardly be found by
sequence analysis; our approach therefore uses structural information.
It is known that protein packing motifs can provide strong evidence for biochemical
function; for example, the function of the HIV virus was identified through a characteristic
motif of this functional family.
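
To illustrate what "close in space with no restriction in sequence" means for a protein graph, the sketch below builds a labelled residue contact graph from 3D coordinates. The input format, the 6.5 Å cutoff and the "near"/"far" edge labels are assumptions made for this example only; they are not the labelling scheme proposed in this work.

```python
import math
from itertools import combinations


def build_protein_graph(residues, cutoff=6.5):
    """Build a labelled contact graph.

    `residues` is a list of (residue_name, (x, y, z)) pairs, e.g. one
    representative atom per residue (hypothetical input format).
    Nodes are labelled with the residue type; an edge joins two residues
    whose distance is at most `cutoff`, regardless of sequence separation.
    """
    nodes = {i: name for i, (name, _) in enumerate(residues)}
    edges = {}
    for (i, (_, a)), (j, (_, b)) in combinations(enumerate(residues), 2):
        d = math.dist(a, b)
        if d <= cutoff:
            # Coarse distance bin used as edge label (illustrative choice).
            edges[(i, j)] = "near" if d <= cutoff / 2 else "far"
    return nodes, edges


# Hypothetical coordinates in angstroms.
residues = [
    ("HIS", (0.0, 0.0, 0.0)),
    ("ASP", (3.0, 0.0, 0.0)),
    ("SER", (0.0, 6.0, 0.0)),
    ("GLY", (30.0, 30.0, 30.0)),
]
nodes, edges = build_protein_graph(residues)
```

Graphs of this form, built for a whole set of proteins, are the kind of input a frequent subgraph miner would search for recurring labelled patterns.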

One important aspect of computational systems biology is the
identification and analysis of functional response networks within
large biochemical networks. These functional response networks
represent the response of a biological system under a particular
experimental condition which can be used to pinpoint critical
For this purpose, we have developed a novel algorithm to calculate
response networks as scored/weighted subgraphs spanned by k-shortest
simple (loop-free) paths. The k-shortest simple path algorithm is
based on a forward/backward chaining approach synchronized between
pairs of processors. The algorithm
scales linearly with the number of processors used. The
implementation runs on a Linux cluster platform and uses LAM/MPI and mpiJava
for messaging, with the application itself written in Java.
The algorithm is applied to a hybrid human network consisting of 45,041
nodes and 438,567 interactions together with gene expression
information obtained from human cell lines infected by influenza
virus. The resulting response networks show the early innate immune response
and virus-triggered processes within human epithelial cells. Especially
under the imminent threat of a pandemic caused by novel influenza strains,
such as the current H1N1 strain, these analyses are crucial for a
comprehensive understanding of molecular processes during early phases
of infection. Such a systems level understanding may aid in the
identification of therapeutic markers and in drug development for
diagnosis and finally prevention of a potentially dangerous disease.
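
For readers unfamiliar with k-shortest simple paths, the notion can be sketched with a sequential best-first search. This is deliberately not the parallel forward/backward chaining algorithm described above (which runs in Java over MPI); it is a simple single-process illustration that assumes non-negative edge weights and a hypothetical adjacency-dictionary graph format, and it can take exponential time on large networks.

```python
import heapq


def k_shortest_simple_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target, cheapest first.

    `graph` maps each node to a dict {neighbour: non-negative weight}.
    Partial paths are expanded in order of accumulated cost, so complete
    paths are reported in non-decreasing cost order.
    """
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in path:  # keep the path simple (loop-free)
                heapq.heappush(heap, (cost + weight, path + [neighbour]))
    return found


# Hypothetical weighted interaction network.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},
}
paths = k_shortest_simple_paths(graph, "A", "D", k=2)
# paths == [(3.0, ["A", "B", "C", "D"]), (5.0, ["A", "C", "D"])]
```

A response network in the sense used above would then be the subgraph formed by the union of the nodes and edges on the returned paths, with scores derived from the path costs.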