The computation of reliable, chemically correct atom maps from educt/product pairs has turned out to be a difficult problem in cheminformatics because the chemically correct solution is not necessarily an optimal solution for combinatorial formulations such as maximum common subgraph problems. As a consequence, competing models have been devised and compared in extensive benchmarking studies. Due to isomorphisms among products and educts it is not immediately obvious, however, when two atom maps for a given educt/product pair are the same. We formalize here the equivalence of atom maps and show that equivalence of atom maps is in turn equivalent to the isomorphism of labeled auxiliary graphs. In particular, we demonstrate that Fujita's Imaginary Transition State can be used for this purpose. Numerical experiments show the practical feasibility of this approach. Generalizations to the equivalence of subgraph matches, double pushout graph transformation rules, and mechanisms of multi-step reactions are discussed briefly.
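The equivalence test described above can be illustrated with a minimal sketch. It assumes toy molecules given as dictionaries of bonds, atom maps given as educt-to-product index dictionaries that preserve atom types, and a simplified Imaginary Transition State (ITS) graph whose edges carry the pair (educt bond order, product bond order). The function names `its_graph`, `isomorphic`, and `equivalent_maps` are hypothetical, and the brute-force isomorphism search is only suitable for very small examples; it is not the implementation used in the study.

```python
from itertools import permutations


def its_graph(educt_bonds, product_bonds, atom_map):
    """Overlay educt and product bonds on the educt atom indices.

    Returns {frozenset({u, v}): (educt_order, product_order)}; 0 means "no bond".
    """
    inverse = {p: e for e, p in atom_map.items()}
    its = {}
    for (u, v), order in educt_bonds.items():
        its[frozenset((u, v))] = (order, 0)
    for (x, y), order in product_bonds.items():
        key = frozenset((inverse[x], inverse[y]))
        its[key] = (its.get(key, (0, 0))[0], order)
    return its


def isomorphic(labels, its_a, its_b):
    """Brute-force search for an isomorphism preserving atom and edge labels."""
    atoms = sorted(labels)
    for image in permutations(atoms):
        perm = dict(zip(atoms, image))
        if any(labels[a] != labels[perm[a]] for a in atoms):
            continue  # this permutation changes an atom type
        relabeled = {frozenset(perm[a] for a in edge): lab
                     for edge, lab in its_a.items()}
        if relabeled == its_b:
            return True
    return False


def equivalent_maps(labels, educt_bonds, product_bonds, map_a, map_b):
    """Two atom maps count as equivalent if their ITS graphs are isomorphic."""
    return isomorphic(labels,
                      its_graph(educt_bonds, product_bonds, map_a),
                      its_graph(educt_bonds, product_bonds, map_b))


# Toy reaction 1: H-O-H loses one hydrogen; the two hydrogens are symmetric.
water_labels = {0: "O", 1: "H", 2: "H"}
water_educt = {(0, 1): 1, (0, 2): 1}
water_product = {(0, 1): 1}
water_equiv = equivalent_maps(water_labels, water_educt, water_product,
                              {0: 0, 1: 1, 2: 2}, {0: 0, 1: 2, 2: 1})

# Toy reaction 2: H sits on one O in the educt and on the other O in the product.
acid_labels = {0: "C", 1: "O", 2: "O", 3: "H"}
acid_educt = {(0, 1): 1, (0, 2): 1, (1, 3): 1}
acid_product = {(0, 1): 1, (0, 2): 1, (2, 3): 1}
acid_equiv = equivalent_maps(acid_labels, acid_educt, acid_product,
                             {0: 0, 1: 1, 2: 2, 3: 3},   # proton transfer
                             {0: 0, 1: 2, 2: 1, 3: 3})   # nothing changes

print(water_equiv, acid_equiv)
```

In the first toy reaction the two maps differ only by swapping symmetric hydrogens, so their ITS graphs are isomorphic. In the second, one map describes a proton transfer and the other leaves every bond unchanged, so the ITS graphs differ. Charges, stereochemistry, and other atom or bond attributes are ignored in this sketch.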
Circular RNAs (circRNAs) are a regulatory RNA class. While cancer-driving functions have been identified for single circRNAs, how they modulate gene expression in cancer is not well understood. We investigate circRNA expression in the pediatric malignancy, neuroblastoma, through deep whole-transcriptome sequencing in 104 primary neuroblastomas covering all risk groups. We demonstrate that MYCN amplification, which defines a subset of high-risk cases, causes globally suppressed circRNA biogenesis directly dependent on the DHX9 RNA helicase. We detect similar mechanisms in shaping circRNA expression in the pediatric cancer medulloblastoma, implying a general MYCN effect. Comparisons to other cancers identify 25 circRNAs that are specifically upregulated in neuroblastoma, including circARID1A. Transcribed from the ARID1A tumor suppressor gene, circARID1A promotes cell growth and survival, mediated by direct interaction with the KHSRP RNA-binding protein. Our study highlights the importance of MYCN regulating circRNAs in cancer and identifies molecular mechanisms that explain their contribution to neuroblastoma pathogenesis.
Structural analysis of RNA is an important and versatile tool to investigate the function of this type of molecule in the cell as well as in vitro. Several robust and reliable procedures are available, relying on chemical modification inducing RT stops or nucleotide misincorporations during reverse transcription. Others are based on cleavage reactions and RT stop signals. However, these methods address only one side of the RT stop or misincorporation position. Here, we describe Led-Seq, a new approach based on lead-induced cleavage of unpaired RNA positions, where both resulting cleavage products are investigated. The RNA fragments carrying 2′,3′-cyclic phosphate or 5′-OH ends are selectively ligated to oligonucleotide adapters by specific RNA ligases. In a deep sequencing analysis, the cleavage sites are identified as ligation positions, avoiding possible false positive signals based on premature RT stops. With a benchmark set of transcripts in Escherichia coli, we show that Led-Seq is an improved and reliable approach based on metal ion-induced phosphodiester hydrolysis to investigate RNA structures in vivo.
Background: Evolutionary scenarios describing the evolution of a family of genes within a collection of species comprise the mapping of the vertices of a gene tree T to vertices and edges of a species tree S. The relative timing of the last common ancestors of two extant genes (leaves of T) and the last common ancestors of the two species (leaves of S) in which they reside is indicative of horizontal gene transfers (HGT) and ancient duplications. Orthologous gene pairs, on the other hand, require that their last common ancestor coincides with a corresponding speciation event. The relative timing information of gene and species divergences is captured by three colored graphs that have the extant genes as vertices and the species in which the genes are found as vertex colors: the equal-divergence-time (EDT) graph, the later-divergence-time (LDT) graph and the prior-divergence-time (PDT) graph, which together form an edge partition of the complete graph. Results: Here we give a complete characterization in terms of informative and forbidden triples that can be read off the three graphs and provide a polynomial-time algorithm for constructing an evolutionary scenario that explains the graphs, provided such a scenario exists. While both LDT and PDT graphs are cographs, this is not true for the EDT graph in general. We show that every EDT graph is perfect. While the information about LDT and PDT graphs is necessary to recognize EDT graphs in polynomial time for general scenarios, this extra information can be dropped in the HGT-free case. However, recognition of EDT graphs without knowledge of putative LDT and PDT graphs is NP-complete for general scenarios. In contrast, PDT graphs can be recognized in polynomial time. We finally connect the EDT graph to the alternative definitions of orthology that have been proposed for scenarios with horizontal gene transfer. With one exception, the corresponding graphs are shown to be colored cographs.
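Two of the graph properties mentioned above are easy to check on small inputs: whether three edge sets form an edge partition of the complete graph, and whether a graph is a cograph, i.e. contains no induced path on four vertices (P4). The sketch below is a brute-force illustration under stated assumptions. The helper names `is_cograph` and `is_edge_partition` are hypothetical, the toy edge sets merely stand in for LDT, PDT, and EDT graphs and were not derived from an actual evolutionary scenario, and the code does not implement the triple-based characterization or the polynomial-time algorithm of the paper.

```python
from itertools import combinations, permutations


def is_cograph(vertices, edges):
    """Return True if the graph contains no induced path on four vertices."""
    edge_set = {frozenset(e) for e in edges}
    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            path = all(frozenset(p) in edge_set
                       for p in ((a, b), (b, c), (c, d)))
            chord = any(frozenset(p) in edge_set
                        for p in ((a, c), (a, d), (b, d)))
            if path and not chord:
                return False  # a-b-c-d is an induced P4
    return True


def is_edge_partition(vertices, *graphs):
    """Check that the edge sets are pairwise disjoint and cover all vertex pairs."""
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    seen = set()
    for edges in graphs:
        edge_set = {frozenset(e) for e in edges}
        if seen & edge_set:
            return False  # some pair appears in two graphs
        seen |= edge_set
    return seen == all_pairs


# Toy data on four genes (hypothetical, for illustration only).
genes = ["a", "b", "c", "d"]
toy_ldt = [("a", "b")]
toy_pdt = [("c", "d")]
toy_edt = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]

partition_ok = is_edge_partition(genes, toy_ldt, toy_pdt, toy_edt)
all_cographs = all(is_cograph(genes, g) for g in (toy_ldt, toy_pdt, toy_edt))

# A path a-b-c-d is the smallest graph that is not a cograph.
p4_is_cograph = is_cograph(genes, [("a", "b"), ("b", "c"), ("c", "d")])

print(partition_ok, all_cographs, p4_is_cograph)
```

The P4 search enumerates all orderings of every four-vertex subset, so it scales poorly and serves only to make the definition concrete; linear-time cograph recognition algorithms exist for practical use.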
Background: RNA features a highly negatively charged phosphate backbone that attracts a cloud of counter-ions that reduce the electrostatic repulsion in a concentration-dependent manner. Ion concentrations thus have a large influence on folding and stability of RNA structures. Despite their well-documented effects, salt effects are not handled by currently available secondary structure prediction algorithms. Combining Debye-Hückel potentials for line charges and Manning’s counter-ion condensation theory, Einert et al. [Biophys. J. 100: 2745-2753 (2011)] modeled the energetic contributions of monovalent cations on loops and helices. Results: The model of Einert et al. is adapted to match the structure of the dynamic programming recursion of RNA secondary structure prediction algorithms. An empirical term describing the salt dependence of the duplex initiation energy is added to improve co-folding predictions for two or more RNA strands. The slightly modified model is implemented in the ViennaRNA package in such a way that only the energy parameters, but not the algorithmic structure, are affected. A comparison with data from the literature shows that predicted free energies and melting temperatures are in reasonable agreement with experiments. Conclusion: The new feature in the ViennaRNA package makes it possible to study effects of salt concentrations on RNA folding in a systematic manner. Strictly speaking, the model pertains only to monovalent cations, and thus covers the most important parameter, i.e., the NaCl concentration. It remains a question for future research to what extent unspecific effects of bi- and tri-valent cations can be approximated in a similar manner. Availability: Corrections for the concentration of monovalent cations are available in the ViennaRNA package starting from version 2.6.0.
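In Debye-Hückel theory, the range of electrostatic screening is set by the Debye length, which shrinks with the square root of the ionic strength. The sketch below evaluates the textbook formula for a 1:1 salt such as NaCl to give a feel for the length scales involved. It is not the energy model of Einert et al. and not the ViennaRNA implementation; the function name `debye_length_nm` is hypothetical, and the relative permittivity of water is held fixed at an assumed value of 78.4 (roughly 25 °C) instead of being treated as temperature dependent.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
NA = 6.02214076e23           # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19   # elementary charge, C


def debye_length_nm(salt_molar, temperature_k=298.15, eps_r=78.4):
    """Debye screening length in nm for a monovalent (1:1) salt.

    salt_molar is the salt concentration in mol/L; for a 1:1 salt the
    ionic strength equals the concentration. eps_r is an assumed constant.
    """
    ionic_strength = salt_molar * 1000.0  # mol/L -> mol/m^3
    lam_sq = (eps_r * EPS0 * KB * temperature_k
              / (2.0 * NA * E_CHARGE ** 2 * ionic_strength))
    return math.sqrt(lam_sq) * 1e9  # m -> nm


for conc in (0.01, 0.1, 1.0):
    print(f"{conc:5.2f} M  ->  {debye_length_nm(conc):.2f} nm")
```

At 0.1 M the screening length is a little under 1 nm, and a fourfold increase in concentration halves it, which illustrates why backbone repulsion, and with it the stability of RNA structures, depends on salt concentration.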