Improved Annotation of Protein-Coding Genes Boundaries in Metazoan Mitochondrial Genomes

Donath, Alexander and Jühling, Frank and Al-Arab, Marwa and Bernhart, Stephan H. and Reinhardt, Franziska and Stadler, Peter F. and Middendorf, Martin and Bernt, Matthias


PREPRINT 19-028:
Nucleic Acids Res. 47: 10543-10552


With the rapid increase in the number of sequenced metazoan mitochondrial genomes, detailed manual annotation is becoming increasingly infeasible. While it is easy to identify the approximate location of protein-coding genes within mitogenomes, the peculiar processing of mitochondrial transcripts makes the determination of precise gene boundaries a surprisingly difficult problem. We have analyzed the properties of annotated start and stop codon positions in detail and use the inferred patterns to devise a new method for predicting gene boundaries in de novo annotations. Our method benefits from empirically observed prevalences of start/stop codons and gene lengths, and considers the dependence of these features on variations of the genetic code. Although not perfect, our new approach yields a drastic improvement in the accuracy of gene boundaries and upgrades the mitochondrial genome annotation server MITOS to an even more sophisticated tool for fully automatic annotation of metazoan mitochondrial genomes.