Bioinformatics Preprint 04-009
Divergence of Conserved Non-Coding Sequences: Rate Estimates and Relative Rate Tests
Paulo R. A. Campos, Viviane M. de Oliveira, Günter P. Wagner, Peter F. Stadler
Submitted for publication in:
The study of gene families depends critically on the correct reconstruction of gene genealogies, for instance in the case of transcription factor gene families such as the Hox and Dlx genes. Proteins belonging to the same family are likely to share some of the same protein interaction partners and may thus face a similar selective environment. This common selective environment can induce co-evolutionary pressures and can thus give rise to correlated rates and patterns of evolution among members of a gene family. In this study we simulate the evolution of a family of sequences that share a set of interaction partners. Depending on the amount of sequence dedicated to protein-protein interaction and on the relative rate parameters of sequence evolution, three outcomes are possible. If the fraction of the sequence dedicated to interaction with common co-factors is low and the time since divergence is short, the trees based on sequence information tend to be correct. If the time since gene duplication is long, two outcomes are observed in our simulations. If the rate of evolution of the interaction partner is small compared to that of the focal protein family, the reconstructed trees tend towards star phylogenies. As the rate of evolution of the interaction partner approaches that of the focal protein family, the reconstructed phylogenies tend to be incorrectly resolved. We conclude that the genealogies of gene families can be hard to estimate, in particular if the proteins interact with a conserved set of binding partners, as is likely the case for transcription factors.
Keywords: Gene phylogeny, tree reconstruction, correlated substitutions
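The kind of simulation the abstract describes can be illustrated with a toy sketch. This is not the authors' model. The abstract does not specify it, so every name, parameter, and rule below is a hypothetical choice: three gene copies with true tree ((A,B),C), an interface region of `n_int` sites that must match a shared partner sequence, a selection rule that freezes interface sites while they match the partner, and a per-step probability `partner_rate` that the partner itself changes.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def hamming(a, b):
    """Number of differing sites between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def propose(seq, partner, n_int, rng):
    """Apply one point mutation to seq in place, subject to toy selection."""
    pos = rng.randrange(len(seq))
    # Assumed rule: an interface site that currently matches the partner
    # is frozen; mismatched interface sites and free sites mutate freely.
    if pos < n_int and seq[pos] == partner[pos]:
        return
    seq[pos] = rng.choice([c for c in ALPHABET if c != seq[pos]])


def simulate(length=60, n_int=20, t_old=50, t_young=50,
             partner_rate=0.0, seed=0):
    """Evolve three gene copies with true tree ((A,B),C).

    The first n_int sites of each copy form the (hypothetical) interface
    with one shared partner, which mutates with probability partner_rate
    per time step.
    """
    rng = random.Random(seed)
    partner = [rng.choice(ALPHABET) for _ in range(n_int)]
    # The ancestor starts with a perfectly matching interface.
    ancestor = partner + [rng.choice(ALPHABET) for _ in range(length - n_int)]

    def tick(lineages):
        if n_int and rng.random() < partner_rate:
            pos = rng.randrange(n_int)
            partner[pos] = rng.choice(
                [c for c in ALPHABET if c != partner[pos]])
        for seq in lineages:
            propose(seq, partner, n_int, rng)

    c, ab = list(ancestor), list(ancestor)  # first duplication
    for _ in range(t_old):
        tick([c, ab])
    a, b = list(ab), list(ab)               # second duplication
    for _ in range(t_young):
        tick([a, b, c])
    return {"A": a, "B": b, "C": c}


def distances(seqs):
    """Pairwise Hamming distances keyed by (name, name)."""
    names = sorted(seqs)
    return {(p, q): hamming(seqs[p], seqs[q]) for p in names for q in names}
```

Under these assumptions, a distance-based method would recover the true tree when the A-B distance is smaller than both the A-C and B-C distances. Varying `n_int`, `t_old`, `t_young`, and `partner_rate` gives a rough way to explore how shared constraints on the interface might blur that signal.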
Last modified: 2004-03-28 19:56:33 studla