Bioinformatics Preprint 07-006
The transient probability distribution of M/M/∞: a stochastic model for transcription factor binding site evolution
Günter P. Wagner, Wolfgang Otto, Vincent Lynch, Peter F. Stadler
Both experimental and sequence evolution evidence suggest that transcription factor binding sites can undergo divergence and turnover even when the transcriptional output remains conserved. Furthermore, it is likely that lineage-specific differences exist in the retention rate of binding sites, which makes it desirable to estimate the rates of acquisition and decay of transcription factor binding sites from comparative sequence data. In this paper we propose a stochastic, phenomenological model for binding site turnover. For a given genomic region we assume a constant rate of binding site origination λ and a constant per-site decay rate μ. We derive an explicit expression for the conditional probability distribution of the number of binding sites n at time t, given n0 binding sites at t=0. The analytical result was compared to a simulation model, and we found that it closely predicts the simulated sequence evolution. We then analyzed a small data set of the number of estrogen response elements (EREs) in mammalian HoxA sequences and showed that the data are broadly consistent with the assumption of a stationary turnover process. A regression of the number of shared EREs on the time since divergence led to an estimate of the half-life of an ERE in the primate HoxA clusters of about 27 myr, which corresponds to a per-site decay rate of 1.3*10^-9/year and a rate of origination of 1.6*10^-7/year. We conclude that the model can be used to estimate the rate of binding site turnover from comparative genomic data.
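To make the model concrete, the following is a minimal Python sketch of the transient distribution of the M/M/∞ process. It uses the standard textbook decomposition, which may differ in form from the expression derived in the paper: each of the n0 initial sites survives to time t independently with probability exp(-μt), while the sites that originated after t=0 and are still present are Poisson distributed with mean (λ/μ)(1 - exp(-μt)). The function names and the parameter values in the usage lines are illustrative assumptions and are not taken from the paper.

```python
import math


def transient_pmf(n, n0, lam, mu, t):
    """P(N(t) = n | N(0) = n0) for constant origination rate lam
    and constant per-site decay rate mu (standard M/M/infinity result)."""
    p = math.exp(-mu * t)           # survival probability of one initial site
    rho = (lam / mu) * (1.0 - p)    # mean number of surviving new sites
    total = 0.0
    # Sum over k, the number of initial sites still present at time t.
    for k in range(min(n, n0) + 1):
        binom = math.comb(n0, k) * p**k * (1.0 - p) ** (n0 - k)
        m = n - k                   # sites that originated after t = 0
        pois = math.exp(-rho) * rho**m / math.factorial(m)
        total += binom * pois
    return total


def transient_mean(n0, lam, mu, t):
    """Expected number of sites at time t, given n0 sites at t = 0."""
    p = math.exp(-mu * t)
    return n0 * p + (lam / mu) * (1.0 - p)


# Illustrative (hypothetical) parameters, in arbitrary time units.
lam, mu, n0, t = 0.5, 0.1, 3, 2.0
pmf = [transient_pmf(n, n0, lam, mu, t) for n in range(60)]
print(sum(pmf))                           # should be very close to 1
print(transient_mean(n0, lam, mu, t))     # mean implied by the same model
```

As t grows, the binomial part vanishes and the distribution approaches the stationary Poisson distribution with mean λ/μ, which is the sense in which a stationary turnover process can be assumed.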
Last modified: 2006-08-09 15:54:23 xtof