Published on Mon Sep 06 2021

Gorin, G., Vastola, J. J., Fang, M., Pachter, L.

To what extent do cell-to-cell differences in transcription rate affect RNA copy number distributions? We argue that successfully answering such questions requires quantitative models that are both interpretable and tractable. Such models enable the identification of experiments which best discriminate between competing hypotheses.

To what extent do cell-to-cell differences in transcription rate affect RNA copy number distributions, and what can this variation tell us about biological processes underlying transcription? We argue that successfully answering such questions requires quantitative models that are both interpretable (describing concrete biophysical phenomena) and tractable (amenable to mathematical analysis); in particular, such models enable the identification of experiments which best discriminate between competing hypotheses. As a proof of principle, we introduce a simple but flexible class of models involving a stochastic transcription rate (governed by a stochastic differential equation) coupled to a discrete stochastic RNA transcription and splicing process, and compare and contrast two biologically plausible hypotheses about observed transcription rate variation. One hypothesis assumes transcription rate variation is due to DNA experiencing mechanical strain and relaxation, while the other assumes that variation is due to fluctuations in the number of an abundant regulator. Through a thorough mathematical analysis, we show that these two models are challenging to distinguish: properties like first- and second-order moments, autocorrelations, and several limiting distributions are shared. However, our analysis also points to the experiments which best discriminate between them. Our work illustrates the importance of theory-guided data collection in general, and of multimodal single-molecule data in particular, for distinguishing between competing hypotheses. We use this theoretical case study to introduce and motivate a general framework for constructing and solving such nontrivial continuous-discrete models.
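To make the model class concrete, the following is a minimal simulation sketch of a continuous transcription rate driving a discrete nascent/mature RNA system. It is an illustration under stated assumptions, not the analysis described above: the abstract does not specify the stochastic differential equation, so a mean-reverting square-root diffusion is assumed here, and the function name `simulate_cells`, the parameter names (`kappa`, `mu`, `sigma`, `beta`, `gamma`), and their default values are all hypothetical. The fixed-step scheme (Euler-Maruyama for the rate, Poisson and binomial draws for the discrete events) is an approximation chosen for brevity.

```python
import numpy as np


def simulate_cells(n_cells=1000, kappa=1.0, mu=10.0, sigma=1.0,
                   beta=2.0, gamma=1.0, t_end=10.0, dt=0.01, seed=0):
    """Simulate independent cells with a stochastic transcription rate.

    Assumed (illustrative) rate dynamics, not taken from the source:
        dK = kappa * (mu - K) dt + sigma * sqrt(K) dW
    Discrete events: transcription at rate K, splicing of each nascent
    RNA at rate beta, degradation of each mature RNA at rate gamma.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_end / dt))

    k = np.full(n_cells, mu, dtype=float)       # start at the mean rate
    nascent = np.zeros(n_cells, dtype=np.int64)
    mature = np.zeros(n_cells, dtype=np.int64)

    # Per-molecule probability of splicing / degrading within one step.
    p_splice = 1.0 - np.exp(-beta * dt)
    p_degrade = 1.0 - np.exp(-gamma * dt)

    for _ in range(n_steps):
        # Discrete part: the rate is held fixed over the short step.
        born = rng.poisson(k * dt)
        spliced = rng.binomial(nascent, p_splice)
        degraded = rng.binomial(mature, p_degrade)
        nascent += born - spliced
        mature += spliced - degraded

        # Continuous part: Euler-Maruyama step, truncated at zero so the
        # square root and the Poisson intensity stay well defined.
        dw = rng.normal(0.0, np.sqrt(dt), size=n_cells)
        k = np.maximum(k + kappa * (mu - k) * dt + sigma * np.sqrt(k) * dw, 0.0)

    return k, nascent, mature


if __name__ == "__main__":
    k, nascent, mature = simulate_cells()
    print("mean rate:", k.mean())
    print("mean nascent, mature:", nascent.mean(), mature.mean())
    print("nascent-mature covariance:", np.cov(nascent, mature)[0, 1])
```

Because the discrete system is linear in the rate, the long-time mean copy numbers depend only on the mean rate (approximately `mu / beta` nascent and `mu / gamma` mature molecules here), whatever the form of the rate noise. This is one way to see why low-order summaries may fail to separate competing noise hypotheses, and why joint nascent/mature measurements are the more informative output to inspect.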