MAP5, CNRS, Université Paris Cité
2025-10-17
Introduction to phylogeography
Practical
Questions
Source: National Human Genome Research Institute
PCR = ?
Covid nasopharyngeal swab (in French: Prélèvement Covid Rhino-pharyngé)?
Polymerase Chain Reaction !
Source: Wikipedia
AACUUUUGCGCGCGGGGAAAAAAAGCCCCAAAAUUUU
\(\qquad\qquad\downarrow\) AACUUUUGCGCGCGGGGAAAAAAAGCCCCAAAAUUUU
\(\qquad\qquad\downarrow\) AACUUUUGCGCGCGGGGAAUAAAAGCCCCAAAAUUUU
\(\qquad\qquad\downarrow\) AACUUUUGCGCGCGGGGAAUAAAAGCCCCAAAAUUGU
\(\qquad\qquad\downarrow\) AACUUUUGCGCGCGGGGAAUAAAGGCCCCAAAAUUGU
\(\qquad\qquad\downarrow\)
\(\qquad\qquad\cdots\)
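The chain of sequences above can be imitated with a few lines of Python. This is only a sketch: it assumes that each step changes one site chosen uniformly at random to a different nucleotide chosen uniformly at random. The slides do not specify a substitution model, and the names `mutate_once` and `hamming` are made up for the illustration.

```python
import random

ALPHABET = "ACGU"  # RNA nucleotides, as in the sequence above

def mutate_once(seq, rng):
    """Return a copy of seq with exactly one site changed to a different nucleotide."""
    pos = rng.randrange(len(seq))
    # Assumption: the new base is drawn uniformly among the three other bases.
    new_base = rng.choice([b for b in ALPHABET if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]

def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(0)
ancestor = "AACUUUUGCGCGCGGGGAAAAAAAGCCCCAAAAUUUU"
lineage = [ancestor]
for _ in range(3):
    lineage.append(mutate_once(lineage[-1], rng))

for seq in lineage:
    print(seq)
```

Two consecutive sequences always differ at exactly one site, but the ancestor and its last descendant can differ at fewer sites than the number of steps, since a site can be hit twice.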
\[ \frac{1}{\sqrt{2}}\mathbf{Z}_i = \begin{cases} (\;\,0, \;\,1) & p = 1/4\\ (\;\,0, -1) & p = 1/4\\ (\;\,1, \;\,0) & p = 1/4\\ (-1, \;\,0) & p = 1/4\\ \end{cases} \]
\[ \mathbf{Z}_i \text{ i.i.d. } \quad \mathbf{E}[\mathbf{Z}] = \mathbf{0}_2 \quad \mathbf{V}[\mathbf{Z}] = \mathbf{I}_2 \]
\[ \mathbf{S}_n = \sum_{i=1}^n \mathbf{Z}_i \]
\[ \mathbf{W}^{(n)}_t = \frac{1}{\sqrt{n}} \mathbf{S}_{\lfloor tn \rfloor} \implies \mathbf{W}_t \]
The rescaled random walk \(\mathbf{W}^{(n)}\) converges weakly to Brownian motion \(\mathbf{W}\) as \(n\to \infty\) (Donsker's theorem).
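A quick numerical check of this scaling, as a sketch with NumPy: many copies of \(\mathbf{W}^{(n)}_1\) are simulated, and their empirical mean and covariance are compared with \(\mathbf{0}_2\) and \(\mathbf{I}_2\). The function name `rescaled_walk_at` and the values of `n` and `n_rep` are arbitrary choices for the illustration.

```python
import numpy as np

# The four possible values of Z_i: sqrt(2) times a unit move along one axis.
STEPS = np.sqrt(2.0) * np.array([[0, 1], [0, -1], [1, 0], [-1, 0]], dtype=float)

def rescaled_walk_at(t, n, n_rep, rng):
    """Sample n_rep independent copies of W^(n)_t = S_{floor(t n)} / sqrt(n)."""
    k = int(np.floor(t * n))
    idx = rng.integers(0, 4, size=(n_rep, k))   # one step index per (replicate, time)
    return STEPS[idx].sum(axis=1) / np.sqrt(n)  # shape (n_rep, 2)

rng = np.random.default_rng(1)
W1 = rescaled_walk_at(t=1.0, n=400, n_rep=5000, rng=rng)
emp_mean = W1.mean(axis=0)
emp_cov = np.cov(W1, rowvar=False)
print(emp_mean)
print(emp_cov)
```

The empirical covariance should be close to the identity, which is the covariance of \(\mathbf{W}_1\).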
Brownian Motion:
\[ X_0 = \mu; \qquad d X_t = \sigma d W_t \]
\[ \mathbf{X}_t \sim \mathcal{N}(\mathbf{X}_0, t \mathbf{\Sigma}) \]
Random trajectories
Distribution of final points.
\[ \mathbf{X}_t \sim \mathcal{N}\left( \mathbf{X}_0, \begin{pmatrix} \sigma^2_x & \sigma^2_{xy} \\ \sigma^2_{xy} & \sigma^2_{y} \end{pmatrix} t\right) \]
\[ \mathbf{\Sigma} = \begin{pmatrix} \sigma^2_x & \sigma^2_{xy} \\ \sigma^2_{xy} & \sigma^2_{y} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Distribution of final points.
\[ \mathbf{\Sigma} = \begin{pmatrix} \sigma^2_x & \sigma^2_{xy} \\ \sigma^2_{xy} & \sigma^2_{y} \end{pmatrix} = \begin{pmatrix} 0.1 & 0 \\ 0 & 1 \end{pmatrix} \]
Distribution of final points.
\[ \mathbf{\Sigma} = \begin{pmatrix} \sigma^2_x & \sigma^2_{xy} \\ \sigma^2_{xy} & \sigma^2_{y} \end{pmatrix} = \begin{pmatrix} 0.55 & 0.45 \\ 0.45 & 0.55 \end{pmatrix} \]
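The distribution of final points can be reproduced by sampling directly from \(\mathcal{N}(\mathbf{X}_0, t\mathbf{\Sigma})\). The sketch below uses a Cholesky factor of \(t\mathbf{\Sigma}\) and the correlated \(\mathbf{\Sigma}\) given above. The function name `bm_endpoints`, the starting point and the value of `t` are arbitrary choices for the illustration.

```python
import numpy as np

def bm_endpoints(x0, Sigma, t, n_rep, rng):
    """Draw n_rep end points X_t ~ N(x0, t * Sigma) of a 2D Brownian motion."""
    L = np.linalg.cholesky(t * np.asarray(Sigma))  # L @ L.T == t * Sigma
    eps = rng.standard_normal((n_rep, 2))          # i.i.d. standard normal draws
    return np.asarray(x0) + eps @ L.T              # each row has covariance L @ L.T

rng = np.random.default_rng(2)
Sigma = np.array([[0.55, 0.45],
                  [0.45, 0.55]])
ends = bm_endpoints(x0=[0.0, 0.0], Sigma=Sigma, t=2.0, n_rep=20000, rng=rng)
emp_cov = np.cov(ends, rowvar=False)
print(emp_cov)
```

With this \(\mathbf{\Sigma}\), the cloud of end points is stretched along the diagonal, since the two coordinates are positively correlated.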
Structure: \(X_i = X_{\text{pa}(i)} + \sigma \sqrt{t_{i}} \times \epsilon_i\), with \(\epsilon_i \sim \mathcal{N}(0, 1)\) iid, where \(\text{pa}(i)\) is the parent of node \(i\) and \(t_i\) the length of the branch leading to \(i\).
Variance of an internal node: \[ \mathbf{V}(X_9) = \mathbf{V}(X_8) + \sigma^2 t_9 = \sigma^2 t_8 + \sigma^2 t_9 = \sigma^2 V_{9} \]
Covariance of two tips whose most recent common ancestor is node 9: \[ \mathbf{C}(Y_4, Y_5) = \mathbf{V}(X_9) = \sigma^2 V_{9} = \sigma^2 V_{45} \]
Covariances: \(\mathbf{C}(X_i, X_j) = \sigma^2 V_{ij}\), where \(V_{ij}\) is the time from the root to the most recent common ancestor of \(i\) and \(j\)
Distribution: \(\mathbf{X} \sim \mathcal{N}(\mu\mathbf{1}_n, \sigma^2 \mathbf{V})\) \(\to\) Multivariate Gaussian
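The matrix \(\mathbf{V}\) can be built directly from the tree by summing the branch lengths shared by two root-to-node paths. The sketch below does this on a small hypothetical tree, which is not the one drawn on the slides. The dictionary layout and the names `TREE`, `path_to_root` and `shared_time` are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical rooted tree: node -> (parent, length of the branch leading to the node).
# "r" is the root, "u" an internal node, and "a", "b", "c" are the tips.
TREE = {
    "u": ("r", 1.0),
    "a": ("u", 0.5),
    "b": ("u", 0.5),
    "c": ("r", 1.5),
}

def path_to_root(node):
    """Nodes on the path from `node` up to, but excluding, the root."""
    path = []
    while node in TREE:
        path.append(node)
        node = TREE[node][0]
    return path

def shared_time(i, j):
    """V_ij: total length of the branches common to the root-to-i and root-to-j paths."""
    common = set(path_to_root(i)) & set(path_to_root(j))
    return sum(TREE[k][1] for k in common)

tips = ["a", "b", "c"]
V = np.array([[shared_time(i, j) for j in tips] for i in tips], dtype=float)
print(V)
```

Here tips `a` and `b` share the branch leading to `u`, so their covariance is \(\sigma^2 \times 1.0\), while `c` shares nothing with them and is independent of both.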
\[ \mathbf{Y} \sim \mathcal{N}(\mu\mathbf{1}_n, \sigma^2 \mathbf{V}) \]
\[ \hat{\mu} = (\mathbf{1}_n^T \mathbf{V}^{-1} \mathbf{1}_n)^{-1} \mathbf{1}_n^T \mathbf{V}^{-1} \mathbf{Y} \]
\[ \hat{\sigma}^2 = \frac{1}{n-1} (\mathbf{Y} - \hat{\mu}\mathbf{1}_n)^T \mathbf{V}^{-1} (\mathbf{Y} - \hat{\mu}\mathbf{1}_n) \]
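These two estimators translate almost literally into NumPy. The sketch below uses linear solves instead of an explicit inverse. The function name `fit_bm`, the matrix `V` (a hypothetical three-tip tree) and the data `Y` are made up for the illustration.

```python
import numpy as np

def fit_bm(Y, V):
    """Return (mu_hat, sigma2_hat) for Y ~ N(mu 1_n, sigma^2 V), following the formulas above."""
    Y = np.asarray(Y, dtype=float)
    V = np.asarray(V, dtype=float)
    n = Y.shape[0]
    ones = np.ones(n)
    Vinv_1 = np.linalg.solve(V, ones)        # V^{-1} 1_n (V is symmetric)
    mu_hat = (Vinv_1 @ Y) / (Vinv_1 @ ones)  # generalized least squares mean
    resid = Y - mu_hat * ones
    sigma2_hat = resid @ np.linalg.solve(V, resid) / (n - 1)
    return mu_hat, sigma2_hat

# Hypothetical example: three tips, the first two sharing 1.0 unit of history.
V = np.array([[1.5, 1.0, 0.0],
              [1.0, 1.5, 0.0],
              [0.0, 0.0, 1.5]])
Y = np.array([1.0, 2.0, 3.0])
mu_hat, sigma2_hat = fit_bm(Y, V)
print(mu_hat, sigma2_hat)
```

When \(\mathbf{V} = \mathbf{I}_n\), these reduce to the usual sample mean and the sample variance with denominator \(n-1\). With a tree, closely related tips are down-weighted because they carry redundant information.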
\[ \begin{pmatrix} \mathbf{Z}\\ \mathbf{Y} \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu\mathbf{1}_m\\ \mu\mathbf{1}_n \end{pmatrix} , \sigma^2 \begin{pmatrix} \mathbf{V}_{ZZ} & \mathbf{V}_{ZY}\\ \mathbf{V}_{YZ} & \mathbf{V}_{YY} \end{pmatrix} \right) \]
Conditional distribution:
\[ \mathbf{Z} \mid \{\mathbf{Y} = \mathbf{y}\} \sim \mathcal{N}\left( \bar{\boldsymbol{\mu}} , \sigma^2\bar{\mathbf{V}} \right) \]
\[ \bar{\boldsymbol{\mu}} = \mu\mathbf{1}_m + \mathbf{V}_{ZY} \mathbf{V}_{YY}^{-1} (\mathbf{y} - \mu\mathbf{1}_n) \quad \bar{\mathbf{V}} = \mathbf{V}_{ZZ} - \mathbf{V}_{ZY} \mathbf{V}_{YY}^{-1}\mathbf{V}_{YZ} \]
\[ \mathbf{Y} \sim \mathcal{MN}_{n,2}(\mathbf{1}_n\boldsymbol{\mu}^T, \mathbf{V}, \mathbf{\Sigma}) \]
\[ \hat{\boldsymbol{\mu}}^T = (\mathbf{1}_n^T \mathbf{V}^{-1} \mathbf{1}_n)^{-1} \mathbf{1}_n^T \mathbf{V}^{-1} \mathbf{Y} \]
\[ \hat{\mathbf{\Sigma}} = \frac{1}{n-1} (\mathbf{Y} - \mathbf{1}_n\hat{\boldsymbol{\mu}}^T)^T \mathbf{V}^{-1} (\mathbf{Y} - \mathbf{1}_n\hat{\boldsymbol{\mu}}^T) \]
Vectorized version: \[ \begin{pmatrix} \mathbf{Y}_{\cdot,1}\\ \mathbf{Y}_{\cdot,2} \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu_1\mathbf{1}_n\\ \mu_2\mathbf{1}_n \end{pmatrix} , \begin{pmatrix} \Sigma_{11}\mathbf{V}_{YY} & \Sigma_{12}\mathbf{V}_{YY}\\ \Sigma_{21}\mathbf{V}_{YY} & \Sigma_{22}\mathbf{V}_{YY} \end{pmatrix} \right) \]
With ancestral states: \[ \begin{pmatrix} \mathbf{Z}_{\cdot,1}\\ \mathbf{Z}_{\cdot,2}\\ \mathbf{Y}_{\cdot,1}\\ \mathbf{Y}_{\cdot,2} \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mu_1\mathbf{1}_m\\ \mu_2\mathbf{1}_m\\ \mu_1\mathbf{1}_n\\ \mu_2\mathbf{1}_n \end{pmatrix} , \begin{pmatrix} \Sigma_{11}\mathbf{V}_{ZZ} & \Sigma_{12}\mathbf{V}_{ZZ} & \Sigma_{11}\mathbf{V}_{ZY} & \Sigma_{12}\mathbf{V}_{ZY}\\ \Sigma_{21}\mathbf{V}_{ZZ} & \Sigma_{22}\mathbf{V}_{ZZ} & \Sigma_{21}\mathbf{V}_{ZY} & \Sigma_{22}\mathbf{V}_{ZY}\\ \Sigma_{11}\mathbf{V}_{YZ} & \Sigma_{12}\mathbf{V}_{YZ} & \Sigma_{11}\mathbf{V}_{YY} & \Sigma_{12}\mathbf{V}_{YY}\\ \Sigma_{21}\mathbf{V}_{YZ} & \Sigma_{22}\mathbf{V}_{YZ} & \Sigma_{21}\mathbf{V}_{YY} & \Sigma_{22}\mathbf{V}_{YY} \end{pmatrix} \right) \]
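The block structure of these covariance matrices is a Kronecker product: stacking the columns of \(\mathbf{Y}\) gives covariance \(\mathbf{\Sigma} \otimes \mathbf{V}\). The sketch below checks this numerically on tips only, with zero mean, by simulating matrix-normal draws. The matrix `V` is a hypothetical three-tip example and all variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
V = np.array([[1.5, 1.0, 0.0],
              [1.0, 1.5, 0.0],
              [0.0, 0.0, 1.5]])        # hypothetical between-tips structure
Sigma = np.array([[0.55, 0.45],
                  [0.45, 0.55]])       # between-coordinates covariance

# Covariance of the stacked columns (Y[:, 0], Y[:, 1]): block (k, l) is Sigma[k, l] * V.
big_cov = np.kron(Sigma, V)

# Matrix-normal draws with zero mean: Y = A E B^T, with A A^T = V and B B^T = Sigma.
A = np.linalg.cholesky(V)
B = np.linalg.cholesky(Sigma)
E = rng.standard_normal((50000, 3, 2))
Y = A @ E @ B.T                                           # shape (50000, 3, 2)
vecY = np.concatenate([Y[:, :, 0], Y[:, :, 1]], axis=1)   # stack the two columns
emp_cov = np.cov(vecY, rowvar=False)
print(np.round(emp_cov, 2))
```

The same construction with the full node-by-node matrix in place of `V` would give the version with ancestral states, up to a reordering of the rows.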
\[ \begin{pmatrix} \mathbf{Z}\\ \mathbf{Y} \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mathbf{m}_Z\\ \mathbf{m}_Y \end{pmatrix} , \begin{pmatrix} \mathbf{S}_{ZZ} & \mathbf{S}_{ZY}\\ \mathbf{S}_{YZ} & \mathbf{S}_{YY} \end{pmatrix} \right) \]
Conditional distribution:
\[ \mathbf{Z} \mid \{\mathbf{Y} = \mathbf{y}\} \sim \mathcal{N}\left( \bar{\mathbf{m}} , \bar{\mathbf{S}} \right) \]
\[ \bar{\mathbf{m}} = \mathbf{m}_Z + \mathbf{S}_{ZY} \mathbf{S}_{YY}^{-1} (\mathbf{y} - \mathbf{m}_Y) \quad \bar{\mathbf{S}} = \mathbf{S}_{ZZ} - \mathbf{S}_{ZY} \mathbf{S}_{YY}^{-1}\mathbf{S}_{YZ} \]
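The conditioning formulas can be sketched in a few lines of NumPy. The example reconstructs a single ancestral state on a hypothetical three-tip tree with \(\mu = 0\) and \(\sigma^2 = 1\). The function name `condition_gaussian`, the tree and the observed values `y` are assumptions made for the illustration.

```python
import numpy as np

def condition_gaussian(m_Z, m_Y, S_ZZ, S_ZY, S_YY, y):
    """Mean and covariance of Z | Y = y when (Z, Y) is jointly Gaussian."""
    K = np.linalg.solve(S_YY, S_ZY.T).T   # K = S_ZY S_YY^{-1} (S_YY is symmetric)
    m_bar = m_Z + K @ (y - m_Y)
    S_bar = S_ZZ - K @ S_ZY.T             # S_YZ = S_ZY^T
    return m_bar, S_bar

# Hypothetical tree: internal node u at distance 1.0 from the root, with two child tips
# on branches of length 0.5, plus a third tip attached to the root by a branch of length 1.5.
S_YY = np.array([[1.5, 1.0, 0.0],
                 [1.0, 1.5, 0.0],
                 [0.0, 0.0, 1.5]])
S_ZZ = np.array([[1.0]])                  # variance of u
S_ZY = np.array([[1.0, 1.0, 0.0]])        # u shares 1.0 with its two children, 0 with the third tip
m_Z, m_Y = np.zeros(1), np.zeros(3)
y = np.array([1.0, 2.0, 3.0])

m_bar, S_bar = condition_gaussian(m_Z, m_Y, S_ZZ, S_ZY, S_YY, y)
print(m_bar, S_bar)
```

The third tip does not enter the reconstruction of `u`: it is independent of `u`, and the mean is assumed known here.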
Joint posterior: \[ p\left(\boldsymbol{\theta}, \mathcal{T}, \boldsymbol{\psi} \mid \mathbf{Y}, \mathbf{S} \right) \propto p\left(\mathbf{Y}, \mathbf{S} \mid \boldsymbol{\theta}, \mathcal{T}, \boldsymbol{\psi} \right) p\left(\boldsymbol{\theta}, \mathcal{T}, \boldsymbol{\psi} \right) \]
Assumption: \(\mathbf{Y}\) and \(\mathbf{S}\) are conditionally independent given \(\mathcal{T}\) and the parameters.
\[ p\left(\boldsymbol{\theta}, \mathcal{T}, \boldsymbol{\psi} \mid \mathbf{Y}, \mathbf{S} \right) \propto p\left(\mathbf{Y} \mid \boldsymbol{\theta}, \mathcal{T} \right) p\left(\mathbf{S} \mid \mathcal{T}, \boldsymbol{\psi} \right) p\left(\boldsymbol{\theta}, \mathcal{T}, \boldsymbol{\psi} \right) \]
\(\to\) Run an MCMC, updating the parameters sequentially.
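To make "updating parameters sequentially" concrete, here is a toy Metropolis-within-Gibbs sampler. It is a deliberately reduced sketch and not the full algorithm: the tree is fixed, \(\mathbf{S}\) is not modeled, the only parameters are \(\mu\) and \(\log \sigma^2\) of a one-dimensional Brownian motion, and both receive flat priors. All of these are assumptions made for the illustration, as are the names `log_post` and `mcmc`.

```python
import numpy as np

def log_post(mu, log_s2, Y, V_inv, logdet_V):
    """Log posterior, up to a constant, for Y ~ N(mu 1, sigma^2 V) with flat priors (assumption)."""
    n = Y.shape[0]
    r = Y - mu
    return -0.5 * (n * log_s2 + logdet_V + r @ V_inv @ r / np.exp(log_s2))

def mcmc(Y, V, n_iter, step, rng):
    """Random-walk Metropolis, updating mu and log sigma^2 one after the other."""
    V_inv = np.linalg.inv(V)
    logdet_V = np.linalg.slogdet(V)[1]
    theta = np.array([Y.mean(), np.log(Y.var())])    # starting values
    cur = log_post(*theta, Y, V_inv, logdet_V)
    chain = np.empty((n_iter, 2))
    for it in range(n_iter):
        for k in range(2):                           # one parameter at a time
            prop = theta.copy()
            prop[k] += step * rng.standard_normal()  # symmetric proposal
            new = log_post(*prop, Y, V_inv, logdet_V)
            if np.log(rng.uniform()) < new - cur:    # Metropolis acceptance rule
                theta, cur = prop, new
        chain[it] = theta
    return chain

rng = np.random.default_rng(4)
n = 50
V = np.eye(n)                          # placeholder: star tree with unit branch lengths
Y = 2.0 + rng.standard_normal(n)       # simulated data with mu = 2, sigma^2 = 1
chain = mcmc(Y, V, n_iter=5000, step=0.3, rng=rng)
post_mu = chain[1000:, 0].mean()       # posterior mean of mu after burn-in
print(post_mu)
```

In the full problem, the same loop would also contain moves on \(\mathcal{T}\) and \(\boldsymbol{\psi}\), each accepted or rejected using the factorized posterior above.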
Wikipedia Viral phylodynamics (en).
Lecture on viral phylodynamics by Trevor Bedford (en).
Modèles et méthodes pour l’évolution biologique, edited by Gilles Didier and Stéphane Guindon, ISTE éditions (2022) (fr).