Strength of genetic drift in SARS-CoV-2 across time and spatial scales in England

In a recent study posted to the bioRxiv* preprint server, researchers developed and validated an approach for jointly inferring measurement noise and genetic drift from time-series data on lineage frequencies.

Study: Lineage frequency time series reveal elevated levels of genetic drift in the transmission of SARS-CoV-2 in England. Image credit: creativeneko / Shutterstock


Random genetic drift in the population-level dynamics of infectious disease outbreaks arises from the randomness of transmission between hosts and of host recovery or death. Studies have reported strong genetic drift in SARS-CoV-2 resulting from superspreading events, which is expected to significantly influence viral evolution and the epidemiology of coronavirus disease 2019 (COVID-19). Noise from the measurement process, including bias in data collection across locations and time, can confound estimates of genetic drift.

About the study

In this study, the authors developed an approach to jointly infer the strength of measurement noise and genetic drift from lineage frequency time-series data. The approach allows measurement noise to be overdispersed (rather than assuming uniform sampling) and allows the strength of overdispersion to change over time (rather than being constant). They also verified the accuracy of the approach via simulations.

A hidden Markov model (HMM) with continuous hidden and observed states was used, in which the hidden states represent the true lineage frequencies and the observed states represent the observed frequencies. The transition probability between hidden states was determined by genetic drift, with the mean true frequency equal to the true frequency in the previous time period. For rare frequencies, the variance is related to the mean through the effective population size (Ne).

The emission probability between the hidden and observed states was determined by measurement noise, such that the mean observed frequency equals the true frequency. For rare frequencies, the variance of the observed frequency is related to the mean through a time-dependent factor that captures deviations from uniform sampling. The modeling assumed that the number of individuals and the lineage frequencies were high enough for the central limit theorem to apply.
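The two layers of the model, drift acting on the true frequency and measurement noise acting on the observed frequency, can be illustrated with a short simulation. This is a minimal sketch and not the authors' code. It assumes Gaussian noise in both layers, a per-step drift variance of f/Ne, and a measurement variance of c*f/M. The names `n_seq` (M, the number of sequences sampled per time step) and `c` (an overdispersion factor, with c = 1 standing for uniform sampling) are hypothetical and chosen for illustration only.

```python
import numpy as np

def simulate_lineage(f0, ne, n_seq, c, steps, seed=0):
    """Simulate true and observed frequencies of one rare lineage.

    Assumed (illustrative) model, valid for small frequencies:
      true_{t+1} ~ Normal(true_t, true_t / ne)          # genetic drift
      obs_t      ~ Normal(true_t, c * true_t / n_seq)   # measurement noise
    """
    rng = np.random.default_rng(seed)
    true_f = np.empty(steps)
    obs_f = np.empty(steps)
    f = f0
    for t in range(steps):
        true_f[t] = f
        # Observation layer: mean equals the true frequency,
        # variance grows with the overdispersion factor c.
        obs_sd = np.sqrt(c * f / n_seq)
        obs_f[t] = np.clip(rng.normal(f, obs_sd), 0.0, 1.0)
        # Hidden layer: the next true frequency is centred on the current
        # one; a smaller ne means stronger drift.
        drift_sd = np.sqrt(f / ne)
        f = float(np.clip(rng.normal(f, drift_sd), 0.0, 1.0))
    return true_f, obs_f

if __name__ == "__main__":
    true_f, obs_f = simulate_lineage(f0=0.01, ne=1e4, n_seq=5e3, c=2.0, steps=30)
    print(np.round(true_f[:5], 4))
    print(np.round(obs_f[:5], 4))
```

Lowering `ne` makes the true trajectory more erratic, while raising `c` or lowering `n_seq` widens the gap between the observed and true trajectories. This is why the two sources of noise need to be inferred jointly.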

“Superlineages” were created by grouping lineages based on phylogenetic distance such that the combined abundance and frequency of the grouped lineages exceeded a threshold value, yielding 486, 4,083, 6,225, and 24,867 SARS-CoV-2 lineages for pre-B.1.177, B.1.177, Alpha, and Delta, respectively. The team hypothesized that Ne

Next, the parameters most likely to have generated the data set were determined. The model was validated by running simulations with time-varying Ne.
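As an illustration of finding the most likely parameters, the sketch below estimates Ne by maximizing a likelihood over a grid of candidate values. It is not the authors' implementation. It assumes a square-root transform of the frequencies, under which the drift variance is approximately 1/(4*Ne) and the measurement variance approximately c/(4*M), so that a standard Kalman filter gives the likelihood. It also treats `c` and `n_seq` (M) as known and Ne as constant, which is a simplification of the joint, time-varying inference described in the article. All function and parameter names are hypothetical.

```python
import numpy as np

def kalman_loglik(y, q, r):
    """Log-likelihood of a Gaussian random walk (step variance q)
    observed with Gaussian noise (variance r)."""
    mean, var = y[0], r  # start the hidden state at the first observation
    loglik = 0.0
    for obs in y[1:]:
        var_pred = var + q          # predict: drift adds variance q
        s = var_pred + r            # variance of the prediction error
        resid = obs - mean
        loglik += -0.5 * (np.log(2 * np.pi * s) + resid**2 / s)
        gain = var_pred / s         # update the hidden-state estimate
        mean = mean + gain * resid
        var = (1 - gain) * var_pred
    return loglik

def fit_ne(obs_freq, n_seq, c, ne_grid):
    """Return the grid value of Ne with the highest likelihood,
    together with the log-likelihood of every grid value."""
    y = np.sqrt(np.asarray(obs_freq, dtype=float))  # variance-stabilizing transform
    r = c / (4 * n_seq)
    logliks = np.array([kalman_loglik(y, 1 / (4 * ne), r) for ne in ne_grid])
    return ne_grid[int(np.argmax(logliks))], logliks

if __name__ == "__main__":
    # Simulate in square-root space with a known Ne, then try to recover it.
    rng = np.random.default_rng(0)
    ne_true, n_seq = 1e4, 1e5
    x = 0.5 + np.cumsum(rng.normal(0, np.sqrt(1 / (4 * ne_true)), 300))
    y = x + rng.normal(0, np.sqrt(1 / (4 * n_seq)), 300)
    best, _ = fit_ne(y**2, n_seq=n_seq, c=1.0, ne_grid=np.logspace(2, 6, 41))
    print("true Ne:", ne_true, "estimated Ne:", best)
```

Running the estimator on simulated data with a known Ne, as in the usage example, mirrors the validation step: if the estimate tracks the simulated value, the inference can be trusted to that extent on real data.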

Results


The strength of genetic drift was consistently higher, by one to three orders of magnitude, than expected from the observed number of people infected with SARS-CoV-2 in England, and this held throughout the study period even after correcting for measurement noise. The high genetic drift could not be explained by superspreading but could be explained in part by community (deme) structure in the host contact network. The discrepancy also could not be explained by correcting for epidemiological dynamics (SIR or SEIR modeling).

The sampling of people infected with SARS-CoV-2 from the population of England was largely uniform across the data set. The team found evidence of spatial structure in the transmission dynamics of the B.1.177, Alpha, and Delta variants. estimated Ne

Conclusion


Overall, the results of the study showed that the strength of genetic drift in SARS-CoV-2 transmission in England was greater than expected, and indicated that further modeling studies are required to better understand the mechanisms behind the high levels of SARS-CoV-2 genetic drift.

*Important note

bioRxiv publishes preliminary scientific reports that are not peer-reviewed and therefore should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.
