Illingworth, CJR; Roy, S; Beale, MA; Tutill, H; Williams, R; Breuer, J; (2017) On the effective depth of viral sequence data. Virus Evolution, 3 (2), Article vex030. 10.1093/ve/vex030.
Text (Published article): vex030.pdf - Published Version (762kB)
Text (Supplementary table 1): Illingworth_Effective_depth_viral_sequence_data_S1.pdf (23kB)
Text (Supplementary table 2): Illingworth_Effective_depth_viral_sequence_data_S2.pdf (17kB)
Text (Supplementary table 3): Illingworth_Effective_depth_viral_sequence_data_S3.pdf (20kB)
Text (Supplementary table 4): Illingworth_Effective_depth_viral_sequence_data_S4.pdf (22kB)
Text (Supplementary table 5): Illingworth_Effective_depth_viral_sequence_data_S5.pdf (17kB)
Abstract
Genome sequence data are of great value in describing evolutionary processes in viral populations. However, in such studies, the extent to which the data accurately describe the viral population is a matter of importance. Multiple factors may influence the accuracy of a dataset, including the quantity and nature of the sample collected and the subsequent steps in viral processing. To investigate this phenomenon, we sequenced replicate datasets spanning a range of viruses, in which the point at which samples were split differed in each case: at one extreme, independent samples were collected from a single patient; at the other, all processing steps up to sequencing were applied to a single sample, which was then split so that each replicate could be sequenced. We conclude that neither a high read depth nor a high template number in a sample guarantees the precision of a dataset. Measures of consistency calculated from within a single biological sample may also be insufficient; distortion of the composition of a population by the experimental procedure, or genuine within-host diversity between samples, may each affect the results. Where possible, data from replicate samples should be collected to validate the consistency of short-read sequence data.