What can SNPs tell us?

The results of the SNP-analyses of the four groups of TB isolates are shown in the figure below.

The figure allows us to parameterise the variability we observe. When using a SNP-based genomics approach to investigate potential TB clusters it is important that we are able to put the amount of variability observed in an investigation into context.

When using SNPs it is necessary to sequence sets of genomes which will be used as a reference for the level of genomic diversity expected in different contexts. These observations are then related to the SNP numbers in the investigational genomes.

  • What can you say about the SNP diversity overall?

  • What does the level of SNP diversity within each group suggest about transmission?

  • How do you think we can use this type of information in the investigation of TB clusters?

    Feedback for Exercise 3

  • What can you say about the SNP diversity overall?
  • Overall

    The majority of the isolates in any of the comparison groups differ at fewer than 12 positions. Given that the TB genome is 4.4 million base pairs in length, this means that the TB isolates used to generate the plot are very similar to each other.

    This demonstrates that the extra resolution provided by genomic methods has the potential to discriminate between these isolates in a way that other methods have not been able to.
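    To put these numbers in context, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this example, and the values are the approximate figures quoted in the text above.

```python
# Illustrative sketch only: names are hypothetical, values are the
# approximate figures quoted in the text.
GENOME_LENGTH_BP = 4_400_000  # TB genome length, about 4.4 million base pairs
SNP_DIFFERENCE = 12           # most pairs differ at fewer positions than this


def fraction_differing(snps: int, genome_length: int = GENOME_LENGTH_BP) -> float:
    """Return the fraction of genome positions at which two isolates differ."""
    return snps / genome_length


fraction = fraction_differing(SNP_DIFFERENCE)
percent_identity = 100 * (1 - fraction)

print(f"Fraction of positions differing: {fraction:.2e}")
print(f"Approximate sequence identity:   {percent_identity:.4f}%")
```

    Even at 12 SNPs, two isolates differ at fewer than 3 positions per million bases, which is why lower-resolution methods cannot separate them.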

  • What does the level of SNP diversity within each group suggest about transmission?
  • Household Clusters

    The household group represents the situation where recent transmission was known to have occurred. In this group, all the links were 5 or fewer SNPs. This suggests that this level of relatedness (≤ 5 SNPs) is good evidence of recent transmission taking place.

    Also note the triangular distribution in which most pairs have zero SNPs, and there are decreasing numbers of pairs separated by 1, 2, 3 and 4 SNPs.

    Within-individual comparisons of paired isolates

    In the cross-sectional group (sputum v other), there is one instance of a large number (>400) of SNPs between isolates from the same person.

    The longitudinal group (persistent open TB > 6 months) shows the widest distribution. The majority of longitudinal pairs are fewer than 5 SNPs apart, and the distribution of these pairs is very similar to that of the household isolates: again you can see a triangular shape in which most pairs have zero SNPs, and there are decreasing numbers of pairs separated by 1, 2, 3 and 4 SNPs.

    There are four cases where isolates are separated by 6 to 10 SNPs and two cases where there are > 400 SNPs.

    Community Clusters

    The plot for cases in which there were known epidemiological links appears very similar to those for individuals and households – there are no instances where isolates are separated by more than 5 SNPs and the distribution is very similar.

    This can be interpreted as meaning there is a high probability of recent direct transmission between cases, which fits with the epidemiological picture.

    The plot for cases in which there is only possible linkage looks rather different. The largest distance between isolates is 12 SNPs and the distribution is much wider. It is also important to notice that there are fewer cases: there are only 13 links in this group.

    This makes it more difficult to draw strong conclusions about transmission in this group.

    The distribution in the group where there is no known linkage looks different again. We can see that points cluster at each end of the scale. There are 7 links separated by >400 SNPs but the majority cluster at the lower end and are separated by fewer than 5 SNPs.

    We can conclude that those separated by >400 SNPs are unlikely to represent direct transmission and that those separated by five or fewer SNPs have a high probability of being the result of direct transmission.

    There are six links of between 6 and 9 SNPs. The genomic data is more difficult to interpret in terms of transmission here, and it is very important that any action should be based on ALL the available information.

    The results from the household clusters suggest that a level of relatedness of ≤ 5 SNPs is good evidence of recent transmission taking place.

  • How do you think we can use this type of information in the investigation of TB clusters?
  • One way of utilising the information is to construct cut-offs around the number of SNPs that may be considered to be consistent with evidence for recent transmission of TB. Dotted lines indicate where these might be on the basis of the four groups of TB isolates analysed in the study described here.

    • Epidemiological linkage consistent with transmission may be expected to exist for TB isolates differing by ≤ 5 SNPs

    • Evidence for transmission is NOT expected to exist where TB isolates differ by > 12 SNPs.

    The majority of links observed in the study fall into one of these two groups, demonstrating that this type of “rule” may be useful in most cases.

    Where TB isolates differ by 6 to 12 SNPs, the data does not allow us to say with confidence whether we can expect epidemiological data consistent with transmission to exist.

It is important to note that this intermediate level of diversity may have a different interpretation depending on the epidemiological context. For instance, where there are known or strongly suspected epidemiological links, the likelihood of transmission is higher. Where TB cases share a common ethnic background (such as in a community with many recent immigrants from a particular country) but there is no recent epidemiological linkage between cases, the probability of transmission is likely to be lower.
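The cut-offs described above can be written as a simple rule. The following is a minimal sketch, assuming the thresholds of ≤ 5 and > 12 SNPs suggested by this study. The function name, constants and wording of the categories are hypothetical, chosen only for illustration, and the result is a prompt for interpretation rather than a decision.

```python
# Illustrative sketch only: names and category wording are hypothetical.
# The thresholds are the cut-offs suggested by the study described above.
RECENT_TRANSMISSION_MAX_SNPS = 5   # <= 5 SNPs: consistent with recent transmission
INTERMEDIATE_MAX_SNPS = 12         # 6 to 12 SNPs: cannot say with confidence


def interpret_snp_distance(snps: int) -> str:
    """Map a pairwise SNP distance to one of the three bands in the text."""
    if snps < 0:
        raise ValueError("SNP distance cannot be negative")
    if snps <= RECENT_TRANSMISSION_MAX_SNPS:
        return "consistent with recent transmission"
    if snps <= INTERMEDIATE_MAX_SNPS:
        return "indeterminate: interpret with epidemiological context"
    return "transmission not expected"


# Example distances covering each band, including the boundary values.
for distance in (0, 5, 6, 12, 13, 400):
    print(f"{distance:>3} SNPs -> {interpret_snp_distance(distance)}")
```

Keeping the intermediate band as its own category reflects the point made above: a distance of 6 to 12 SNPs should be read alongside all the other available information, not resolved by the genomic data alone.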
