Using the genome data

  • The assembled genome sequence data for each strain was uploaded into a database called BIGSdb. Assembly, uploading and annotation of genomes are largely automated within the BIGSdb database system, which is publicly available over the World Wide Web.

  • The isolates are compared based on whether their copies of the genes at each annotated gene position are identical or different.

  • This approach was used to analyse relationships between the 12 genome sequences. In the automated function, BIGSdb identifies gene sequences within each newly sequenced genome, compares each one against the corresponding gene in the reference sequence using the BLAST algorithm, assigns an allele designation (to allow rapid checking of whether genes are identical or different between isolates), and then compares all of the genomes with one another.

  • BIGSdb was used to compare the 12 isolates using all the genes that are present and sequenced in all 12 isolates when compared against the reference genome. Any genome, or set of known genes in a species, can be chosen as the “reference” sequence; this is the set of genes by which isolates are compared. In this case two reference genomes were used: one from a previous UK outbreak and one from an international strain with a well-annotated genome.
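
The gene-by-gene comparison described above can be illustrated with a minimal Python sketch. It is not BIGSdb's own code and it skips the BLAST step entirely: it assumes the gene sequences have already been extracted for each isolate. The isolate names, gene names, sequences and function names are all hypothetical, chosen only to show how allele designations make it quick to check whether genes are identical or different, and how the comparison is restricted to genes present in every isolate.

```python
from itertools import combinations

# Hypothetical toy data: isolate -> {gene (locus) -> extracted sequence}.
# In practice these sequences would come from a BLAST search against a reference.
isolates = {
    "isolate_A": {"geneX": "ATGAAA", "geneY": "ATGCCC", "geneZ": "ATGTTT"},
    "isolate_B": {"geneX": "ATGAAA", "geneY": "ATGCCG", "geneZ": "ATGTTT"},
    "isolate_C": {"geneX": "ATGAAT", "geneY": "ATGCCG"},  # geneZ not found
}


def assign_alleles(isolates):
    """Give each distinct sequence at a locus an integer allele number."""
    allele_ids = {}  # locus -> {sequence: allele number}
    profiles = {}    # isolate -> {locus: allele number}
    for name, genes in isolates.items():
        profile = {}
        for locus, seq in genes.items():
            known = allele_ids.setdefault(locus, {})
            # Reuse the number if this sequence was seen before, else take the next one.
            profile[locus] = known.setdefault(seq, len(known) + 1)
        profiles[name] = profile
    return profiles


def shared_loci(profiles):
    """Return the loci that are present in every isolate."""
    loci_sets = [set(profile) for profile in profiles.values()]
    return sorted(set.intersection(*loci_sets)) if loci_sets else []


def pairwise_differences(profiles):
    """Count, for each pair of isolates, the shared loci whose alleles differ."""
    loci = shared_loci(profiles)
    return {
        (a, b): sum(profiles[a][locus] != profiles[b][locus] for locus in loci)
        for a, b in combinations(sorted(profiles), 2)
    }


profiles = assign_alleles(isolates)
differences = pairwise_differences(profiles)
for pair, count in differences.items():
    print(pair, count)
```

Comparing integer allele numbers rather than full sequences is what keeps the all-against-all step cheap; here geneZ is excluded from every comparison because it was not found in one of the toy isolates.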

