Published on Tue Jul 20 2021

Genomic epidemiology and strain taxonomy of Corynebacterium diphtheriae

Guglielmini, J., Hennart, M., Badell, E., Toubiana, J., Criscuolo, A., Brisse, S.

Corynebacterium diphtheriae is highly transmissible and can cause large outbreaks. Sporadic cases or small clusters are observed in high-vaccination settings. We combined 1,305 genes with highly reproducible allele calls into a core genome multilocus sequence typing scheme. We devised a genomic taxonomy of strains and deeper sublineages.

Abstract

Background: Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient. Sporadic cases or small clusters are observed in high-vaccination settings. The phylogeography and short-timescale evolution of C. diphtheriae are not well understood, in part due to a lack of harmonized analytical approaches for genomic surveillance and strain tracking.

Methods: We combined 1,305 genes with highly reproducible allele calls into a core genome multilocus sequence typing (cgMLST) scheme. We analyzed cgMLST gene diversity among 602 isolates from sporadic clinical cases, small clusters or large outbreaks. We defined sublineages based on the phylogenetic structure within C. diphtheriae, and strains based on the highest number of cgMLST mismatches within documented outbreaks. We performed time-scaled phylogenetic analyses of major sublineages.

Results: The cgMLST scheme showed a high allele call rate in C. diphtheriae and the closely related species C. belfantii and C. rouxii. We demonstrate its utility to delineate epidemiological case clusters and outbreaks using a threshold of 25 mismatches, and reveal a number of cryptic transmission chains, most of which are geographically restricted to one or a few adjacent countries. Subcultures of the vaccine strain PW8 differed by up to 20 cgMLST mismatches. Phylogenetic analyses revealed short-timescale evolutionary gain or loss of the diphtheria toxin and biovar-associated genes. We devised a genomic taxonomy of strains and deeper sublineages (defined using a threshold of 500 cgMLST mismatches), currently comprising 151 sublineages, only a few of which are geographically widespread based on current sampling. The cgMLST genotyping tool and nomenclature were made publicly accessible at https://bigsdb.pasteur.fr/diphtheria.

Conclusions: Standardized genome-scale strain genotyping will help trace the transmission and geographic spread of C. diphtheriae. The unified genomic taxonomy of C. diphtheriae strains provides a common language for studies into the ecology, evolution and virulence heterogeneity among C. diphtheriae sublineages.
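
To make the threshold-based grouping concrete, the following is a minimal Python sketch of how isolates could be grouped from cgMLST allele profiles at the two thresholds named in the abstract (25 mismatches for strains, 500 for sublineages). It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the clustering algorithm, so single linkage is assumed here; the thresholds are treated as inclusive; loci with an uncalled allele in either isolate are ignored; and the isolate names and toy profiles are hypothetical.

```python
from itertools import combinations

N_LOCI = 1305   # number of genes in the cgMLST scheme, per the abstract
MISSING = None  # assumed marker for a locus with no allele call


def mismatches(a, b):
    """Count loci where both profiles have a call and the alleles differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not MISSING and y is not MISSING and x != y
    )


def single_linkage(profiles, threshold):
    """Group isolates by single linkage (an assumption of this sketch).

    Two isolates end up in the same group if they are connected by a chain
    of pairs that each differ by at most `threshold` alleles.
    """
    parent = {name: name for name in profiles}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for p, q in combinations(profiles, 2):
        if mismatches(profiles[p], profiles[q]) <= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy profiles: allele numbers at each of the 1,305 loci.
base = [1] * N_LOCI
profiles = {
    "iso_A": list(base),
    # 10 loci changed, one of them left uncalled.
    "iso_B": [MISSING] + [2] * 9 + base[10:],
    # 300 loci changed relative to iso_A.
    "iso_C": [3] * 300 + base[300:],
    # Different allele at every locus.
    "iso_D": [4] * N_LOCI,
}

strains = single_linkage(profiles, threshold=25)
sublineages = single_linkage(profiles, threshold=500)
print("strain-level groups:", strains)
print("sublineage-level groups:", sublineages)
```

In this toy data, iso_A and iso_B fall within the strain-level threshold, iso_C joins them only at the sublineage level, and iso_D stays apart at both levels, which mirrors the nested strain and sublineage taxonomy described above.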