992) the most differentiating (Table 1).

Diversity of peptide sequence types

After translating the in-frame nucleotide sequences into peptide sequences, a total of 31 different pSTs, of which 19 (61.3%) were new, were obtained from the analyzed isolates (Additional file 1: Table S1). The pSTs occurred with frequencies of 0.8% to 28.5%. Across the different loci, a total of 39 distinct alleles were found. For most of the loci, one allele was dominant (more than 90%), except for p_dnaE and p_pyrC. New
alleles (n = 15) were identified for all loci except p_gyrB and p_recA. Simpson's Index of Diversity was heterogeneous, ranging from very low values for p_gyrB, p_recA and p_tnaA (0.000, 0.000 and 0.127), indicating a low ability to discriminate between strains, to higher values for p_dnaE and p_pyrC (0.630 and 0.791) (Table 1). Within the different subpopulations, fewer distinct pSTs with a lower proportion of new types were observed, but for several regions pSTs
were diverse; e.g. each distinct ST of strains from the Chillaw region in Sri Lanka possessed a unique corresponding pST (Table 2).

Peptide sequence types of pubMLST database

In total, 584 STs with at least one corresponding isolate were present in the pubMLST database, and translation of the in-frame sequences yielded 166 distinct pSTs. AA-MLST profiles and the properties of each allele at the peptide level (numbers, sequences and frequencies) are shown in Additional file 2: Tables S2. An alternative AA-MLST typing scheme was applied by Theethakaew et al. during the preparation of this manuscript.

Comparison of MLST and AA-MLST

In total, 372 unique MLST alleles and 39 AA-MLST alleles were detected in our study. Therefore, most of the reduction (mean of 95.6%) in strain diversity stemmed from the wobble bases, as calculated by way of example for the most common allele of each locus of the pubMLST dataset (data not shown). The proportion of the alleles of one locus relative to the total number of alleles changed from the nucleotide to the peptide level, as reflected by the dN/dS values, revealing different influences of the loci on both
typing schemes. For example, at the nucleotide level 65 different gyrB alleles collapsed into a single p_gyrB allele. This is reflected by a dN/dS value of 0, which indicates exclusively synonymous substitutions. In contrast, far more non-synonymous substitutions (as indicated by a dN/dS value of 0.045) were observed for pyrC.

Clonal relationships among global sets and subsets of isolates

To identify the population structure of the analyzed strains, the standardized Index of Association (I^S_A) was calculated (Table 3). The value differed significantly from zero when all our isolates, all subsets separately, or all pubMLST isolates were included, indicating that the alleles were in linkage disequilibrium, i.e. not randomly distributed. When analyzing only one isolate per ST, the I^S_A drops but remains different from zero, indicating a tendency to linkage disequilibrium.
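
As an illustration of how a standardized Index of Association (often written I^S_A) can be obtained from allelic profiles, the following is a minimal sketch and not the software used in this study. It assumes the commonly used definition I^S_A = (V_D / V_e - 1) / (l - 1), where V_D is the observed variance of pairwise allelic distances, V_e is the variance expected under linkage equilibrium, and l is the number of loci. The function name and the toy profiles are hypothetical, and no significance test (e.g. by resampling) is included.

```python
from collections import Counter
from itertools import combinations


def standardized_index_of_association(profiles):
    """Return I^S_A for a list of allelic profiles (one tuple of allele IDs per isolate).

    Assumed definition: (V_D / V_e - 1) / (number of loci - 1).
    """
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two isolates")
    n_loci = len(profiles[0])
    if n_loci < 2:
        raise ValueError("need at least two loci")

    # Pairwise distance: number of loci at which two isolates carry different alleles.
    distances = [
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    ]
    n_pairs = len(distances)
    sum_d = sum(distances)
    sum_d2 = sum(d * d for d in distances)
    v_observed = (sum_d2 - sum_d ** 2 / n_pairs) / n_pairs

    # Expected variance under linkage equilibrium: sum over loci of h_j * (1 - h_j),
    # where h_j is the unbiased gene diversity at locus j.
    v_expected = 0.0
    for locus in range(n_loci):
        counts = Counter(p[locus] for p in profiles)
        h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
        v_expected += h * (1 - h)
    if v_expected == 0:
        raise ValueError("expected variance is zero; I^S_A is undefined")

    return (v_observed / v_expected - 1) / (n_loci - 1)


# Hypothetical toy data with two loci: alleles perfectly associated vs. all combinations.
linked = [(1, 1), (1, 1), (2, 2), (2, 2)]
mixed = [(1, 1), (1, 2), (2, 1), (2, 2)]
print(standardized_index_of_association(linked))
print(standardized_index_of_association(mixed))
```

In this toy setting the perfectly associated profiles give a higher value than the freely combined ones. Restricting the input to one profile per ST, as described above, amounts to passing the de-duplicated list of profiles to the same function.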