The Science of the total environment

Empirical testing of modified Salmonella MLST in aquatic environmental samples by in silico analysis.

PMID 28043703


Multilocus sequence typing (MLST) is an approach for predicting Salmonella serovars and eBURST groups (eBGs) based on a typing scheme of seven housekeeping genes. To date, >220.000 allelic profiles and 65,973 Salmonella strains have been recorded in the MLST database. Several studies have modified the MLST method to target fewer housekeeping genes for the sake of economy and efficiency. Nevertheless, no study has systematically evaluated the correlation between the number of housekeeping genes targeted and the accuracy of prediction. In this study, we addressed this problem by extracting data from the entire MLST database using the software RStudio. Our results indicated that as the number of genes in the MLST scheme increased, the accuracy of eBG prediction increased, reaching 100% when five or more genes were used. To examine the applicability of the approach, 395 environmental water samples were analyzed. A set of 52 Salmonella enterica isolates was initially used to develop MLST targeting seven housekeeping genes. A total of 29 sequence types, including 11 new sequence types, were found among the 52 sequenced isolates, which differentiated into 19 serotypes. Moreover, two novel sequence types did not belong to the current classification. Our results show that three-gene sequence typing (aroC, hisD, and purE) was as accurate as seven-gene sequence typing for prediction of environmental Salmonella isolates. Our data suggest that the five-gene and reduced gene-number sequence-typing schemes can serve as alternative modified MLST approaches when cost and efficiency are concerns.
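The in silico evaluation described above (how eBG prediction accuracy changes with the number of genes typed) can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' RStudio analysis: the allelic profiles are invented toy data, the function names are hypothetical, and "accuracy" is assumed here to mean the fraction of profiles whose eBG matches the majority eBG among all profiles sharing the same reduced allele combination. The gene list is the standard seven-locus Salmonella MLST scheme.

```python
from collections import Counter, defaultdict
from itertools import combinations

# The seven housekeeping genes of the standard Salmonella MLST scheme.
GENES = ("aroC", "dnaN", "hemD", "hisD", "purE", "sucA", "thrA")

# Toy allelic profiles (allele numbers in GENES order) with an eBG label.
# These values are invented for illustration; they are NOT database records.
PROFILES = [
    ((1, 1, 1, 1, 1, 1, 1), "eBG1"),
    ((1, 1, 1, 1, 2, 1, 1), "eBG1"),
    ((1, 2, 1, 1, 1, 1, 1), "eBG2"),
    ((2, 2, 1, 1, 1, 1, 1), "eBG2"),
    ((2, 1, 1, 2, 1, 1, 1), "eBG3"),
    ((2, 1, 2, 2, 1, 1, 1), "eBG3"),
]


def subset_accuracy(profiles, gene_idx):
    """Majority-vote accuracy of eBG prediction using only the genes in gene_idx.

    Profiles are grouped by their reduced allele combination; each group
    predicts its most common eBG, and accuracy is the share of profiles
    that agree with their group's prediction (an assumed definition).
    """
    votes = defaultdict(Counter)
    for alleles, ebg in profiles:
        reduced = tuple(alleles[i] for i in gene_idx)
        votes[reduced][ebg] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in votes.values())
    return correct / len(profiles)


def best_subset_per_size(profiles, genes=GENES):
    """For each subset size k, return (best accuracy, gene names of one best subset)."""
    best = {}
    for k in range(1, len(genes) + 1):
        scored = (
            (subset_accuracy(profiles, idx), idx)
            for idx in combinations(range(len(genes)), k)
        )
        acc, idx = max(scored, key=lambda pair: pair[0])
        best[k] = (acc, tuple(genes[i] for i in idx))
    return best


if __name__ == "__main__":
    for k, (acc, subset) in best_subset_per_size(PROFILES).items():
        print(f"{k} gene(s): best accuracy {acc:.2f} using {', '.join(subset)}")
```

Because adding a gene can only split groups of profiles further, the best achievable accuracy under this definition never decreases as the number of genes grows, which is consistent with the trend reported in the abstract. Real data would be read from an export of the MLST database rather than hard-coded.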