
Whole genome sequencing data requirements
Users are requested to submit only high-quality assemblies, generated from pure cultures sequenced at a minimum coverage of 40X. Assembly files with a high number of contigs (> 500) or a cumulative contig length outside the typical range for Escherichia coli (~ 4.3 – 5.9 Mbp) will not be accepted. Submissions containing low-quality assemblies may be rejected entirely.
NB. For quality purposes, we accept only assemblies generated from high-quality short reads or from a combination of short and long reads (hybrid assemblies). Please note that genomes obtained using long-read sequencing technology only, or by Ion Torrent/Roche/454, will not be uploaded to the database, nor used to define new alleles. However, if you discover in one of these assemblies a new MLST profile composed solely of existing alleles, you may make a 'profile' type submission to define a new ST.
Please refer to the assembly metrics below:
| Species | Genome size (bp) | Number of contigs | G+C content (%) | Coverage |
|---|---|---|---|---|
| E. coli | 4 300 000 - 5 900 000 | ≤ 500 | N/A | ≥ 40X |
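
The thresholds in the table can be checked locally before submitting. The following is a minimal Python sketch, not the database's own validation procedure: the function names `contig_lengths` and `check_assembly` and the constant names are hypothetical, and the coverage value is assumed to be supplied by the user (it cannot be derived from the assembly FASTA file alone).

```python
from typing import Iterable, List

# Thresholds taken from the assembly metrics table for E. coli.
MIN_GENOME_SIZE = 4_300_000   # bp
MAX_GENOME_SIZE = 5_900_000   # bp
MAX_CONTIGS = 500
MIN_COVERAGE = 40.0           # X


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of each contig found in FASTA-formatted lines."""
    lengths: List[int] = []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            lengths.append(0)  # a header line starts a new contig
        elif lengths:
            lengths[-1] += len(line)  # sequence line of the current contig
    return lengths


def check_assembly(lengths: List[int], coverage: float) -> List[str]:
    """Return a list of problems; an empty list means all thresholds are met."""
    problems: List[str] = []
    total = sum(lengths)
    if len(lengths) > MAX_CONTIGS:
        problems.append(f"too many contigs: {len(lengths)} > {MAX_CONTIGS}")
    if not MIN_GENOME_SIZE <= total <= MAX_GENOME_SIZE:
        problems.append(
            f"cumulative contig length {total} bp outside "
            f"{MIN_GENOME_SIZE}-{MAX_GENOME_SIZE} bp"
        )
    if coverage < MIN_COVERAGE:
        problems.append(f"coverage {coverage}X below {MIN_COVERAGE}X")
    return problems
```

For example, with an assembly file opened as `handle`, `check_assembly(contig_lengths(handle), coverage=55.0)` would return an empty list when the assembly meets all three thresholds. Passing these checks does not guarantee acceptance, since the sequencing technology and culture purity requirements above are not covered by this sketch.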