Data access policy note: since January 1st, 2025, registration and authentication have been mandatory to access all data curated after this date, whether via the web interface or via the application programming interface (see the API authentication help). Please contact us if you have any questions.

Quality criteria for accepting whole genome sequencing data

Users are requested to submit only high-quality assemblies that comply with the KlebNET-GSP quality criteria; assemblies must be generated from pure cultures and sequenced at a minimum coverage of 40X. Assembly files that are highly fragmented (> 500 contigs or N50 < 20 kb), or whose cumulative contig length falls outside the typical range of the Klebsiella pneumoniae (Kp) species complex (~ 4.6 - 6.4 Mbp), will not be accepted. A submission containing low-quality assemblies may be rejected in its entirety.
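As a rough pre-submission check, the general thresholds above can be computed from the contig lengths of an assembly. The following is a minimal sketch, not the database's own validation code: the function names (`read_contig_lengths`, `n50`, `general_issues`) and constants are hypothetical, "20K" is assumed to mean 20,000 bp, and the approximate 4.6 - 6.4 Mbp range is treated as a hard limit for illustration only.

```python
from typing import List, Optional

# Thresholds taken from the paragraph above (names are illustrative).
MAX_CONTIGS = 500
MIN_N50_BP = 20_000            # assumption: "20K" means 20,000 bp
MIN_TOTAL_BP = 4_600_000       # approximate lower bound (~4.6 Mbp)
MAX_TOTAL_BP = 6_400_000       # approximate upper bound (~6.4 Mbp)


def read_contig_lengths(fasta_text: str) -> List[int]:
    """Return the length of each contig in a FASTA-formatted string."""
    lengths: List[int] = []
    current: Optional[int] = None  # None until the first header is seen
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def general_issues(lengths: List[int]) -> List[str]:
    """List the general criteria that an assembly fails (empty list if none)."""
    issues: List[str] = []
    total = sum(lengths)
    if len(lengths) > MAX_CONTIGS:
        issues.append(f"too many contigs: {len(lengths)} > {MAX_CONTIGS}")
    if n50(lengths) < MIN_N50_BP:
        issues.append(f"N50 too low: {n50(lengths)} < {MIN_N50_BP}")
    if not MIN_TOTAL_BP <= total <= MAX_TOTAL_BP:
        issues.append(f"total length out of range: {total} bp")
    return issues
```

For example, `general_issues(read_contig_lengths(open("assembly.fasta").read()))` would list any failed criteria for a local file (the file name is a placeholder). Coverage and culture purity cannot be derived from the assembly alone and are not covered here.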

NB. For quality purposes, we only accept assemblies generated either from high-quality short reads or from a combination of short and long reads (hybrid assemblies). Please note that genomes obtained using long-read sequencing technology alone, or by Ion Torrent/Roche/454, will not be uploaded to the database, nor used to define new alleles. However, if one of these assemblies reveals a new MLST profile composed solely of existing alleles, you may make a 'profile' submission to define a new ST.

Please refer to the assembly metrics below:

| Species | Size of genome (bp) | Number of contigs | C+G% | Coverage (X) |
|---|---|---|---|---|
| K. pneumoniae | 4700000 - 6300000 | ≤ 500 | 56.1 - 58.1 | ≥ 40 |
| K. quasipneumoniae | 4900000 - 6100000 | ≤ 500 | 56.7 - 58.4 | ≥ 40 |
| K. quasivariicola | 5400000 - 6100000 | ≤ 500 | 56.1 - 57.4 | ≥ 40 |
| K. variicola | 5200000 - 6200000 | ≤ 500 | 56.5 - 57.8 | ≥ 40 |
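The species-specific metrics above lend themselves to a simple table-driven check. The sketch below is illustrative only: `SpeciesCriteria`, `CRITERIA` and `check_assembly` are hypothetical names, the ranges are assumed to be inclusive, and the caller is assumed to have already computed genome size, contig count, C+G content and coverage with their own tools.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SpeciesCriteria:
    size_bp: Tuple[int, int]          # accepted genome size range, in bp
    gc_percent: Tuple[float, float]   # accepted C+G content range, in %
    max_contigs: int = 500
    min_coverage: float = 40.0


# Values copied from the assembly metrics table above.
CRITERIA: Dict[str, SpeciesCriteria] = {
    "K. pneumoniae": SpeciesCriteria((4_700_000, 6_300_000), (56.1, 58.1)),
    "K. quasipneumoniae": SpeciesCriteria((4_900_000, 6_100_000), (56.7, 58.4)),
    "K. quasivariicola": SpeciesCriteria((5_400_000, 6_100_000), (56.1, 57.4)),
    "K. variicola": SpeciesCriteria((5_200_000, 6_200_000), (56.5, 57.8)),
}


def check_assembly(species: str, size_bp: int, n_contigs: int,
                   gc_percent: float, coverage: float) -> List[str]:
    """Return the metrics that fall outside the table's ranges (empty if none)."""
    crit = CRITERIA[species]  # raises KeyError for species not in the table
    problems: List[str] = []
    if not crit.size_bp[0] <= size_bp <= crit.size_bp[1]:
        problems.append("genome size")
    if n_contigs > crit.max_contigs:
        problems.append("number of contigs")
    if not crit.gc_percent[0] <= gc_percent <= crit.gc_percent[1]:
        problems.append("C+G%")
    if coverage < crit.min_coverage:
        problems.append("coverage")
    return problems
```

A call such as `check_assembly("K. pneumoniae", 5_500_000, 120, 57.3, 80)` returns an empty list, while an assembly failing every criterion returns all four metric names. Passing this check does not guarantee acceptance, since curators may apply further criteria.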