LIN codes for Listeria monocytogenes

cgMLST-based Life Identification Numbers for Genomic Taxonomy of strains of L. monocytogenes

Quick Presentation of Listeria monocytogenes LIN codes

The cgMLST-based Life Identification Number (LIN) codes (cgLIN codes, or LIN codes for short) constitute a novel genomic taxonomy system for bacterial strains within species. An implementation of LIN codes was developed specifically for Listeria monocytogenes strains. This system provides a stable, hierarchical classification framework that captures phylogenetic diversity from major lineages down to near-identical genomes, which makes it useful for epidemiological surveillance of this critical foodborne pathogen (see Palma, Hennart et al. for details).

General Presentation

The LIN code system for Listeria monocytogenes is built upon a refined version of the widely used Moura et al. core-genome Multi-Locus Sequence Typing (cgMLST) scheme comprising 1,748 conserved loci. This version 2 cgMLST scheme represents a robust genomic typing approach used as a basis for the LIN code genomic taxonomy.

A LIN code is a multi-position integer-based code attributed to each genome based on its cgMLST profile. The L. monocytogenes LIN code system consists of 12 fixed, ordered bins that partition the [0%-100%] range of cgMLST profile similarity. Each bin is defined by a lower (inclusive) and an upper (exclusive) similarity threshold, with the leftmost bins capturing broad phylogenetic divisions (major lineages LI-LIV) and the rightmost bins capturing fine-scale genetic levels for epidemiological surveillance and outbreak investigation.

Additional information on the fundamental concepts of LIN codes, the LIN code attribution process and the implementation of LIN codes in external platforms can be found on the BIGSdb-Pasteur page of the first cgMLST LIN code implementation (https://bigsdb.pasteur.fr/klebsiella/cgmlst-lincodes/) and in the publication by Palma, Hennart et al.

Table 1. Taxonomic levels and associated thresholds of the LIN code strain taxonomy for L. monocytogenes

bin	classification	No. allele mismatches	No. identical alleles	% allele similarity	Purpose
1	Lineage	[1748-1660[	[0-88[	[0.00-5.03[	Phylogenetic level
2	-	[1660-1500[	[88-248[	[5.03-14.19[	Phylogenetic level
3	-	[1500-1342[	[248-406[	[14.19-23.23[	Phylogenetic level
4	-	[1342-1182[	[406-566[	[23.23-32.38[	Phylogenetic level
5	-	[1182-1020[	[566-728[	[32.38-41.65[	Phylogenetic level
6	Sublineage (SL)	[1020-200[	[728-1548[	[41.65-88.56[	Phylogenetic level
7	-	[200-50[	[1548-1698[	[88.56-97.14[	Phylogenetic level
8	Genetic Cluster (GC)	[50-7[	[1698-1741[	[97.14-99.60[	Phylogenetic level
9	-	[7-4[	[1741-1744[	[99.60-99.77[	Epidemiological level
10	-	[4-2[	[1744-1746[	[99.77-99.89[	Epidemiological level
11	-	[2-1[	[1746-1747[	[99.89-99.94[	Epidemiological level
12	-	[1-0[	[1747-1748[	[99.94-100.00[	Epidemiological level
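
As a minimal sketch of how Table 1 can be read, the Python snippet below maps a count of identical alleles between two complete cgMLST profiles to the bin in which that count falls. The names BIN_LOWER_IDENTICAL, percent_similarity and divergence_bin are illustrative and are not part of BIGSdb. The sketch assumes that all 1,748 loci are called in both profiles; it does not cover missing data handling or the actual LIN code attribution process.

```python
from bisect import bisect_right

N_LOCI = 1748  # number of loci in the cgMLST scheme

# Lower (inclusive) bound of each of the 12 bins, expressed as a number
# of identical alleles, copied from Table 1.
BIN_LOWER_IDENTICAL = [0, 88, 248, 406, 566, 728,
                       1548, 1698, 1741, 1744, 1746, 1747]


def percent_similarity(n_identical):
    """Allele similarity (%) for a given number of identical alleles."""
    return 100.0 * n_identical / N_LOCI


def divergence_bin(n_identical):
    """Return the 1-based bin whose [lower, upper[ range contains
    n_identical, or None when the two profiles are fully identical."""
    if not 0 <= n_identical <= N_LOCI:
        raise ValueError("n_identical must be between 0 and N_LOCI")
    if n_identical == N_LOCI:
        return None  # no mismatch, so no bin separates the profiles
    return bisect_right(BIN_LOWER_IDENTICAL, n_identical)


def divergence_bin_from_mismatches(n_mismatches):
    """Same lookup, starting from a number of allelic mismatches."""
    return divergence_bin(N_LOCI - n_mismatches)
```

Under these thresholds, two profiles with 7 mismatches fall in bin 9 and therefore still share bin 8 (the GC level), whereas 8 mismatches fall in bin 8.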

Source database of the LIN codes

To ensure consistency, there should be only one source database for the LIN code definitions. The source of LIN codes for L. monocytogenes genomes is the Listeria monocytogenes sequence definitions database of the BIGSdb-Pasteur platform (BIGSdb-Lm, https://bigsdb.pasteur.fr/listeria/).

A human-readable nomenclature of genetic clusters and sublineages: nicknames of LIN code prefixes

The LIN codes can be used as a nomenclatural system per se, but it is more convenient to designate important groups with human-readable, easy-to-remember nicknames. A prefix is an incomplete part of a LIN code that starts at its leftmost bin and spans 1 to 11 positions. In L. monocytogenes, existing prefixes corresponding to a few key levels of the LIN codes are attributed aliases (nicknames). This enables backward compatibility and easier communication between microbiologists and epidemiologists using LIN codes.

To maximize continuity with established nomenclatures, these nicknames were chosen based on previously existing taxonomic systems. The first level (prefix of size 1) was nicknamed with the classical lineage nomenclature (Table 2).

Table 2. The correspondence between the first LIN code level and major lineages

LIN Prefix	Major Lineage
0_	Lineage I (LI)
1_	Lineage II (LII)
2_	Lineage III (LIII)
3_	Lineage IV (LIV)

Levels 6 and 8 of the LIN codes correspond to the within-species phylogenetic classification levels called 'sublineage' (SL) and 'genetic cluster' (GC), respectively (Table 3). Where possible, each SL nickname matches the predominant sublineage previously defined with the cgMLST cutoff of 150 allelic mismatches (Moura et al.), which creates a dictionary between cgMLST identifiers and the LIN code classification. GC nicknames were initially assigned in decreasing order of total genome counts in the database (as of 2025); new GCs now automatically receive the next number in sequence. The GC level is particularly relevant for epidemiological investigations in L. monocytogenes, as isolates sharing the same GC (i.e., the same LIN code prefix up to level 8) have no more than 7 allelic mismatches, fitting the former cgMLST type (CT) definition for epidemiologically related isolates (Moura et al.).

Table 3. Examples of LIN code prefixes, nicknames and corresponding previous SL classification for sublineages (SL) and genetic clusters (GC)

LIN Prefix (6 levels)	SL Nickname	Majority previous SL	LIN Prefix (8 levels)	GC Nickname
1_0_0_0_13_1	SL6	SL6	1_0_0_0_13_1_3_11	GC1
0_0_0_0_6_0	SL321	SL321	0_0_0_0_6_0_0_15	GC2
0_0_0_19_1_1	SL121	SL121	0_0_0_19_1_1_0_48	GC5
0_0_0_19_1_1	SL121	SL121	0_0_0_19_1_1_0_114	GC7
0_0_0_18_0_0	SL155	SL155	0_0_0_18_0_0_0_41	GC12
1_0_0_0_7_0	SL217	SL217	1_0_0_0_7_0_0_4	GC18
2_51_0_0_8_1	SL2602	SL2602	2_51_0_0_8_1_0_0	GC138
3_1_2_0_0_0	SL562	SL562	3_1_2_0_0_0_0_0	GC1005
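
The lookup from a LIN code to its nicknames can be sketched as a simple prefix match. The Python example below is illustrative only: the dictionaries hold Table 2 and a few rows of Table 3, the helper names lin_prefix and nicknames are hypothetical, and positions 9 to 12 of the example codes are made up for demonstration. The authoritative nickname assignments are those stored in BIGSdb-Lm.

```python
SL_LEVEL = 6  # sublineage level
GC_LEVEL = 8  # genetic cluster level

# First LIN code position -> major lineage (Table 2).
LINEAGES = {"0": "LI", "1": "LII", "2": "LIII", "3": "LIV"}

# A few prefix -> nickname pairs copied from Table 3.
SL_NICKNAMES = {
    "1_0_0_0_13_1": "SL6",
    "0_0_0_19_1_1": "SL121",
}
GC_NICKNAMES = {
    "1_0_0_0_13_1_3_11": "GC1",
    "0_0_0_19_1_1_0_48": "GC5",
    "0_0_0_19_1_1_0_114": "GC7",
}


def lin_prefix(lin_code, level):
    """Return the first `level` positions of an underscore-separated code."""
    parts = lin_code.split("_")
    if not 1 <= level <= len(parts):
        raise ValueError("level is outside the length of the LIN code")
    return "_".join(parts[:level])


def nicknames(lin_code):
    """Return (lineage, SL, GC) nicknames; None where no nickname is known."""
    return (
        LINEAGES.get(lin_prefix(lin_code, 1)),
        SL_NICKNAMES.get(lin_prefix(lin_code, SL_LEVEL)),
        GC_NICKNAMES.get(lin_prefix(lin_code, GC_LEVEL)),
    )


# Hypothetical full codes: only the first 8 positions come from Table 3.
isolate_a = "0_0_0_19_1_1_0_48_0_0_0_0"
isolate_b = "0_0_0_19_1_1_0_114_0_0_0_0"
same_sl = lin_prefix(isolate_a, SL_LEVEL) == lin_prefix(isolate_b, SL_LEVEL)
same_gc = lin_prefix(isolate_a, GC_LEVEL) == lin_prefix(isolate_b, GC_LEVEL)
```

In this example the two isolates share the SL121 prefix but belong to different genetic clusters (GC5 and GC7), as in the corresponding rows of Table 3.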

Implementation and Database Access

BIGSdb-Pasteur Listeria monocytogenes Platform

The L. monocytogenes LIN code system is implemented and maintained in the BIGSdb-Pasteur Listeria monocytogenes platform (BIGSdb-Lm, https://bigsdb.pasteur.fr/listeria/). This platform serves as the authoritative source for:

🧬 cgMLST allele and profile definitions

🔗 LIN code definitions and assignments

🏷️ Nickname assignments for lineages and genetic clusters

🌐 Standardized nomenclature platform for global comparative genomics

BIGSdb-Lm database URL: https://bigsdb.pasteur.fr/listeria/

Advantages of Using the LIN code Nomenclature

The cgMLST-based LIN code system for L. monocytogenes offers several critical advantages over previous typing methods:

Enhanced Phylogenetic Resolution

The 1,748 loci provide precise discrimination between closely related isolates in outbreak investigations
The high number of genetic markers provides meaningful phylogenetic context across all four major lineages

Nomenclatural Stability and Backwards Compatibility

LIN codes are stable by design, avoiding the group fusion issues of single-linkage clustering
Once assigned, previously attributed LIN codes are static regardless of new genome additions
Human-readable nicknames provide continuity with established lineage nomenclature
Shared nomenclature enables global communication, unlike local or private nomenclatures, and provides precise phylogenetic context, unlike arbitrary ST numbers

Flexible Cluster Analysis Using Multiple Levels

The 12 hierarchical levels accommodate diverse biological questions or epidemiological investigations
LIN codes enable analysis from broad population structure (lineages) to fine-scale transmission (0-7 allelic differences)
Shared LIN prefixes indicate minimum and maximum similarity levels between isolates, providing genetic relatedness information embedded within the LIN codes
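
The last point can be illustrated with a short Python sketch that reads the range of allelic mismatches indicated by Table 1 from the length of the prefix shared by two LIN codes. The function names are hypothetical, and the result should be taken as the range the bin thresholds indicate under the assumption of complete 12-position codes, not as a measured pairwise distance.

```python
N_LOCI = 1748

# Lower (inclusive) bound of each bin in identical alleles, from Table 1.
BIN_LOWER_IDENTICAL = [0, 88, 248, 406, 566, 728,
                       1548, 1698, 1741, 1744, 1746, 1747]
N_BINS = len(BIN_LOWER_IDENTICAL)


def shared_prefix_length(code_a, code_b):
    """Number of leading positions that two LIN codes have in common."""
    a, b = code_a.split("_"), code_b.split("_")
    if len(a) != N_BINS or len(b) != N_BINS:
        raise ValueError("expected complete 12-position LIN codes")
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def indicated_mismatch_range(code_a, code_b):
    """Return (min, max) allelic mismatches, both inclusive, indicated by
    the first bin at which the two codes differ."""
    k = shared_prefix_length(code_a, code_b)
    if k == N_BINS:
        return (0, 0)  # identical codes
    lower = BIN_LOWER_IDENTICAL[k]  # bin k + 1, inclusive lower bound
    # Exclusive upper bound: next bin's lower bound, or N_LOCI for bin 12.
    upper = BIN_LOWER_IDENTICAL[k + 1] if k + 1 < N_BINS else N_LOCI
    return (N_LOCI - upper + 1, N_LOCI - lower)
```

For instance, two codes that share their first 8 positions (same GC) and differ at position 9 are indicated to have between 5 and 7 allelic mismatches.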

Data Submission and Curation

For consistency and accuracy, there should ideally be only one authoritative source for L. monocytogenes LIN code definitions. Researchers are encouraged to submit genomic data to the BIGSdb-Pasteur platform for LIN code assignment and integration into the shared L. monocytogenes genomic taxonomy.

📋 Submission Guidelines: https://bigsdb.pasteur.fr/submission-procedure-for-data-curation/

Key References

Delgado-Blas JF et al. The cgMLST-based LIN code system for Listeria monocytogenes genomic taxonomy. In preparation.

Hennart M, Guglielmini J, Bridel S, Maiden MCJ, Jolley KA, Criscuolo A, Brisse S. A Dual Barcoding Approach to Bacterial Strain Nomenclature: Genomic Taxonomy of Klebsiella pneumoniae Strains. Mol Biol Evol. 2022;39(7):msac135. https://doi.org/10.1093/molbev/msac135

Palma F, Hennart M, Jolley KA, Crestani C, Wyres KL, Bridel S, Yeats CA, Brancotte B, Raffestin B, David S, Lam MMC, Izdebski R, Passet V, Rodrigues C, Rethoret-Pasty M, Combary A, Cottis S, Maiden MCJ, Aanensen DM, Holt KE, Criscuolo A, Brisse S. Life Identification Numbers: A strain nomenclature approach to aid epidemiological surveillance of bacterial pathogens. PLoS Biol. 2026 Jun 4;24(6):e3003781. https://doi.org/10.1371/journal.pbio.3003781

Moura A, Criscuolo A, Pouseele H, Maury MM, Leclercq A, Tarr C, Björkman JT, Dallman T, Reimer A, Enouf V, Larsson JT, Carleton HA, Bracq-Dieye H, Katz LS, Jones L, Touchon M, Tourdjman M, Walker M, Stroika S, Cantinelli T, Chenal-Francisque V, Kucerova Z, Rocha EP, Nadon C, Lecuit M. Whole genome-based population biology and epidemiological surveillance of Listeria monocytogenes. Nat Microbiol. 2016;2:16185. https://doi.org/10.1038/nmicrobiol.2016.185
