Data access policy note: since January 1st, 2025, registration and authentication are mandatory to access all data curated after this date, either via the web interface or via the application programming interface (API authentication help link).

To request an API key, follow the steps outlined at: Requesting a BIGSdb-Pasteur API Key

Please contact us if you have any questions.

Getting an account

If you do not have an account yet, please create one through the Register for a site-wide account page, then register your account with the database to which you wish to submit data. If you already have an account associated with a database, you can reset your password, update your profile and register your account with specific databases on the site-wide account settings page.

Please ensure that you enter a valid e-mail address, as account validation details will be sent to it. Please fill in your details completely, capitalizing the first letter of names and giving full affiliation details (avoiding acronyms). This information will appear with any data that you submit.

Genomes curation (RECOMMENDED):

We accept whole genome sequence data in FASTA format:

single contig of closed chromosome
multifasta file with closed chromosome and plasmids from the same isolate
multi-contig files (whole genome shotgun)
scaffold files

Each file should correspond to a single isolate. Users are requested to provide only high-quality sequences, generated from pure cultures sequenced at a minimum coverage of 40X. Assembly files that consist of a high number of contigs (> 500) or that have a cumulative contig length outside the typical range for the species will not be accepted.
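The contig-count and cumulative-length criteria can be checked before contacting the curators. The following is a minimal sketch, not the curators' own procedure: the function names are hypothetical, and the expected length range for the species (min_length, max_length) is an assumption that you must supply yourself, since no values are given here. Coverage cannot be derived from the assembly file and is not checked.

```python
from typing import Iterable, List, Tuple

MAX_CONTIGS = 500  # limit stated in the guidelines above


def fasta_stats(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of contigs, cumulative contig length) for FASTA lines."""
    contigs = 0
    total_length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            contigs += 1  # each header line starts a new contig
        else:
            total_length += len(line)  # sequence line
    return contigs, total_length


def assembly_problems(lines: Iterable[str], min_length: int, max_length: int) -> List[str]:
    """List the reasons an assembly would fail the two file-level criteria.

    min_length and max_length (in bp) are user-supplied assumptions for the
    typical genome size range of the species.
    """
    contigs, total = fasta_stats(lines)
    problems = []
    if contigs == 0:
        problems.append("no FASTA records found")
    if contigs > MAX_CONTIGS:
        problems.append(f"{contigs} contigs exceeds the limit of {MAX_CONTIGS}")
    if not (min_length <= total <= max_length):
        problems.append(
            f"cumulative length {total} bp is outside {min_length}-{max_length} bp"
        )
    return problems
```

To check a file on disk, pass an open file handle, for example assembly_problems(open("ATCC13883.fas"), min_length, max_length) with a range appropriate for your species.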

Submission of whole genome sequences should be accompanied by the standard isolate submission template file. Please download the Template for isolates submission from the homepage and fill it out with the isolate's information. Please find here the required fields. Please note that the assembly files should be named with an identifier corresponding to the isolate name. For example, if the isolate name is ATCC13883 in the isolates template, please label your assembly file as ATCC13883.fas. The suffixes .fa, .fasta or .fna are also acceptable.
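The naming rule can be verified before sending files. This is a minimal sketch with hypothetical function names; it assumes the file name must equal the isolate name exactly (case-sensitive) followed by one of the four suffixes listed above, which is a strict reading of the rule.

```python
from pathlib import PurePath
from typing import Iterable, List

# Suffixes listed as acceptable in the guidelines above.
ACCEPTED_SUFFIXES = {".fas", ".fa", ".fasta", ".fna"}


def assembly_name_matches(filename: str, isolate_name: str) -> bool:
    """True if filename is '<isolate_name><accepted suffix>'."""
    path = PurePath(filename)
    return path.suffix in ACCEPTED_SUFFIXES and path.stem == isolate_name


def isolates_without_assembly(isolate_names: Iterable[str], filenames: Iterable[str]) -> List[str]:
    """Return isolate names from the template that have no matching assembly file."""
    files = list(filenames)
    return [
        name
        for name in isolate_names
        if not any(assembly_name_matches(f, name) for f in files)
    ]
```

For the example above, assembly_name_matches("ATCC13883.fas", "ATCC13883") holds, while a file named ATCC13883.txt would be reported as missing.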

Before submitting whole genome sequence data, please contact the curators in order to agree on how to send the data (please do not attach assemblies to e-mails).

Alleles and profiles curation:

To incorporate your profile and isolate information into the database, please provide us with the user identification and the following information:

For each new allele that does not already exist in the database, please send us at least two chromatogram traces (for example in .ab1 or .scf format). Ideally, each nucleotide over the length of the template should be supported by at least two traces without ambiguity. Please label your chromatogram trace files starting with the strain code (avoiding special characters such as spaces, dots, slashes, etc.), followed by an underscore, followed by the gene name written exactly as in the database, followed by an underscore, followed by any information you would like, followed by '.ab1' or '.scf'. Example: SB1_ddlA_398509385.ab1
For each new profile that does not already exist in the database, we request that at least one reference strain be submitted to the isolates database. Please use the Excel templates available from the home page to provide isolate data. Please find here the required fields. You can submit multiple MLST profiles in a single submission. The submitted profile should correspond to the results obtained by querying your sequences against the BIGSdb-Pasteur Alleles and profiles database, as this database may be more up to date than external databases that download and provide our curated data on other platforms.
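The trace file naming convention described above can be checked with a short script. This is a minimal sketch under stated assumptions: the function names are hypothetical, strain codes are assumed to contain only letters, digits and hyphens, gene names are assumed to contain no underscore, and the set of gene names (known_genes) must be supplied by you from the database.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional, Set, Tuple

# Assumption: a strain code is made of letters, digits and hyphens only.
STRAIN_CODE = re.compile(r"[A-Za-z0-9-]+")
TRACE_SUFFIXES = (".ab1", ".scf")


def parse_trace_name(filename: str, known_genes: Set[str]) -> Optional[Tuple[str, str, str]]:
    """Split '<strain>_<gene>_<info>.ab1|.scf' into its parts, or return None."""
    for suffix in TRACE_SUFFIXES:
        if filename.endswith(suffix):
            base = filename[: -len(suffix)]
            break
    else:
        return None  # neither .ab1 nor .scf
    parts = base.split("_", 2)  # the free-text part may itself contain underscores
    if len(parts) != 3:
        return None
    strain, gene, info = parts
    if not STRAIN_CODE.fullmatch(strain) or gene not in known_genes:
        return None  # gene name must match the database spelling exactly
    return strain, gene, info


def under_supported(filenames: Iterable[str], known_genes: Set[str], minimum: int = 2) -> List[Tuple[str, str]]:
    """Return (strain, gene) pairs that have fewer than `minimum` valid traces."""
    counts = Counter()
    for name in filenames:
        parsed = parse_trace_name(name, known_genes)
        if parsed is not None:
            counts[parsed[:2]] += 1
    return sorted(pair for pair, n in counts.items() if n < minimum)
```

With known_genes = {"ddlA"}, the example name SB1_ddlA_398509385.ab1 parses into the strain code SB1, the gene ddlA and the free-text part 398509385.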

Data release

We aim to process submissions within 48h.

Once the curation is complete, allele definitions and profiles will be released immediately, as they form the basis for the public nomenclature of genotypes.

We encourage the submission of all isolates from your studies, so that the database is as representative of natural populations as possible. Please do not restrict yourself to submitting only isolates that represent new profiles: isolates with already known MLST types are valuable as well.

We also encourage immediate release of your isolate provenance data and genomic sequences. However, if you would like to request an embargo period (e.g. prior to publication), please specify it to the curators in the submission request e-mail. Data should ideally be released when your publication is accepted, and no later than one year after submission to our database. Curators may automatically release your data to the public database after this period. You will be informed before your data are released.

The proposed release date should be the same for all isolates of the submission batch. Please provide this information in the isolates submission template or in the submission confirmation e-mail.
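The embargo rules above (a single release date per batch, no later than one year after submission) can be checked with simple date arithmetic. This is a minimal sketch with hypothetical function names; treating "one year" as the same calendar day in the following year is an assumption.

```python
from datetime import date
from typing import Iterable


def latest_release_date(submitted: date) -> date:
    """Same calendar day one year after submission (assumed meaning of 'one year')."""
    try:
        return submitted.replace(year=submitted.year + 1)
    except ValueError:
        # 29 February has no counterpart in the following year; use 28 February.
        return submitted.replace(year=submitted.year + 1, day=28)


def release_date_acceptable(submitted: date, proposed: date) -> bool:
    """True if the proposed date is within one year of submission."""
    return submitted <= proposed <= latest_release_date(submitted)


def batch_has_single_date(proposed_dates: Iterable[date]) -> bool:
    """True if every isolate in the batch carries the same release date."""
    return len(set(proposed_dates)) == 1
```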

You will be notified and asked to check that all data integrated into the BIGSdb database are correct. We kindly ask you to inform us of any metadata corrections so that we can keep the records up to date.

Acknowledgments

Curation and maintenance of BIGSdb-Pasteur databases is performed on a voluntary basis by a handful of colleagues, who dedicate their time to the community.

We would appreciate it if you could recognize our efforts in the acknowledgments section of your publications:

We thank the Institut Pasteur teams for the curation and maintenance of BIGSdb-Pasteur databases at http://bigsdb.pasteur.fr/.

References

Please cite the original publications on the development of nomenclature schemes listed on the references page.

Collaboration

For submissions of large amounts of data, or when you would like to request further analyses in addition to curation, we may ask you to include the member of the curation team in charge of analyzing your data among your co-authors.
