How to qualify a gene (structure and localization)
Extract from the list of standardized IMGT/LIGM-DB keywords
Use the following keywords as far as possible to describe your sequences when
you submit them to a generalist database (EMBL-Bank, GenBank, DDBJ).
StructureType
The structure type distinguishes sequences that show a classical organization (regular) from those that have been modified, either naturally (spliced, processed...) or artificially (humanized, engineered...).
- alternative splicing
Identifies the gene structures that have characteristics of potential alternative splicing (for example, IG genes with features for potential secreted and membrane chains).
- engineered
Identifies, whatever the molecule type, the sequences that have been modified by deliberate mutagenesis in vitro [1 source].
- fusion
Identifies, whatever the molecule type, the sequences that do not have a classical organization and result from the fusion in vivo or in vitro of molecules from two (or more) different sources [2 (or more) sources].
- immunotoxin
Identifies an IG (or antibody) or RPI protein fused or conjugated with a toxin, obtained in vitro.
- linker
Identifies a short sequence used to link two other sequences.
- membrane
Identifies, whatever the molecule type, the sequences that have a transmembrane exon or region allowing for a transmembrane chain.
- partially-processed
Identifies the sequences of genes (usually orphons) that have lost part of their introns.
- partially-spliced
Identifies the transcripts or cDNA sequences that have undergone partial RNA processing or splicing.
- processed
Identifies the sequences of genes (usually orphons) that have lost their introns.
- regular
Identifies, whatever the molecule type, the sequences that have a classical organization without in vivo or in vitro modification.
- secreted
Identifies, whatever the molecule type, the sequences that have a hydrophilic C-terminal exon or region allowing for a secreted soluble chain, e.g. presence of the CH-S sequence for IGHC.
- spliced
Identifies the transcripts or cDNA sequences that have undergone complete RNA processing or splicing.
- sterile transcript
Identifies the transcripts that cannot be translated in vivo, and the corresponding cDNA. For IG or TR, examples include transcripts of V, D or J genes in germline configuration (also designated as "germline transcripts"), transcripts of C genes in undefined configuration, transcripts of switch regions, and the corresponding cDNA.
- truncated
Identifies a shortened protein (missing amino acids) owing to a premature STOP-CODON.
- unspliced
Identifies the transcripts or cDNA sequences that have not undergone RNA processing or splicing.
- unusual
Identifies an IG or TR gene with unexpected feature(s) (for instance, insertion of unknown sequences, unexpected rearrangements by inversion...).
- vector
Identifies a sequence from a cloning vector.
LocationType
- orphon
Identifies, whatever the molecule type, a gene that is found in vivo on a different locus from the main locus (either on the same chromosome or on another chromosome).
- transgene
Identifies, whatever the molecule type, a gene that is artificially introduced into a multicellular organism (mouse, plant...).
- translocated
Identifies, whatever the molecule type, a gene that results from a translocation (in vivo).
- transposed
Identifies, whatever the molecule type, a transgene or a retrotransposon that is permanently inserted in a chromosome.
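Since the keywords above form a closed vocabulary, a submitter may find it handy to check an annotation against them before sending a sequence to a generalist database. The following is only a minimal sketch under stated assumptions: the function name `classify_keywords` is hypothetical, the two sets are simply transcribed from the lists on this page, and the case-insensitive, whitespace-tolerant matching is a convenience of the sketch, not an IMGT rule.

```python
# Controlled vocabulary transcribed from the keyword lists on this page.
STRUCTURE_TYPES = frozenset({
    "alternative splicing", "engineered", "fusion", "immunotoxin", "linker",
    "membrane", "partially-processed", "partially-spliced", "processed",
    "regular", "secreted", "spliced", "sterile transcript", "truncated",
    "unspliced", "unusual", "vector",
})
LOCATION_TYPES = frozenset({
    "orphon", "transgene", "translocated", "transposed",
})


def classify_keywords(keywords):
    """Sort free-text keywords into StructureType, LocationType, or unknown.

    Hypothetical helper: matching ignores case and extra whitespace,
    which is an assumption of this sketch rather than a database rule.
    """
    result = {"StructureType": [], "LocationType": [], "unknown": []}
    for raw in keywords:
        # Collapse runs of whitespace and lowercase before lookup.
        key = " ".join(raw.split()).lower()
        if key in STRUCTURE_TYPES:
            result["StructureType"].append(key)
        elif key in LOCATION_TYPES:
            result["LocationType"].append(key)
        else:
            # Keep the original spelling so the submitter can correct it.
            result["unknown"].append(raw)
    return result


if __name__ == "__main__":
    print(classify_keywords(["Processed", " orphon ", "pseudo-spliced"]))
```

Any keyword that ends up under `unknown` is not one of the standardized StructureType or LocationType terms listed here and should be replaced by the closest listed term, where one applies.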
- Author: Laëtitia Regnier
- Last updated: 30/05/2016