IMGT functionality
Citing IG and TR: Lefranc M-P. Immunoglobulin (IG)
and T cell receptor genes (TR): IMGT® and the birth and rise of
immunoinformatics. Front Immunol. Feb 05;5:22 (2014).
doi:10.3389/fimmu.2014.00022.
Open access. PMID: 24600447, LIGM: 429
The IMGT 'functionality' concept is part of the 'IDENTIFICATION' axiom
of IMGT-ONTOLOGY [1] [2] [3] [4].
IG and TR
The identification of the functionality of an IG or TR entity is
based, for rearranged entities, on the sequence analysis of the entity
(productive or unproductive) and, for germline or undefined entities,
on the sequence analysis of the corresponding gene unit (functional
(F), Open Reading Frame (ORF) or pseudogene (P)) [4].
IG and TR rearranged entities
A rearranged IG or TR (genomic or cDNA) entity is
productive or unproductive.
- PRODUCTIVE
A rearranged IG or TR (genomic or cDNA) entity is productive if the
coding region has an open reading frame, with no stop codon and no
defect described in the initiation codon, splicing sites and/or
regulatory elements, and an in-frame JUNCTION.
productive (IMGT keyword)
Identifies, whatever the molecule type, the
functionality of entity sequences in rearranged or partially
rearranged configuration, whose coding region has an open reading
frame without stop codon and without described defects in the
initiation codon, splicing sites and/or regulatory elements.
Furthermore, for IG and TR, there is an in-frame junction.
- UNPRODUCTIVE
An unproductive rearranged IG or TR (genomic or cDNA) entity is
characterized by an out-of-frame JUNCTION and/or the presence of stop
codon(s) and/or frameshift mutation(s), and/or a defect described in
the splicing sites and/or the regulatory element(s), and/or unusual
features (TRANSLOCATED, GENE FUSION...) and/or changes of conserved
amino acids demonstrated as leading to incorrect folding.
unproductive (IMGT keyword)
Identifies, whatever the molecule type, the
functionality of entity sequences in rearranged or partially
rearranged configuration, whose coding region has stop codon(s)
and/or frameshift mutation(s), and/or if a mutation affects the
initiation codon, and/or if there are defects in the splicing sites
and/or in the regulatory element(s), and/or there are unusual
features (translocated, gene fusion...) and/or changes of conserved
amino acids demonstrated as leading to incorrect folding.
Furthermore, for IG and TR, there is an out-of-frame junction.
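The two definitions above can be read as a single decision: a rearranged
entity is productive only when none of the listed defects is present.
The following is a minimal sketch of that reading, not an IMGT tool. The
record type `RearrangedEntity`, its field names and the helper functions
are hypothetical names introduced for illustration, and the coding
sequence is assumed to be supplied already in reading frame 1.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class RearrangedEntity:
    """Hypothetical record describing a rearranged IG or TR entity."""
    coding_seq: str                   # coding region, assumed to start in frame 1
    junction_in_frame: bool           # True if the JUNCTION is in-frame
    frameshift: bool = False          # frameshift mutation(s) described
    described_defect: bool = False    # initiation codon, splicing sites, regulatory elements
    unusual_feature: bool = False     # e.g. translocated, gene fusion
    misfolding_change: bool = False   # conserved AA change shown to cause incorrect folding


def has_stop_codon(coding_seq: str) -> bool:
    """Return True if any complete in-frame codon is a stop codon."""
    seq = coding_seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


def rearranged_functionality(entity: RearrangedEntity) -> str:
    """Return 'productive' or 'unproductive' following the criteria above."""
    unproductive = (
        not entity.junction_in_frame
        or has_stop_codon(entity.coding_seq)
        or entity.frameshift
        or entity.described_defect
        or entity.unusual_feature
        or entity.misfolding_change
    )
    return "unproductive" if unproductive else "productive"
```

For example, `rearranged_functionality(RearrangedEntity("ATGGCCTGGTAC", True))`
evaluates to `"productive"`, whereas the same sequence with
`junction_in_frame=False` evaluates to `"unproductive"`.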
IG and TR germline and undefined entities
A germline (V-GENE, D-GENE or J-GENE) or an undefined C-GENE IG or TR
entity is functional (F), Open Reading Frame (ORF) or pseudogene (P).
The functionality is defined by the sequence analysis of the gene unit
(L-V-GENE-UNIT, D-GENE-UNIT, J-GENE-UNIT or C-GENE-UNIT).
IDENTIFICATION (IMGT keywords) | DESCRIPTION (IMGT labels) |
Entity type | Configuration type | Molecule type | Functionality | Entity prototype | Gene unit |
V-gene | germline | gDNA | F, ORF, P | V-GENE | L-V-GENE-UNIT |
D-gene | germline | gDNA | F, ORF, P | D-GENE | D-GENE-UNIT |
J-gene | germline | gDNA | F, ORF, P | J-GENE | J-GENE-UNIT |
C-gene | undefined | gDNA | F, ORF, P | C-GENE | C-GENE-UNIT |
- FUNCTIONAL
A germline entity (V-GENE, D-GENE or J-GENE) or a C-GENE is
functional (F) if the coding region of the respective corresponding gene
unit (L-V-GENE-UNIT, D-GENE-UNIT, J-GENE-UNIT or C-GENE-UNIT) has an
open reading frame without stop codon, and if there is no described
defect in the splicing sites, recombination signals and/or regulatory
elements.
functional (IMGT keyword)
Identifies, whatever the molecule type, the
functionality of entity sequences in undefined or germline
configuration, whose coding region has an open reading frame without
stop codon, no defect in the splicing sites, recombination signals
and/or regulatory elements.
FUNCTIONAL (IMGT annotation rules)
Germline V and J genes and alleles with a STOP-CODON at the region end
Germline V and J genes and alleles whose only defect is an in-frame
STOP-CODON at their region end (the 3' last codon for a V-REGION, the
5' first codon for a J-REGION, based on IMGT [3]) are assigned the IMGT
functionality 'functional'. Indeed, this STOP-CODON may very frequently
disappear during V-(D)-J rearrangements owing to the mechanisms of
junctional and/or N-diversity. The resulting rearranged sequences are
classically either productive or unproductive.
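As an illustration of this rule, the sketch below checks whether the
only stop codon of a region sits at the tolerated end: the last codon of
a V-REGION or the first codon of a J-REGION. It is a hedged example
under stated assumptions: the function names are hypothetical, and the
input is assumed to be a nucleotide sequence already trimmed so that its
first nucleotide starts a codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_positions(in_frame_seq: str) -> list:
    """Return the 0-based codon indexes of all in-frame stop codons."""
    seq = in_frame_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return [idx for idx, codon in enumerate(codons) if codon in STOP_CODONS]


def stop_is_only_at_region_end(in_frame_seq: str, region: str) -> bool:
    """True if there is exactly one stop codon and it is at the tolerated end.

    Assumption: `region` is either 'V-REGION' (3' last codon tolerated)
    or 'J-REGION' (5' first codon tolerated).
    """
    n_codons = len(in_frame_seq) // 3
    if region == "V-REGION":
        tolerated = n_codons - 1
    elif region == "J-REGION":
        tolerated = 0
    else:
        raise ValueError(f"unsupported region: {region}")
    return stop_codon_positions(in_frame_seq) == [tolerated]
```

A sequence for which this check holds, and which has no other defect,
would keep the functionality 'functional' under the rule above. A stop
codon anywhere else in the region falls under the pseudogene criteria.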
- ORF (Open Reading Frame)
A germline entity (V-GENE, D-GENE or J-GENE) or a C-GENE is qualified
as ORF (Open Reading Frame) if the coding region of the respective
corresponding gene unit (L-V-GENE-UNIT, D-GENE-UNIT, J-GENE-UNIT or
C-GENE-UNIT) has an open reading frame, but:
- alterations have been described in the splicing sites,
recombination signals and/or regulatory elements.
- and/or changes of conserved amino acids have been suggested
by the authors to lead to incorrect folding.
- and/or the entity is an orphon.
ORF (IMGT keyword)
Identifies, whatever the molecule type, the
functionality of entity sequences in undefined or germline
configuration, whose coding region has an open reading frame (ORF),
but alterations have been described in the splicing sites,
recombination signals and/or regulatory elements, and/or changes of
conserved amino acids have been suggested to lead to incorrect
folding, and/or the entity is an orphon.
ORF (IMGT annotation rules)
A gene unit is ORF if:
- one of the following labels is noncanonical: DONOR-SPLICE,
ACCEPTOR-SPLICE, X-HEPTAMER and/or X-NONAMER.
- and/or conserved motifs are mutated: J-MOTIF, 1st-CYS,
CONSERVED-TRP and/or 2nd-CYS.
- and/or there is a deletion of more than 3 AA in a FR-IMGT
or a CDR-IMGT.
- and/or the length of V-INTRON or X-SPACER is unexpected
(a difference of 3 nt for the X-SPACER, or a V-INTRON > 500 nt).
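The ORF annotation rules above can be sketched as a checklist over the
annotated features of a gene unit whose coding region has an open
reading frame. This is only an illustration: the `GeneUnitFeatures`
record and its field names are hypothetical, and the spacer rule is read
here as "the X-SPACER length differs from the expected length by 3 nt or
more", which is an assumption about the wording of the rule.

```python
from dataclasses import dataclass, field

SPLICE_AND_SIGNAL_LABELS = ("DONOR-SPLICE", "ACCEPTOR-SPLICE",
                            "X-HEPTAMER", "X-NONAMER")
CONSERVED_MOTIFS = ("J-MOTIF", "1st-CYS", "CONSERVED-TRP", "2nd-CYS")


@dataclass
class GeneUnitFeatures:
    """Hypothetical summary of an annotated gene unit with an open reading frame."""
    noncanonical_labels: set = field(default_factory=set)
    mutated_motifs: set = field(default_factory=set)
    max_deletion_aa: int = 0        # largest deletion in any FR-IMGT or CDR-IMGT
    spacer_length_diff_nt: int = 0  # absolute difference from the expected X-SPACER length
    v_intron_length_nt: int = 0


def orf_reasons(features: GeneUnitFeatures) -> list:
    """Return the list of ORF criteria met; an empty list means none applies."""
    reasons = []
    for label in SPLICE_AND_SIGNAL_LABELS:
        if label in features.noncanonical_labels:
            reasons.append(f"noncanonical {label}")
    for motif in CONSERVED_MOTIFS:
        if motif in features.mutated_motifs:
            reasons.append(f"mutated {motif}")
    if features.max_deletion_aa > 3:
        reasons.append("deletion of more than 3 AA in a FR-IMGT or CDR-IMGT")
    if features.spacer_length_diff_nt >= 3:  # assumed reading of the rule
        reasons.append("unexpected X-SPACER length")
    if features.v_intron_length_nt > 500:
        reasons.append("unexpected V-INTRON length")
    return reasons
```

Returning the reasons rather than a single boolean keeps the evidence
for the assignment visible, which mirrors the way the rules are listed.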
- PSEUDOGENE
A germline entity (V-GENE, D-GENE or J-GENE) or a C-GENE is qualified
as pseudogene if the coding region of the respective corresponding
gene unit (L-V-GENE-UNIT, D-GENE-UNIT, J-GENE-UNIT or C-GENE-UNIT)
has stop codon(s) and/or frameshift mutation(s).
In particular, a L-V-GENE-UNIT is considered as pseudogene if these
defects occur in the L-PART1 and/or V-EXON, or if there is a mutation
in the L-PART1 INIT-CODON atg.
A J-GENE-UNIT is considered as pseudogene if it has been identified by
the presence of a recombination signal upstream of an open reading
frame, but it has no donor splicing site in 5', or the donor splice is
not in the expected splicing frame (sf1), or no J-MOTIF is identified.
Pseudogene (IMGT keyword)
Identifies, whatever the molecule type, the
functionality of entity sequences in undefined or germline
configuration, whose coding region has stop codon(s) and/or
frameshift mutation(s), and/or a mutation affects the initiation
codon.
Pseudogene (IMGT annotation rules)
A gene unit is pseudogene (P) if:
- there is a STOP-CODON in the coding region (except the
last codon of the V-REGION and the first codon of the J-REGION).
- and/or there is an insertion or a deletion non-multiple of
3 that causes a frameshift.
- and/or the splicing between DONOR-SPLICE and
ACCEPTOR-SPLICE forms a STOP-CODON.
- and/or one of the following labels is missing: INIT-CODON,
L-PART1, L-PART2, DONOR-SPLICE, ACCEPTOR-SPLICE, X-HEPTAMER and/or
X-NONAMER.
- and/or a part of the V-REGION is missing (too heavily mutated
to be described).
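A minimal sketch of the pseudogene annotation rules, applied to a
L-V-GENE-UNIT, is given below. It is illustrative only and rests on
several assumptions: the function name and parameters are hypothetical,
the input is the spliced coding sequence (L-PART1 + L-PART2 + V-REGION)
in frame 1, so that a STOP-CODON created by splicing appears as an
ordinary in-frame stop, and the labels are passed with the generic X-
prefix used in the rules.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
REQUIRED_LABELS = {"INIT-CODON", "L-PART1", "L-PART2", "DONOR-SPLICE",
                   "ACCEPTOR-SPLICE", "X-HEPTAMER", "X-NONAMER"}


def pseudogene_reasons(spliced_coding_seq, present_labels,
                       indel_lengths_nt=(), v_region_incomplete=False):
    """Return the pseudogene criteria met by a L-V-GENE-UNIT (empty list if none)."""
    reasons = []
    seq = spliced_coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # The last codon of the V-REGION is exempt from the STOP-CODON rule.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        reasons.append("STOP-CODON in coding region")
    # An insertion or deletion whose length is not a multiple of 3 shifts the frame.
    if any(length % 3 != 0 for length in indel_lengths_nt):
        reasons.append("frameshift insertion or deletion")
    missing = REQUIRED_LABELS - set(present_labels)
    if missing:
        reasons.append("missing label(s): " + ", ".join(sorted(missing)))
    if v_region_incomplete:
        reasons.append("part of the V-REGION is missing")
    return reasons
```

For a J-GENE-UNIT the exempt codon would be the first codon of the
J-REGION instead, and the leader labels would not apply; the sketch does
not cover that case.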
Gene or allele functionality shown between parentheses or brackets
Gene or allele functionality is shown between parentheses, (F) or (P),
when the accession number refers to rearranged genomic DNA or cDNA and
the corresponding germline gene has not yet been isolated.
Gene or allele functionality is shown between brackets, [F] or [P],
when the accession number refers to genomic DNA that is not known to
be germline or rearranged.
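The display convention can be sketched as a small formatting helper.
The function name and the three `evidence` values are hypothetical
labels chosen for this example, and showing the code bare when the
sequence is germline is an assumption inferred from the rest of the
page rather than stated in the rule itself.

```python
def format_functionality(code: str, evidence: str) -> str:
    """Format a functionality code according to the origin of the sequence.

    evidence (hypothetical values):
      'germline'        -> code shown bare (assumption)
      'rearranged'      -> rearranged gDNA or cDNA, germline gene not yet isolated
      'unknown_genomic' -> genomic DNA, not known to be germline or rearranged
    """
    if evidence == "germline":
        return code
    if evidence == "rearranged":
        return f"({code})"
    if evidence == "unknown_genomic":
        return f"[{code}]"
    raise ValueError(f"unknown evidence type: {evidence}")
```

With these assumptions, `format_functionality("F", "rearranged")` gives
`"(F)"` and `format_functionality("P", "unknown_genomic")` gives `"[P]"`.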
Partial gene unit sequences
Functionality is assigned taking into account, whenever possible, the
data from other sequences of the same gene.
A partial sequence with an open reading frame will be considered as
ORF, and not functional, unless the authors clearly state otherwise.
Genes other than IG or TR
For genes other than IG or TR (L-GENE or GENE), the functionality
instances are the same as for the IG and TR germline or undefined
entities, that is FUNCTIONAL, ORF and PSEUDOGENE, and are based on the
sequence of the gene units (L-GENE-UNIT, GENE-UNIT).
IDENTIFICATION (IMGT keywords) | DESCRIPTION (IMGT labels) |
Entity type | Configuration type | Molecule type | Functionality | Entity prototype | Gene unit |
L-conventional-gene | undefined | gDNA | F, ORF, P | L-GENE | L-GENE-UNIT |
conventional-gene | undefined | gDNA | F, ORF, P | GENE | GENE-UNIT |