IMGT®, the international ImMunoGeneTics information system®


Frequently asked questions

Human

  1. Is there a complete list of the human IG and TR genes?
  2. What are the specificities of the commercial monoclonal antibody reagents, for human T cell receptors?
  3. Where can I find known human IG allotype sequences?
  4. Why are there differences in the V and J assignments of rearranged human IG and TR sequences, between IMGT/LIGM-DB and the generalist databases GenBank/EMBL/DDBJ, although the flat file accession numbers are identical?
  5. How to get human germline sequences in FASTA format?
  6. How to recover the integrality of the translation of the human immunoglobulin germline sequences?
  7. What is the official nomenclature for human TR genes? What is the correspondence between nomenclatures?
  8. Are the sequences of the T cell receptor (TR) chains from the Jurkat cell line known?
  9. Where to find the complete sequence of the human TRA/TRD and TRB loci?
  10. Is there a way to cross-reference the IMGT named TR genes with the more common names and to correlate IMGT sequences with commercially available flow-cytometry antibodies?
  11. Is it possible to retrieve the human CDR-IMGT from the IMGT® databases?
  12. In the human TRGJP, TRGJP1 and TRGJP2 gene names, what does P stand for and what is the relation between P and P1 and P2?
  13. What are the rules for designating the constant domains of the immunoglobulin heavy chains?
  14. Are there, for my teaching, some exercises with answers to illustrate the use of IMGT® for immunoglobulin sequence analysis and 3D structure visualization of immunoglobulins?
  15. What is the easiest way to identify the N glycosylation sites of the human germline IGHV, IGKV and IGLV? of the human germline IGHJ, IGKJ and IGLJ?
  16. Are there T cell receptor haplotypes defined from the human genome sequencing?
  17. How to make a URL link to an IMGT/LIGM-DB entry?
  18. How to download L-PART1+V-EXON in FASTA format for all genes in Gene tables for human TRBV and human TRAV?
  19. In IMGT/V-QUEST, when obtaining the information 'Nucleotide insertions have been detected and automatically removed...'. What do the insertions (or deletions) mean? Are they artefacts that have been introduced during sequence amplification or sequencing? Are they natural variants?
  20. How to rapidly convert into the IMGT unique numbering, a VH CDR3 old numbering from previous publications?
  21. How to obtain the genomic coordinates of the whole human IGH locus (about 1250 Kb)?
  22. How to get alignments of the leader of the human germline V genes in order to design subgroup specific primers for PCR amplification of the expressed V gene repertoire?
  23. Is it normal to obtain five TRAV genes when querying IMGT/GENE-DB for human TRDV genes?

Is there a complete list of the human IG and TR genes?
Yes, the complete list of the human IG and TR genes, approved by HGNC, is available in "Lists of human IG and TR genes, groups, loci and orphons and links between IMGT, HUGO, GDB, Entrez Gene and OMIM" in IMGT Repertoire. Genes and allele sequences and information can be retrieved from IMGT/GENE-DB.
What are the specificities of the commercial monoclonal antibody reagents, for human T cell receptors?
IMGT Repertoire > 6. Gene regulation and expression > Reagents monoclonal antibodies
Where can I find known human IG allotype sequences?
For Gm allotype sequences, the IMGT/LIGM-DB accession numbers of the sequences corresponding to the Gm allotypes are indicated in "Gene table: Human (Homo sapiens) IGHC" in IMGT Repertoire. The corresponding IGHG allele sequences in FASTA format (per exon) are available from IMGT/GENE-DB.
For Km allotypes, the correspondence between Km alleles and IGKC allele names is available in "Allotypes: Human IGKC" in IMGT Repertoire. The corresponding IGKC allele sequences in FASTA format are available from IMGT/GENE-DB.
Why are there differences in the V and J assignments of rearranged human IG and TR sequences, between IMGT/LIGM-DB and the generalist databases GenBank/EMBL/DDBJ, although the flat file accession numbers are identical?
IMGT/LIGM-DB provides annotated flat files and uses the official nomenclature of the human immunoglobulin (IG) and T cell receptor (TR) genes, defined by IMGT and approved by the HUGO Nomenclature committee (HGNC) in 1999. The official nomenclature is used by GeneCards and Entrez Gene at NCBI: example of an Entrez Gene
If you use IMGT/V-QUEST to analyse the rearranged IG or TR sequences, you will find the correct gene and allele assignment. Citing IMGT/V-QUEST: PMID: 15215425.
IMGT Repertoire: Correspondence between nomenclatures
IMGT Index: Nomenclature.
The reference books are the following:
Lefranc, M.-P. and Lefranc, G., The Immunoglobulin FactsBook, Academic Press, 458 pages (2001) ISBN:012441351X
Lefranc, M.-P. and Lefranc, G., The T cell receptor FactsBook, Academic Press, 398 pages (2001) ISBN:0124413528.
How to get human germline sequences in FASTA format?
You can query "IMGT/GENE-DB Direct links".
A link is provided at the bottom of the IMGT/GENE-DB Home page
How to recover the integrality of the translation of the human immunoglobulin germline sequences?
The following direct links will provide the translation of the human immunoglobulin germline sequences including all known alleles (designated as *01, *02, etc). As the alleles are described at the nucleotide level, two amino acid sequences may be identical.
  1. For immunoglobulin V, D and J regions:
  2. For the C-REGION, query the IMGT/GENE-DB Home page
    • Homo sapiens, IGHC and functional
      Do the search
      Next page: Select all genes
      In "Choose your display": Amino acid sequences without gaps
    • Same type of query for IGKC
    • Same type of query for IGLC
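The remark that two alleles can share one amino acid sequence can be illustrated with a small sketch. The two nucleotide strings below are hypothetical toy sequences, not real IMGT alleles, and `translate` is an illustrative helper that assumes an ungapped, in-frame sequence and the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an ungapped, in-frame sequence; a trailing partial codon is ignored."""
    seq = nucleotides.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy alleles that differ by one silent nucleotide change (GCT vs GCC).
allele_01 = "GCTGTGTATTACTGT"
allele_02 = "GCCGTGTATTACTGT"

print(translate(allele_01))
print(translate(allele_02))
print(translate(allele_01) == translate(allele_02))
```

Because alleles are defined at the nucleotide level, a silent difference such as the one above yields two allele names for a single translated sequence.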
What is the official nomenclature for human TR genes? What is the correspondence between nomenclatures?
The official nomenclature for human TR genes is the IMGT nomenclature which has been approved by the Human Genome Organisation (HUGO) Nomenclature Committee (HGNC) in 1999 and entered in IMGT/GENE-DB
http://www.imgt.org/genedb/GENElect?query=2+TRBV7-2&species=Homo+sapiens
and in Entrez Gene at NCBI
http://www.ncbi.nlm.nih.gov/sites/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=28596
The human TR nomenclature has been published in various publications (all available in pdf at the IMGT Locus in Focus page [7-10, 13]) and in The T cell Receptor FactsBook, Academic Press (2001).
Correspondence between nomenclatures is available in the IMGT Scientific chart:
http://www.imgt.org/IMGTScientificChart/
Correspondence between gene names and Reagents monoclonal antibodies specificity is available in IMGT Repertoire:
http://www.imgt.org/IMGTrepertoire/Regulation/
IMGT Gene tables are also very useful for an overview of the genes belonging to a group. For example, for the human TRBV:
http://www.imgt.org/IMGTrepertoire/index.php?section=LocusGenes&repertoire=genetable&species=human&group=TRBV
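As a minimal sketch, the IMGT/GENE-DB link quoted above can be rebuilt for other gene names. The function name `gene_db_url` and its parameters are illustrative; the leading `2` in the query value is copied verbatim from the example URL and its meaning is not assumed here. Whether the server accepts every gene name in this form is also an assumption.

```python
from urllib.parse import urlencode

# Base address taken from the example link in the text.
GENE_DB_BASE = "http://www.imgt.org/genedb/GENElect"


def gene_db_url(gene: str, species: str = "Homo sapiens", query_type: int = 2) -> str:
    """Build a GENElect link shaped like the documented example (hypothetical helper)."""
    # urlencode encodes spaces as '+', which matches the example URL.
    params = urlencode({"query": f"{query_type} {gene}", "species": species})
    return f"{GENE_DB_BASE}?{params}"


print(gene_db_url("TRBV7-2"))
```

For `TRBV7-2` this reproduces the link given in the answer; other gene names are substituted in the same position.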
Are the sequences of the T cell receptor (TR) chains from the Jurkat cell line known?
The sequences of the T cell receptor (TR) chains from the JM/Jurkat cell line (JM and Jurkat are the same cell line) are known. The accession numbers of the TRBV-(D)-J and TRAV-J rearrangements are the following:
  1. TRBV-(D)-J
    K02885/X01417 (JM)
    TRBV12-3*01-TRBJ1-2*01 (no D is recognizable)
    Assignment of rearranged cDNAs and gDNAs to germline genes: Human TRBV, note 10
    For information, the clone YT35 (X00437/K01571), said to be from MOLT-3, and the sequence X02515, said to be from HPB-ALL, both most probably derive from JM/Jurkat.
  2. TRAV-J
    X02592/M12959 (JM) and, with one sequencing or typing error, M12423 (Jurkat)
    TRAV8-4*01-TRAJ3*01
    Assignment of rearranged cDNAs and gDNAs to germline genes: Human TRAV
    For information, the clone HAVT18 (M27368) said to be from thymus most probably derives from JM/Jurkat.
Where to find the complete sequence of the human TRA/TRD and TRB loci?
The complete sequence of the human TRA/TRD locus is covered by the five accession numbers AE000658 to AE000662, which are adjacent to each other. In IMGT/LocusView, the TRA/TRD locus localization (start and end: 127701-1058955 bp) and the gene positions are those in the clone contig which starts at the beginning of AE000658 and ends at the extremity of AE000662.
The complete sequence of the human TRB locus is contained in the L36092 accession number in IMGT/LIGM-DB. In IMGT/LocusView, the TRB locus localization (start and end: 91557-667340 bp) and the gene positions are those in the L36092 accession number. The original L36092 sequence (684973 bp) has been split in EMBL into three overlapping sequences of 267156 bp (U66059), 215422 bp (U66060) and 232650 bp (U66061). L36092 has become a secondary accession number of U66059, U66060 and U66061. In IMGT, the original unsplit sequence L36092, which is fully annotated, has also been kept as a primary accession number, in addition to U66059, U66060 and U66061.
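The figures quoted above can be checked with a few lines of arithmetic. This sketch assumes that the start and end positions are inclusive and that the three split entries together cover L36092 exactly, so that the surplus in their summed lengths is the total overlap across the two junctions.

```python
# Lengths quoted in the text (bp).
original_length = 684_973  # L36092
split_lengths = {"U66059": 267_156, "U66060": 215_422, "U66061": 232_650}

# Surplus of the summed split lengths over the original: total overlap (assumption: exact tiling).
total_split = sum(split_lengths.values())
total_overlap = total_split - original_length

# Locus spans from the IMGT/LocusView start and end positions (assumed inclusive).
trb_span = 667_340 - 91_557 + 1
tra_trd_span = 1_058_955 - 127_701 + 1

print(total_split, total_overlap)
print(trb_span, tra_trd_span)
```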
Is there a way to cross-reference the IMGT named TR genes with the more common names and to correlate IMGT sequences with commercially available flow-cytometry antibodies?
The information is available in IMGT Repertoire:
In each case, the query can be done by an automatic search on the page with the common name. Another more general way is to make a search by Google on the IMGT site (available at the IMGT Home page).
Is it possible to retrieve the human CDR-IMGT from the IMGT® databases?
Yes, it is possible to retrieve the human CDR-IMGT from the IMGT® databases.
  1. For CDR1-IMGT and CDR2-IMGT (and germline CDR3-IMGT) from germline genes:
    1. Query IMGT/GENE-DB
      Example: Species: 'Homo sapiens', Group: 'IGHV', Functionality: 'Functional'
      Do the search
    2. In the result page:
      Select all genes
      then at the bottom of the page in: IMGT label extraction from IMGT/LIGM-DB reference sequences
      Choose label(s) for extraction
      For instance CDR1-IMGT
      You will get nucleotide sequences.
      For amino acid sequences, select also below Amino acid sequences
  2. For CDR1-IMGT, CDR2-IMGT and CDR3-IMGT from rearranged sequences
    1. Query IMGT/LIGM-DB in Taxonomy
      English name of species: 'human'
      Configuration: rearranged
      Loci, genes or chains: 'Ig-Heavy'
      Functionality: 'productive'
      Do the search
    2. Then, on the page with the number of results choose "Subsequences" and, in the window, the label (CDR3-IMGT, for example).
      Choose the type of display:
      • Get subsequences
      • Get subsequences in Fasta format
      • Get translated subsequences (Fasta).
  3. For CDR1-IMGT, CDR2-IMGT and CDR3-IMGT amino acid sequences from known 3D structures
    Query IMGT/3Dstructure-DB
    In Search by Species and Group, Subgroup, Gene or Allele (CLASSIFICATION):
    Select: 'Homo sapiens' and then IMGT group: 'IGHV'
    For Results, Choose: FR-IMGT or CDR-IMGT sequences: CDR3-IMGT (for example)
    Do the search
In the human TRGJP, TRGJP1 and TRGJP2 gene names, what does P stand for and what is the relation between P and P1 and P2?
The letter 'P' stands for Kpn as the rearrangements of TRGJP, TRGJP1 and TRGJP2 are detected by that enzyme (the first letter K could not be used as indicating kappa).
TRGJP is a unique J gene (and was the first one found) whereas in contrast TRGJP1 and TRGJP2 (found later) are duplicated genes.
What are the rules for designating the constant domains of the immunoglobulin heavy chains?
The constant domains of the immunoglobulin (IG) heavy chains are designated with CH (CH1, CH2, CH3, CH4, and in teleostei, CH5...). The same designation is used for nucleotide and amino acid sequences, and 3D structures. The designation CH is valid whatever the heavy chain type (mu, delta, gamma, epsilon, alpha) and whatever the constant gene that encodes the heavy chain constant region (C-REGION). The assignment to a given gene or chain is indicated by the gene name (or by the chain type name).
For instance:
  • IGHM CH1 (or IG heavy mu CH1)
  • IGHM CH2 (or IG heavy mu CH2)...
  • IGHD CH1 (or IG heavy delta CH1)
  • IGHD CH2 (or IG heavy delta CH2)...
  • IGHG1 CH1 (or IG heavy gamma1 CH1)
  • IGHG1 CH2 (or IG heavy gamma1 CH2)...
  • IGHG3 CH1 (or IG heavy gamma3 CH1)
  • IGHG3 CH2 (or IG heavy gamma3 CH2)...
  • IGHE CH1 (or IG heavy epsilon CH1)...
  • IGHA1 CH1 (or IG heavy alpha1 CH1)...
This designation has been approved by the WHO/IUIS Nomenclature Subcommittee for IG and TR. Human IGHC refers to the group that includes all the IG heavy constant genes found in humans.
  1. Correspondence between labels for IG and TR domains in IMGT/3Dstructure-DB and IMGT/LIGM-DB
  2. Protein display: Human IGHC C-REGION
  3. Examples of Chain details for 3D structures in IMGT/3Dstructure-DB
  4. The standardization is useful to compare sequences and 3D structures. Thus IMGT® detected an error in the b12 sequence (1hzh in PDB and IMGT/3Dstructure-DB), the only complete human IG crystallized. This is indicated in a note in the IMGT/3Dstructure-DB card.
  5. "The presence of an A (Ala) in CH1 121 of 1hzh_H is a PDB file error. It should be a V (Val) as in 1n0x_H. The sequence of 1hzh_H should be IGHG1*01 100% in its entirety. This has been confirmed by Ann Hessel and Dennis Burton (21/07/08) in answer to a question by Marie-Paule Lefranc".
  6. Each constant domain can be represented by a standardized IMGT Collier de Perles using the IMGT unique numbering for C-DOMAIN. For example:
Are there, for my teaching, some exercises with answers to illustrate the use of IMGT® for immunoglobulin sequence analysis and 3D structure visualization of immunoglobulins?
You can use:
  1. IMGT/V-QUEST (copying examples that are in the IMGT/V-QUEST Documentation)
  2. IMGT/3Dstructure-DB (querying, for example, b12 as 'Molecule name').
If you want to explore all the possibilities of the IMGT/V-QUEST tool and the IMGT/3Dstructure-DB database, you can easily spend 4 hours on each one with your students. Many concepts concerning immunoglobulin synthesis (see IMGT Education, for example Molecular genetics of immunoglobulins), gene and locus organization (IMGT Repertoire), 2D structures (IMGT Colliers de Perles) and 3D structures (3D visualization with Jmol or QuickPDB, contact analysis) can be conveyed starting from IMGT/V-QUEST and IMGT/3Dstructure-DB, and their respective Documentation.
What is the easiest way to identify the N glycosylation sites of the human germline IGHV, IGKV and IGLV? of the human germline IGHJ, IGKJ and IGLJ?
For the human germline IGHV, IGKV and IGLV genes, the easiest way is to query IMGT/DomainDisplay for:
  • Species: 'Homo sapiens'
  • Receptor: 'IG'
  • Domain: V
then click on 'Show sequences'.
The N glycosylation sites are indicated with the letter N in green.
The human germline IGHJ, IGKJ and IGLJ genes do not have N glycosylation sites.
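For sequences handled outside IMGT/DomainDisplay, potential N glycosylation sites can be located with a short script. This is a sketch that assumes the usual sequon definition N-X-S/T, where X is any amino acid except proline, and an ungapped amino acid sequence. The function name and the test sequence are hypothetical, and the reported positions are plain 1-based indices in the input string, not IMGT unique numbering.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps overlapping sequons (for example "NNTS") detectable.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of N residues that start an N-X-S/T sequon (X not P)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence: NGT and NAS qualify, NPS does not.
print(n_glycosylation_sites("ANGTNPSNAS"))
```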
Are there T cell receptor haplotypes defined from the human genome sequencing?
The description of T cell receptor haplotypes would be a great step forward. For the time being, only genes that have been sequenced on a physical BAC or YAC can be linked and associated. Collecting that information from the generalist databases is a huge task with a lot of uncertainty, because the generalist databases have combined different clones to make larger and larger contigs. Indeed, the purpose of the human genome sequencing was to obtain a complete sequence: pieces of DNA coming from several individuals were assembled, without (at least public) information on the possible assignment to one individual.
The current public human sequences are therefore 'virtual' (there is no individual with such a sequence). However, human genome sequences from individuals now exist: first that of James D. Watson, followed by that of Craig Venter.
In both cases the sequences come from diploid genomes, so they could come from one or the other allelic chromosome. There is also a huge effort to characterize haplotypes using SNPs: http://www.nature.com/nature/journal/v449/n7164/abs/nature06258.htm
How to make a URL link to an IMGT/LIGM-DB entry?
For access to one or more IMGT/LIGM-DB sequences: or for direct access to only one IMGT/LIGM-DB flat-file:
How to download L-PART1+V-EXON in FASTA format for all genes in Gene tables for human TRBV and human TRAV?
You can retrieve the L-PART1+V-EXON for the human genes of the same group (TRBV or TRAV) by using the direct link:
Note that the page "IMGT/GENE-DB direct links" lists all the available customizable direct links. This page is referenced at the bottom of the IMGT/GENE-DB query and result pages.
In IMGT/V-QUEST, when obtaining the information 'Nucleotide insertions have been detected and automatically removed...'. What do the insertions (or deletions) mean? Are they artefacts that have been introduced during sequence amplification or sequencing? Are they natural variants?
  1. IMGT/V-QUEST detects insertions (or deletions) which may be either artefacts introduced during sequence amplification or sequencing, or indels appearing in some clones (for example, chronic lymphocytic leukemia (CLL)).

    'Insertions (or deletions)' present in some alleles by comparison with other alleles of the same gene (alleles defined as polymorphic variants of the gene at the genomic level in germline configuration, see 'Alignments of alleles') are included in the IMGT reference directory and therefore are not considered as 'insertions (or deletions)' by IMGT/V-QUEST.

    Insertions of amino acids in the CDR1 or CDR2 may have been functionally selected in the rearranged productive domains of antibodies.
    However the quality of the sequencing should be carefully checked if there is no information available on the specificity.
    Insertions in the FR of productive antibodies are rare, but possible (again sequence should be carefully checked).
  2. The option 'Search for insertions or deletions' was added to IMGT/V-QUEST to answer the demand of clinicians who need the percentage of somatic hypermutations in the VH as a prognostic factor in CLL. The 'Search for insertions or deletions' is used by default in IMGT/HighV-QUEST as there is a high frequency of indels due to homopolymer hybridization in NGS 454 sequencing.
For information on the protocol of IMGT/V-QUEST and other IMGT tools, the IMGT booklet (144 pages) can be downloaded at the page http://www.imgt.org/IMGTinformation/IMGTreferences.php?data=publications, for example at the reference "382".
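Why indels matter for a mutation percentage can be seen in a toy comparison. The sketch below is not the IMGT/V-QUEST algorithm: it is a naive position-by-position count on hypothetical short sequences, meant only to show that an unremoved insertion shifts every downstream position and inflates the apparent number of mutations.

```python
def percent_mismatch(germline: str, query: str) -> float:
    """Percentage of mismatches in a naive, ungapped, position-by-position comparison."""
    n = min(len(germline), len(query))
    if n == 0:
        raise ValueError("both sequences must be non-empty")
    mismatches = sum(1 for g, q in zip(germline[:n], query[:n]) if g != q)
    return 100 * mismatches / n


germline = "ACGTACGTAC"           # hypothetical germline stretch
one_substitution = "ACGTACCTAC"   # a single point mutation
with_insertion = "ACGGTACGTAC"    # germline with one extra G inserted after position 3

# Removing the inserted nucleotide restores the register with the germline.
insertion_removed = with_insertion[:3] + with_insertion[4:]

print(percent_mismatch(germline, one_substitution))
print(percent_mismatch(germline, with_insertion))
print(percent_mismatch(germline, insertion_removed))
```

In this toy case the single inserted nucleotide makes most downstream positions look mutated, whereas the same sequence with the insertion removed is identical to the germline.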
How to rapidly convert into the IMGT unique numbering, a VH CDR3 old numbering from previous publications?
Clues for converting into the IMGT unique numbering, old VH numberings from previous publications
  1. Correspondence between numberings for VH CDR3-IMGT:

    www.imgt.org/IMGTScientificChart/Numbering/IMGTnumberingCDR_VH.html#CDR3-IMGT

  2. Table of correspondence for key AA of VH
    IMGT unique numbering (first four columns) and old VH numbering #1 (last column)
    IMGT label for conserved AA  AA and IMGT numbering  IMGT label for regions  IMGT label for FR and CDR  Old VH Numbering #1
    2nd-CYS                      C104                   3'V-REGION              FR3-IMGT                   C92
    -                            R106 (or K106)         3'V-REGION              CDR3-IMGT                  R94 (or K94)
    -                            D116                   5'J-REGION              CDR3-IMGT                  D101
    J-TRP                        W118                   5'J-REGION              FR4-IMGT                   W103
    -                            G119                   J-REGION                FR4-IMGT                   G104
  3. IMGT Collier de Perles on two layers for VH

    www.imgt.org/3Dstructure-DB/cgi/collier_perles.cgi?domcode=1HEZBD00&layers=2&domdescr=VH&domnum=1

  4. Characteristics of the amino acids
  5. Interactions between amino acids (AA°)

    Among the interactions between AA of the VH CDR3-IMGT: salt bridge between R106 (or K106) and D116 in the CDR3-IMGT.

    Among the interactions between AA of the VH FR3-IMGT and FR4-IMGT: hydrogen bond between C104 (FR3-IMGT) and G119 (FR4-IMGT).
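The anchor positions in the table of correspondence for key AA of VH can be kept in a small lookup. This sketch covers only the five positions listed in that table; the name `old_to_imgt` is illustrative, and positions between the anchors are deliberately not interpolated because the correspondence inside the CDR3 depends on its length.

```python
# Old VH numbering #1 -> IMGT unique numbering, for the anchor positions in the table only.
OLD_TO_IMGT = {92: 104, 94: 106, 101: 116, 103: 118, 104: 119}


def old_to_imgt(residue: str) -> str:
    """Convert a label such as 'C92' to its IMGT equivalent ('C104') for known anchors."""
    amino_acid, old_position = residue[0], int(residue[1:])
    if old_position not in OLD_TO_IMGT:
        raise KeyError(f"no tabulated correspondence for old position {old_position}")
    return f"{amino_acid}{OLD_TO_IMGT[old_position]}"


for label in ("C92", "R94", "D101", "W103", "G104"):
    print(label, "->", old_to_imgt(label))
```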

How to obtain the genomic coordinates of the whole human IGH locus (about 1250 Kb)?

The genomic coordinates of the whole Homo sapiens IGH locus in GRCh38.p2 are:
106.879.844 (5' end of the locus, telomeric, 5' end of IGHV(III)-82)
105.584.157 (3' end of the locus, centromeric, 3' end of IGHA2).
(1296 kb in the current assembly)

You can query 'LOCALIZATION in GENOME ASSEMBLIES' at the bottom of the 'IMGT/GENE-DB Query page' (just click on 'Submit' as the 'Homo sapiens IGH' is searched by default), for retrieving gene localization in the genome assembly.

You can modify these coordinates if you would like to encompass more sequence 5' or 3' of the locus (for example, 5 kb on each side):
106.879.844 + 5.000
105.584.157 - 5.000
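The same arithmetic can be scripted, for instance to try several flank sizes. This sketch assumes the coordinates are inclusive and, as stated above, that the 5' end of the locus has the higher coordinate; `parse_coord` and `padded_locus` are illustrative helper names.

```python
def parse_coord(text: str) -> int:
    """Parse a coordinate written with dots as thousands separators, e.g. '106.879.844'."""
    return int(text.replace(".", ""))


five_prime = parse_coord("106.879.844")   # 5' end of IGHV(III)-82
three_prime = parse_coord("105.584.157")  # 3' end of IGHA2

# Inclusive length of the locus in bp.
locus_length = five_prime - three_prime + 1


def padded_locus(flank: int) -> tuple[int, int]:
    """Extend the locus by `flank` bp on each side (5' coordinate goes up, 3' goes down)."""
    return five_prime + flank, three_prime - flank


print(locus_length, round(locus_length / 1000), "kb")
print(padded_locus(5_000))
```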

How to get alignments of the leader of the human germline V genes in order to design subgroup specific primers for PCR amplification of the expressed V gene repertoire?
These alignments can be obtained as follows:

1. Query IMGT/GENE-DB
  • Example: Species: Homo sapiens, Functionality: Functional, Group: IGHV,
  • Submit.

In the result page:
Select all genes
then at the bottom of the page in: IMGT label extraction from IMGT/LIGM-DB reference sequences
Choose label(s) for extraction
Then, select: L-PART1+L-PART2
  • Submit.

On the resulting sequences:
  • Run a ClustalW.
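Before running the alignment, it can help to sort the extracted leader sequences by subgroup so that each subgroup is aligned separately. The sketch below assumes pipe-delimited FASTA headers with the allele name in the second field; that header layout, the accession labels and the sequences in `TOY_FASTA` are all hypothetical and should be checked against the actual download.

```python
import re
from collections import defaultdict

# Subgroup = locus letters + 'V' + subgroup number, e.g. 'IGHV1' from 'IGHV1-2*01'.
SUBGROUP = re.compile(r"^([A-Z]+V\d+)")


def group_by_subgroup(fasta_text: str) -> dict[str, dict[str, str]]:
    """Group FASTA records by V subgroup; returns {subgroup: {allele: sequence}}."""
    groups: dict[str, dict[str, str]] = defaultdict(dict)
    allele, subgroup = None, None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split("|")
            # Assumed layout: accession|allele name|species|functionality|label
            allele = fields[1] if len(fields) > 1 else fields[0]
            match = SUBGROUP.match(allele)
            subgroup = match.group(1) if match else "unassigned"
            groups[subgroup][allele] = ""
        elif allele is not None:
            # Drop gap dots, if any, and normalise the case.
            groups[subgroup][allele] += line.replace(".", "").upper()
    return {name: dict(members) for name, members in groups.items()}


# Hypothetical toy records, not real IMGT sequences.
TOY_FASTA = """>ACC0001|IGHV1-2*01|Homo sapiens|F|L-PART1+L-PART2
atggactgg
acctgg
>ACC0002|IGHV3-7*01|Homo sapiens|F|L-PART1+L-PART2
atggaattg
>ACC0003|IGHV1-3*01|Homo sapiens|F|L-PART1+L-PART2
atggactgc
"""

groups = group_by_subgroup(TOY_FASTA)
for name, members in groups.items():
    print(name, sorted(members))
```

Each group can then be written to its own FASTA file and aligned, which makes subgroup-specific conserved positions in the leader easier to spot.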
Is it normal to obtain five TRAV genes when querying IMGT/GENE-DB for human TRDV genes?
Yes. The TRD locus being embedded in the TRA locus (The T cell receptor FactsBook), a few TRAV genes have been found rearranged not only, as expected, to TRAJ but also to TRDD-TRDJ (Locus representation: human (Homo sapiens) TRA/TRD).
The first five Homo sapiens TRAV genes found in that case received the designation TRAV/DV (this does not exclude that other TRAV genes may be found occasionally rearranged to TRDD-TRDJ). They include Homo sapiens TRAV14/DV4, TRAV29/DV5, TRAV23/DV6, TRAV36/DV7 and TRAV38-2/DV8 genes (mentioned in an IMGT note above Gene table: human (Homo sapiens) TRAV and Gene table: human (Homo sapiens) TRDV).