IMGT/JunctionAnalysis, developed at Montpellier, is an integrated tool for the analysis of Immunoglobulin (IG)
and T cell receptor (TR)
JUNCTION nucleotide sequences.
IMGT/JunctionAnalysis analyses up to 5000 junctions in a single search,
provided that the IMGT V-GENE and J-GENE and ALLELE names are identified. The tool:
- analyses junction nucleotide sequences of rearranged IG
(IGH, IGK or IGL) and TR (TRA, TRB, TRG or TRD) genes,
- identifies accurately D-GENE and ALLELE in the IGH, TRB and TRD junctions,
- delimits precisely the different regions of the junctions: 3'V-REGION, D-REGION and 5'J-REGION,
as well as the N-REGIONs and P-REGIONs,
- determines the number of mutations in the 3'V-REGION, D-REGION and 5'J-REGION of the IG JUNCTIONs.
- calculates the "gc" content of the N-REGIONs of the IG and TR JUNCTIONs.
The IMGT/JunctionAnalysis Welcome page allows you to enter the input information.
The analysis of IG and TR junctions of mice and humans can be
performed exhaustively. Analysis of junctions of other species (rat, rabbit, trout) becomes progressively more available
as genomic sequences are annotated in IMGT.
- Species (drop-down list).
- Locus (radio buttons): IGH, IGK, IGL, TRA, TRB, TRG, TRD
- JUNCTION nucleotide sequences:
- JUNCTION nucleotide sequences can be entered either directly
in the reserved box, by typing or by "copy/paste", or by giving the path to a local file
(click on 'Browse' or 'Parcourir', or type its full path in the reserved box).
- The required format is the FASTA format. Each JUNCTION nucleotide
sequence must be preceded by the following information:
- identifier ("input"), with a 10 character maximum length.
This identifier can be a sequence name, an accession number,
a clone name, etc.
Note that the identifier must be unique in a given set of JUNCTIONs.
- the name of the V-GENE and ALLELE according to the
IMGT gene name nomenclature.
- the name of the J-GENE and ALLELE according to the
IMGT gene name nomenclature.
- IMGT/JunctionAnalysis accepts up to 5000 junctions in a single search. All sequences must
be entered in the same format, starting a new line for each sequence:
>Input1, V-GENE and ALLELE name, J-GENE and ALLELE name
nucleotide sequence (in uppercase or lowercase)
>Input2, V-GENE and ALLELE name, J-GENE and ALLELE name
nucleotide sequence (in uppercase or lowercase)
>M62724, IGHV7-4-1*02, IGHJ4*02
>Z47269, IGHV1-69*06, IGHJ5*02
- If the V-GENE ALLELE or J-GENE ALLELE is unknown, the IMGT/JunctionAnalysis
tool accepts a '?' character instead of the allele number (e.g. IGHV1-2*?)
and will run the search against the allele *01 by default.
- If there are several proposed V-GENEs and/or J-GENEs,
the different V-GENE and ALLELE names and/or J-GENE and ALLELE
names have to be separated by the '/' character
(e.g. IGHV1-2*01/IGHV1-3*?/IGHV1-18*02, IGHJ1*01/IGHJ2*01).
The IMGT/JunctionAnalysis tool will run the search against the first
V-GENE and ALLELE and J-GENE and ALLELE listed.
- JUNCTION nucleotide sequences must start with the V-REGION 2nd-CYS codon and end with the J-REGION J-PHE or J-TRP codon (positions
104 and 118, respectively, in the IMGT unique numbering for V-DOMAIN).
- "V-GENE and ALLELE" and "J-GENE and ALLELE" are those obtained
by querying IMGT/V-QUEST.
If several alleles give the same score, select the most probable one.
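The input rules above (identifier length, '?' allele, '/'-separated alternatives, first name used) can be sketched in Python. This is only an illustration of the stated format, not the tool's own parser: the names `JunctionInput`, `resolve_allele`, `parse_header` and `parse_input` are hypothetical, and the handling of blank lines and of sequences split over several lines is an assumption.

```python
from dataclasses import dataclass

MAX_ID_LENGTH = 10    # identifier limit stated in the text
MAX_JUNCTIONS = 5000  # per-search limit stated in the text


@dataclass
class JunctionInput:
    identifier: str
    v_names: list  # every proposed V-GENE and ALLELE name, '?' resolved
    j_names: list  # every proposed J-GENE and ALLELE name, '?' resolved
    sequence: str = ""

    @property
    def v_used(self) -> str:
        # The text says the search runs against the first name listed.
        return self.v_names[0]

    @property
    def j_used(self) -> str:
        return self.j_names[0]


def resolve_allele(name: str) -> str:
    """Map an unknown allele ('*?') to the default allele '*01'."""
    name = name.strip()
    return name[:-1] + "01" if name.endswith("*?") else name


def parse_header(header: str) -> JunctionInput:
    """Parse '>identifier, V names, J names' into a JunctionInput."""
    fields = [f.strip() for f in header.lstrip(">").split(",")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 comma-separated fields: {header!r}")
    identifier, v_field, j_field = fields
    if not 1 <= len(identifier) <= MAX_ID_LENGTH:
        raise ValueError(f"identifier must be 1-{MAX_ID_LENGTH} characters: {identifier!r}")
    return JunctionInput(
        identifier,
        [resolve_allele(n) for n in v_field.split("/")],
        [resolve_allele(n) for n in j_field.split("/")],
    )


def parse_input(text: str) -> list:
    """Parse a submission, enforcing unique identifiers and the size limit."""
    records, seen = [], set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if line.startswith(">"):
            record = parse_header(line)
            if record.identifier in seen:
                raise ValueError(f"duplicate identifier: {record.identifier!r}")
            seen.add(record.identifier)
            records.append(record)
        elif records:
            # assumption: a sequence may span several lines
            records[-1].sequence += line.lower()
        else:
            raise ValueError("sequence line found before any '>' header")
    if len(records) > MAX_JUNCTIONS:
        raise ValueError(f"at most {MAX_JUNCTIONS} junctions per search")
    return records
```

Running such a check locally before submission can catch duplicate or over-long identifiers early.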
- Example of IMGT/JunctionAnalysis results
Selecting the option 'Example of IMGT/JunctionAnalysis results' allows you to visualize an example of the results
provided by IMGT/JunctionAnalysis.
Sequences used in the 'Example of IMGT/JunctionAnalysis results':
- Display Results
- "List of all eligible D-GENE."
This option allows you to visualize all D genes that match a junction and to compare their scores. It is displayed by
default when only one junction is analyzed in the run, but can be disabled.
- "Colored IMGT AA classes and histogram."
The IMGT AA classes and histogram are displayed by
default with colors of the AA according to the 11 IMGT physicochemical
AA classes (Pommié et al. 2004) (IMGT Aide-mémoire>Amino acids, http://www.imgt.org), but can be disabled.
- "Output order" in "CDR3-IMGT length decreasing order" or in "CDR3-IMGT
length increasing order."
The results in "JUNCTION alignments with translation and IMGT AA classes" may be displayed in "Same order as
input" (default), in "CDR3-IMGT length decreasing order" or in "CDR3-IMGT length increasing order".
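The three output orders can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function `order_results` is hypothetical, the sequences in the usage example are made up, and the ordering uses the JUNCTION nucleotide length as a proxy for the CDR3-IMGT length (the CDR3-IMGT is the JUNCTION without its two anchor codons at positions 104 and 118, so both lengths rank in-frame junctions identically). How the tool orders junctions of equal length is not stated; here they keep their input order.

```python
def order_results(junctions, mode="input"):
    """Return (identifier, sequence) pairs in the requested output order.

    mode is "input" (default), "decreasing" or "increasing".
    """
    if mode == "input":
        return list(junctions)
    if mode not in ("decreasing", "increasing"):
        raise ValueError(f"unknown output order: {mode!r}")
    # sorted() is stable, so junctions of equal length keep their input order.
    return sorted(junctions, key=lambda item: len(item[1]),
                  reverse=(mode == "decreasing"))


# Made-up sequences, for illustration only.
example = [("a", "tgtgcgtgg"), ("b", "tgtgcgagatgg"), ("c", "tgtgcgttt")]
print([name for name, _ in order_results(example, "decreasing")])
```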
- Advanced Parameters
- 5' and 3' ends of the JUNCTION:
- Default: the JUNCTION nucleotide sequences must start at the 5' end with a cysteine codon (tgt or tgc) and must end at the 3' end with a
tryptophan (tgg) or phenylalanine (ttt or ttc) codon.
- Alternatively, the JUNCTION nucleotide sequences may start at the 5' end and/or end at the 3' end with any codon.
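The default constraint on the 5' and 3' ends can be checked with a few lines of Python. This is a minimal sketch of the rule as stated above, not the tool's own code; the function name `has_default_ends` is hypothetical, and requiring at least two codons is an assumption.

```python
START_CODONS = {"tgt", "tgc"}       # cysteine (2nd-CYS, position 104)
END_CODONS = {"tgg", "ttt", "ttc"}  # tryptophan or phenylalanine (position 118)


def has_default_ends(sequence: str) -> bool:
    """True if the junction starts with a Cys codon and ends with a Trp/Phe codon."""
    seq = sequence.strip().lower()
    # The total length is deliberately not required to be a multiple of 3,
    # because out-of-frame junctions are also analysed.
    return len(seq) >= 6 and seq[:3] in START_CODONS and seq[-3:] in END_CODONS
```

A sequence failing this check can still be analysed by selecting the option that accepts any codon at either end.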
- Nb of D-GENEs (for IGH, TRB and TRD JUNCTION):
- Default values are 1 for IGH, 1 for TRB and 3 for TRD.
- You may set this number to any value from 0 to 3.
- Number of accepted mutations in 3'V-REGION, D-REGION, and 5'J-REGION:
- Delimitation of 3'V-REGION, D-REGION and 5'J-REGION:
- Default: the patterns 'm', 'm-' and 'mm--' are trimmed from the 3'V-REGION and from the 3' end of the D-REGION,
and the patterns 'm', '-m' and '--mm' are trimmed from the 5'J-REGION and from the 5' end of the D-REGION,
where 'm' indicates a mutation and '-' indicates an identical nucleotide
by comparison with the germline sequence of the corresponding allele.
- Stop trimming at the first identical nucleotide encountered.
- D-GENE choice (if several have the same score):
- The least mutated one,
- The longest one,
- The one most upstream in the locus.
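The D-GENE choice among equally scored candidates can be pictured as a tie-break applied after scoring. The sketch below is illustrative only: the `DCandidate` fields, the rule names and the assumption that a higher score is better are all hypothetical, and the tool's actual scoring is not described here.

```python
from dataclasses import dataclass


@dataclass
class DCandidate:
    name: str
    score: int       # assumption: higher is better
    mutations: int   # mutations versus the germline D-REGION
    length: int      # number of D-REGION nucleotides kept in the junction
    locus_rank: int  # 0 = most upstream D gene in the locus


# Each rule maps a candidate to a key; the smallest key wins the tie.
TIE_BREAKERS = {
    "least_mutated": lambda c: c.mutations,
    "longest": lambda c: -c.length,
    "most_upstream": lambda c: c.locus_rank,
}


def choose_d_gene(candidates, rule="least_mutated"):
    """Pick one D candidate among those sharing the best score."""
    if not candidates:
        return None
    best = max(c.score for c in candidates)
    tied = [c for c in candidates if c.score == best]
    return min(tied, key=TIE_BREAKERS[rule])
```

With three candidates tied on score, each rule may select a different D gene, which is why the chosen rule is reported among the parameters in the results summary.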
The IMGT/JunctionAnalysis Results comprise:
A brief summary at the top of the page with:
- the locus and species name
- a link, for information, to the Locus representation in the IMGT Repertoire
- the number of submitted junctions
- the number of results
- the number of junctions with no results if any, and a link to display the corresponding list (see the section "List of junctions with no results")
- the values of the parameters used by the tool for the analysis:
- Maximum number of accepted mutations
- Deletion limits
- Best D-GENE choice for a same score
- Analysis of the JUNCTIONs:
The "Analysis of the JUNCTIONs" provides the results of the analysis of the junctions at the nucleotide level.
- The junctions are displayed according to the order of the sequence submissions with the names
of the input sequences and names of the V and J genes and alleles as provided by the user.
Note that the gene and allele names are preceded by the short name of the species
(encoded on 6 characters, for example Homsap for Homo sapiens or Musmus for Mus musculus).
- Nucleotides of each region identified in a JUNCTION are displayed.
- Dots in 3'V-REGION, D-REGION and 5'J-REGION indicate nucleotides trimmed
in the rearranged sequence, by comparison to the corresponding
germline 3'V-REGION, D-REGION and 5'J-REGION.
- Underlined nucleotides represent the mutated nucleotides. You can click on a mutated nucleotide to see the original germline
nucleotide in the small rectangle below the sentence "Click on mutated (underlined) nucleotide to see the original one".
- N, N1, N2, N3, N4 indicate N-REGIONs. If there are several N-REGIONs,
they are numbered from left to right.
- P indicates P-REGIONs (the P-REGIONs are not numbered).
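The 6-character species short name mentioned above (Homsap, Musmus) appears to combine the first three letters of the genus with the first three letters of the species. The Python sketch below encodes that rule as inferred from the two examples given; the function name `species_short_name` is hypothetical.

```python
def species_short_name(latin_name: str) -> str:
    """Build a 6-character code such as 'Homsap' from 'Homo sapiens'.

    Assumption: the code is the first 3 letters of the genus followed by
    the first 3 letters of the species, as in the two examples in the text.
    """
    genus, species = latin_name.split()[:2]
    return genus[:3].capitalize() + species[:3].lower()


print(species_short_name("Homo sapiens"))
print(species_short_name("Mus musculus"))
```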
The information provided in the IMGT/JunctionAnalysis Search page
is reported in 3 columns (blue):
- V name: IMGT V-GENE and ALLELE name
- J name: IMGT J-GENE and ALLELE name
Results from the IMGT/JunctionAnalysis tool are displayed in the other columns:
- D name: IMGT D-GENE and ALLELE name for IGH and TRB loci
(in the case of the TRD locus, the names of the 3 D genes are
displayed above their respective sequences)
- Vmut: Number of mutations in the "input" 3'V-REGION identified
by the IMGT/JunctionAnalysis tool, by comparison to
the corresponding germline allele sequence.
- Dmut: Number of mutations in the D-REGION sequence
identified by the IMGT/JunctionAnalysis tool, by comparison
to the corresponding germline allele sequence.
- Jmut: Number of mutations in the "input" 5'J-REGION identified
by the IMGT/JunctionAnalysis tool, compared to the
corresponding germline allele sequence.
- Ngc: Ratio of the number of g+c nucleotides to the
total number of N region nucleotides.
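The Ngc value can be reproduced from the N-REGION nucleotides alone. The following Python sketch follows the definition above; the function name `ngc_ratio` is hypothetical, and pooling all N-REGIONs of a junction and returning `None` when there are no N nucleotides are assumptions.

```python
def ngc_ratio(n_regions):
    """Ratio of g+c nucleotides to all nucleotides of the N-REGIONs.

    n_regions is a list of N-REGION sequences (N, or N1, N2, ...).
    Returns None when the junction has no N nucleotides.
    """
    total = sum(len(region) for region in n_regions)
    if total == 0:
        return None
    gc = sum(region.lower().count("g") + region.lower().count("c")
             for region in n_regions)
    return gc / total
```

For example, two N-REGIONs "gg" and "at" together contain two g+c nucleotides out of four, giving a ratio of 0.5.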
JUNCTION alignments with translation and IMGT AA classes:
- Each JUNCTION nucleotide sequence is translated into an amino acid sequence.
In the case of frameshifts, gaps indicated by one or two dots
are inserted to maintain the J-REGION reading frame and to facilitate
- Codons and amino acids are numbered according to the
IMGT unique numbering for V-DOMAIN.
- The numbering is made according to the longest JUNCTION
obtained in the results.
- Amino acids are colored according to the eleven
IMGT amino acid chemical characteristics classes.
- Underlined amino acids represent the mutated amino acids. You can click on a mutated amino acid to see the original germline
amino acid in the small rectangle below the sentence "Click on mutated (underlined) amino acid to see the original one".
The option "Colored IMGT AA classes and histogram" of 'Advanced parameters' allows the JUNCTION alignments with translation to be displayed with or without the IMGT AA classes and histogram.
List of junctions with no results
The list of junctions with no results is displayed at the bottom of the results page.
It comprises the sequence identifiers and related comments.
Note that V gene and allele sequences that are partial in 3' are not included
in the IMGT reference directory of IMGT/JunctionAnalysis, because the 3' end of the V gene and allele cannot be delimited correctly in the junctions.
- IMGT/JunctionAnalysis is by far more accurate than IMGT/V-QUEST for D-GENE and ALLELE name identification
and delimitation. However, IMGT/V-QUEST has the advantage of proposing several solutions, which can be
useful in some cases.
- The way IMGT/V-QUEST and IMGT/JunctionAnalysis identify the D-GENEs is not identical,
therefore the scores can be compared for a given tool, but score differences may be observed
between the tools.
- For two D-GENEs and ALLELEs with an identical score in the IMGT/V-QUEST results, IMGT/JunctionAnalysis,
in the default configuration, selects the solution which gives the smallest N regions or, in other terms,
prefers a longer D (accepting nucleotide differences) to a shorter D (without nucleotide differences).
- IMGT/JunctionAnalysis for statistical analysis: see ref .
The first version of IMGT/JunctionAnalysis tool was developed by Mehdi Yousfi,
student in the Licence d'Informatique,
Université Montpellier II,
during a stay in the
Laboratoire d'ImmunoGénétique Moléculaire,
IGH, CNRS, Montpellier, France.
IMGT/JunctionAnalysis in its present version has been developed by Denys Chaume,
Véronique Giudicelli and Patrice Duroux.
Yousfi Monod, M. et al., Bioinformatics, 20, I379-I385 (2004)
Lefranc, M.-P., Methods Mol. Biol., 248, 27-49 (2004)
Lefranc, M.-P., Current Protocols in Immunology, pp. A.1W.1-A.1W.15 (2006)
Giudicelli, V. and Lefranc, M.-P., Nova Science, pp. 77-105 (2005)
Lefranc, M.-P. et al., Dev. Comp. Immunol., 27, 55-77 (2003)
Giudicelli, V. et al., Nucl. Acids Res., 32, W435-440 (2004)
Pommié, C. et al., J. Mol. Recognit., 17, 17-32 (2004)
Bleakley, K. et al., In Silico Biology, 6, 0051 (Epub 2006)