IMGT®, the international ImMunoGeneTics information system®

CLUSTAL format

CLUSTAL is the output format of the ClustalW multiple sequence alignment tool. The first line of the file begins with the word CLUSTAL. The alignment is displayed in blocks of a fixed length, and each line in a block corresponds to one sequence. Each of these lines starts with the sequence name (at most 10 characters), followed by at least one space character and then the sequence itself, in upper or lower case, with '-' denoting a gap. The residue number may be displayed at the end of the first line of each block.
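
The rules above can be sketched as a small Python reader. This is only an illustrative sketch, not part of ClustalW or IMGT: the function name `parse_clustal` and the toy sequences `seqA` and `seqB` are assumptions made for the example. It checks the first line, then treats any line whose second whitespace-separated field contains only letters and '-' as a sequence line, which also skips position rulers and lines of '*' markers.

```python
def parse_clustal(text):
    """Return {name: aligned sequence} from CLUSTAL-formatted text.

    Gaps are kept as '-'. Hypothetical helper written for illustration.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("first line must start with the word CLUSTAL")

    sequences = {}
    for line in lines[1:]:
        fields = line.split()
        # A sequence line has at least a name and a run of residues;
        # an optional trailing residue number (third field) is ignored.
        if len(fields) < 2:
            continue
        name, residues = fields[0], fields[1]
        # Skip rulers such as "1 ... 60" and marker lines such as "** ****".
        if not all(ch.isalpha() or ch == "-" for ch in residues):
            continue
        # The same name appears once per block; concatenate the pieces.
        sequences[name] = sequences.get(name, "") + residues
    return sequences


# Toy two-block alignment (made-up names and residues).
sample = """CLUSTAL W (1.8) multiple sequence alignment

seqA            --GAATTA
seqB            TGGGCAAG
                  *

seqA            AG--
seqB            AGTT
                **
"""

if __name__ == "__main__":
    for name, seq in parse_clustal(sample).items():
        print(name, seq)
```

Filtering on the content of the second field, rather than on leading whitespace, keeps the sketch tolerant of files whose indentation was lost in copying.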

Example of alignment in CLUSTAL format

CLUSTAL W (1.8) multiple sequence alignment
                1                                                         60
TRGJ1_01        ------------GAATTATTATAAGAAACTCTTTGGCAGTGGAACAACACTGGTTGTCAC
TRGJ2_01        ------------GAATTATTATAAGAAACTCTTTGGCAGTGGAACAACTCTTGTTGTCAC
TRGJP_01        TGGGCAAGAGTTGGGCAAAAAAATCAAGGTATTTGGTCCCGGAACAAAGCTTATCATTAC
TRGJP1_01       --------ATACCACTGGTTGGTTCAAGATATTTGCTGAAGGGACTAAGCTCATAGTAAC
TRGJP2_01       --------ATAGTAGTGATTGGATCAAGACGTTTGCAAAAGGGACTAGGCTCATAGTAAC
                                         **    ****     ** ** *  **  *  * **
                61    68
TRGJ1_01        AG------
TRGJ2_01        AG------
TRGJP_01        AG------
TRGJP1_01       TTCACCTG
TRGJP2_01       TTCGCCTG
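
The line of '*' characters under the first block is not covered by the description above. In ClustalW output, '*' marks a column in which every sequence has the same residue. The following Python sketch recomputes that line from the five sequences of the first block; the function name `conservation_line` is an assumption made for this example, and the sketch handles only the '*' marker.

```python
def conservation_line(rows):
    """Return a string with '*' under each column where all rows agree.

    rows: aligned sequences of equal length. Columns containing a gap
    are never marked. Hypothetical helper written for illustration.
    """
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        conserved = first != "-" and all(ch == first for ch in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# First block of the example alignment (positions 1 to 60).
block = [
    "------------GAATTATTATAAGAAACTCTTTGGCAGTGGAACAACACTGGTTGTCAC",  # TRGJ1_01
    "------------GAATTATTATAAGAAACTCTTTGGCAGTGGAACAACTCTTGTTGTCAC",  # TRGJ2_01
    "TGGGCAAGAGTTGGGCAAAAAAATCAAGGTATTTGGTCCCGGAACAAAGCTTATCATTAC",  # TRGJP_01
    "--------ATACCACTGGTTGGTTCAAGATATTTGCTGAAGGGACTAAGCTCATAGTAAC",  # TRGJP1_01
    "--------ATAGTAGTGATTGGATCAAGACGTTTGCAAAAGGGACTAGGCTCATAGTAAC",  # TRGJP2_01
]

if __name__ == "__main__":
    print(conservation_line(block))
```

For this block the first marked column is alignment position 26, and the second block (positions 61 to 68) has no fully conserved column, which is consistent with the example showing no marker line there.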