Description - Multiple Alignment

Description
Protein Sequence Multiple Alignment

Method :

TBIAT (Tree-base Best-first Iterative Algorithm with Tree-dependent partitioning)

To date, most multiple alignment systems have employed a tree-based algorithm, which combines the results of group-to-group pairwise alignment in a tree-like order of sequence similarity. The alignment quality is not, however, high enough when the sequence similarity is low. Once an error occurs in the alignment process, that error can never be corrected.

Our algorithm iteratively applies group-to-group pairwise alignment to partially aligned sequences, improving their alignment quality each time two subalignments are merged in a tree-based way. The iteration corrects errors that may have occurred in the tree-based alignment process. Such an iterative strategy requires heuristic search methods to solve practical alignment problems. We employed best-first search with tree-dependent partitioning, and parallelized its search step to reduce the execution time of the iterative algorithm.
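As a rough illustration of tree-dependent partitioning, the sketch below lists the two-group splits obtained by cutting each branch of a guide tree. Each split is a candidate pair of groups for one round of group-to-group realignment. This is not the service's implementation: the nested-tuple tree representation, the function names and the example tree are assumptions made for illustration, and the realignment and best-first scoring steps are left out.

```python
def leaves(tree):
    """Return the set of sequence labels under a nested-tuple guide tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def tree_dependent_partitions(tree):
    """List the two-group splits obtained by cutting each branch of the tree."""
    everything = leaves(tree)
    seen = set()
    splits = []

    def visit(node):
        group = leaves(node)
        rest = everything - group
        if rest:  # the root itself has no branch above it to cut
            key = frozenset([group, rest])
            if key not in seen:  # the two branches at the root give the same split
                seen.add(key)
                splits.append((sorted(group), sorted(rest)))
        if not isinstance(node, str):
            visit(node[0])
            visit(node[1])

    visit(tree)
    return splits


# Hypothetical guide tree over the four example sequences used on this page.
guide_tree = (("CSRC_HUMAN", "CABL_HUMAN"), ("EPH_HUMAN", "FER_HUMAN"))
for group_a, group_b in tree_dependent_partitions(guide_tree):
    print(group_a, "|", group_b)
```

Restricting the splits to those induced by the tree keeps the number of candidates linear in the number of sequences, instead of trying every possible division into two groups.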

References :

Y. Totoki, Y. Akiyama, K. Onizuka, T. Noguchi, M. Saito, and M. Ando :
"Employing A* Algorithm in Parallel Multiple Protein Sequence Alignment",
IPSJ SIG Notes, 97-MPS-16-4, pp.19-24 (1997). [in Japanese]

M. Hirosawa, Y. Totoki, M. Hoshida, and M. Ishikawa :
"Comprehensive Study on Iterative Algorithms of Multiple Sequence Alignment",
Comput. Applic. Biosci., Vol.11, No.1, pp.13-18 (1995).

Restrictions :

Sequence length
Sorry. Sequence length limit: Sequence length <= 1000
If you want to align longer sequences, please contact us and send your data to papia@m.aist.go.jp.
Sequence data size
Sorry. Data size limit:

(1) Dynamic programming: Maximum length * Number of sequences <= 10000

(2) A* algorithm: Maximum length * Number of sequences <= 5000

If you want to align a larger data set, please contact us and send your data to papia@m.aist.go.jp.
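The limits above can be checked before submitting a query. The following is a minimal sketch that applies only the numbers stated in this section. The function and constant names are hypothetical, and the server may apply its checks differently.

```python
MAX_SEQUENCE_LENGTH = 1000
# Limit on (maximum length * number of sequences) for each searching method.
MAX_DATA_SIZE = {"Dynamic programming": 10000, "A* algorithm": 5000}


def check_limits(sequences, method):
    """Return a list of problems found; the list is empty if the input fits."""
    if not sequences:
        return ["no sequences given"]
    problems = []
    longest = max(len(seq) for seq in sequences)
    if longest > MAX_SEQUENCE_LENGTH:
        problems.append(
            f"sequence length {longest} exceeds {MAX_SEQUENCE_LENGTH}")
    size = longest * len(sequences)
    if size > MAX_DATA_SIZE[method]:
        problems.append(
            f"data size {size} exceeds {MAX_DATA_SIZE[method]} for {method}")
    return problems


# Ten sequences of length 600 give a data size of 6000.
example = ["A" * 600] * 10
print(check_limits(example, "Dynamic programming"))
print(check_limits(example, "A* algorithm"))
```

With these numbers the same input fits the dynamic programming limit but not the A* limit, so the choice of searching method affects how much data can be submitted.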

How to use :

See 'Service status' and confirm the service is ON.
Select 'Score Matrix'.
'BLOSUM45', 'BLOSUM62', 'BLOSUM80', 'PAM120' and 'PAM250' are available.
Set 'Gap Cost'.
Select 'System default' or 'User defined'.
If you select 'User defined', fill in each 'Gap' penalty field.
- Opening Gap : Penalty for opening a gap
- Extension Gap : Penalty for extending a gap
- Out Gap : Penalty for an out gap
Each 'Gap' penalty must be an integer satisfying 0 <= Gap_penalty <= 100.
Set 'Searching method', which is used in group-to-group pairwise alignment.
Select 'Dynamic programming' or 'A* algorithm'.
If you select 'Dynamic programming', fill in each field for 'DP Cutoff'.
'DP Cutoff' is the cut-off of the search space in the dynamic programming matrix.
Enter the cut-off ratio in both the 'Tree-base' and 'Iterative' fields.

Tree-base : The cut-off ratio used in the Tree-base method
Iterative : The cut-off ratio used in the Iterative method

Enter any label for your query into the field 'Query label='.

Paste your sequences into the text area 'Input Sequences'.
Several formats are accepted.

5.1. Fasta format
Example :

>CSRC_HUMAN
KLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLK
>CABL_HUMAN
KLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDY
>EPH_HUMAN
IGEGEFGEVYRGTLRLPSQDCKTVAIKTLKDTSPGGQWWNFLREATIMGQFSHPHILHLEGVVTKRKPIMIITEFMENGA
>FER_HUMAN
LLGKGNFGEVYKGTLKDKTSVAVKTCKEDLPQELKIKFLQEAKILKQYDHPNIVKLIGVCTQRQPVYIIMELVSGGDFLT

5.2. One-line format : (label) (amino acid sequence)
One sequence must be pasted on one line.
Example :

CSRC_HUMAN  KLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLK
CABL_HUMAN  KLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDY
EPH_HUMAN   IGEGEFGEVYRGTLRLPSQDCKTVAIKTLKDTSPGGQWWNFLREATIMGQFSHPHILHLEGVVTKRKPIMIITEFMENGA
FER_HUMAN   LLGKGNFGEVYKGTLKDKTSVAVKTCKEDLPQELKIKFLQEAKILKQYDHPNIVKLIGVCTQRQPVYIIMELVSGGDFLT

5.3. No-label format : (amino acid sequence)
One sequence must be pasted on one line.
Example :

KLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLK
KLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDY
IGEGEFGEVYRGTLRLPSQDCKTVAIKTLKDTSPGGQWWNFLREATIMGQFSHPHILHLEGVVTKRKPIMIITEFMENGA
LLGKGNFGEVYKGTLKDKTSVAVKTCKEDLPQELKIKFLQEAKILKQYDHPNIVKLIGVCTQRQPVYIIMELVSGGDFLT

If you want to reset the input form, push 'Reset this form'.
Push the 'Service status' button to confirm the service status for your query.
Then push the 'Submit' button to submit your query to the server.
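The three input formats described above (Fasta, one-line and no-label) can be told apart mechanically, which helps when preparing data before pasting it into the form. The sketch below shows one way to read all three into (label, sequence) pairs. It is not the server's parser. The function name and the generated labels such as `seq1` are hypothetical, and joining Fasta sequences that wrap over several lines is an assumption based on common Fasta practice, not something this page states.

```python
def parse_sequences(text):
    """Return a list of (label, sequence) pairs from pasted input text."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    records = []

    # Fasta format: a '>' header line followed by one or more sequence lines.
    if lines and lines[0].startswith(">"):
        label, chunks = None, []
        for line in lines:
            if line.startswith(">"):
                if label is not None:
                    records.append((label, "".join(chunks)))
                label, chunks = line[1:].strip(), []
            else:
                chunks.append(line)
        if label is not None:
            records.append((label, "".join(chunks)))
        return records

    # One-line and no-label formats: exactly one sequence per line.
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) == 2:      # (label) (amino acid sequence)
            records.append((fields[0], fields[1]))
        elif len(fields) == 1:    # (amino acid sequence) only
            records.append((f"seq{number}", fields[0]))
        else:
            raise ValueError(f"line {number}: expected 1 or 2 fields")
    return records


print(parse_sequences(">CSRC_HUMAN\nKLGQGCFGEV\n>CABL_HUMAN\nKLGGGQYGEV\n"))
print(parse_sequences("CSRC_HUMAN  KLGQGCFGEV\nCABL_HUMAN  KLGGGQYGEV\n"))
print(parse_sequences("KLGQGCFGEV\nKLGGGQYGEV\n"))
```

The sequences in the usage lines are shortened prefixes of the examples on this page, used only to keep the output readable.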