LAST: Genome-Scale Sequence Comparison

Released on 11th Sep. 2008


LAST aims to make it easy to compare large genome sequences to one another, and also to analyze huge tag datasets from new sequencing technologies such as 454, Solexa, and SOLiD. LAST includes the following features:

*A suffix array to find initial hits of arbitrary size. The size of each hit adapts to its repetitiveness, which avoids getting a catastrophic number of initial hits.

*The suffix array can be discontiguous (analogous to spaced seeds), for higher sensitivity.

*The suffix array can be sparse (analogous to BLAT), to reduce time and storage, at the expense of sensitivity.

*An X-drop algorithm for fast gapped alignment.

*Measures to avoid catastrophe when self-comparing large sequences.

*Flexibility and transparency: arbitrary alphabets and scoring schemes can be selected, including "generalized affine gap costs".

*Fits within 2 GB of RAM.

*X-treme speed.
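
To illustrate the first feature, the following is a minimal sketch of how a hit can lengthen until it is rare enough. It is not LAST's actual code: the suffix array is built naively, only contiguous exact matches are handled, and the names build_suffix_array, adaptive_seed and max_hits are hypothetical, chosen for this example only.

```python
def build_suffix_array(text):
    """Naive suffix array: suffix start positions, sorted by suffix.
    Fine for a toy example, far too slow for a genome."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def _char_at(text, start, offset):
    """Character at start+offset, or "" past the end ("" sorts first)."""
    i = start + offset
    return text[i] if i < len(text) else ""


def _bound(text, sa, lo, hi, depth, c, upper):
    """Binary search inside sa[lo:hi], whose suffixes all share the same
    first `depth` characters. Returns the first index whose character at
    offset `depth` is >= c (upper=False) or > c (upper=True)."""
    while lo < hi:
        mid = (lo + hi) // 2
        k = _char_at(text, sa[mid], depth)
        if k < c or (upper and k == c):
            lo = mid + 1
        else:
            hi = mid
    return lo


def adaptive_seed(text, sa, query, qpos, max_hits):
    """Extend a match starting at query[qpos] one letter at a time until
    it occurs at most max_hits times in text (assumed parameter).
    Returns (match_length, sorted_text_positions). The position list is
    empty if the match disappears, or if the query ends while the match
    is still too frequent."""
    lo, hi, depth = 0, len(sa), 0
    while hi - lo > max_hits and qpos + depth < len(query):
        c = query[qpos + depth]
        # Narrow the interval to suffixes that also match letter c.
        lo, hi = (_bound(text, sa, lo, hi, depth, c, upper=False),
                  _bound(text, sa, lo, hi, depth, c, upper=True))
        depth += 1
    if hi - lo > max_hits:
        return depth, []
    return depth, sorted(sa[lo:hi])


if __name__ == "__main__":
    text = "AAAAACGT"
    sa = build_suffix_array(text)
    # The repetitive A-run forces a longer hit than the unique "C".
    print(adaptive_seed(text, sa, "AAC", 0, 1))
    print(adaptive_seed(text, sa, "CG", 0, 1))
```

In this toy text, a query starting in the run of A's has to grow to three letters before it matches only one place, whereas a query starting at the unique C stops after one letter. That is the sense in which hit size adapts to repetitiveness.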

