Introduction
SlideSort is a fast and exact method that finds all similar pairs in a pool of strings, where similarity is measured by edit distance.
The input to SlideSort is a set of sequences of almost equal length and an edit-distance threshold.
From the input sequences, SlideSort finds all pairs within the given threshold, without missing any.
SlideSort also accepts a threshold on the maximum number of insertions/deletions.
SlideSort can find similar pairs
- within edit distance d,
- from sequences whose lengths range from a minimum length L to L+d,
- with arbitrary gap length g.
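To make the problem statement concrete, the following is a minimal brute-force sketch of the all-pairs search under an edit-distance threshold d. It is not SlideSort's algorithm: it compares every pair directly and is only practical for small inputs, whereas SlideSort is designed to return the same set of pairs much faster. The function names `edit_distance` and `similar_pairs_bruteforce` are hypothetical and chosen for illustration only; the gap-length threshold g is not modelled here.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def similar_pairs_bruteforce(seqs, d):
    """Return (i, j, distance) for every pair i < j with edit distance <= d.

    Naive reference implementation (an assumption for illustration),
    quadratic in the number of sequences.
    """
    result = []
    for (i, s), (j, t) in combinations(enumerate(seqs), 2):
        if abs(len(s) - len(t)) > d:
            continue  # the length difference alone already exceeds d
        dist = edit_distance(s, t)
        if dist <= d:
            result.append((i, j, dist))
    return result
```

For example, `similar_pairs_bruteforce(["ACGT", "ACGA", "TTTT", "ACG"], 1)` reports the three pairs among sequences 0, 1 and 3, each at distance 1, and leaves out "TTTT".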
Availability
Go to the Download page.
How to use SlideSort
Go to the Usage page.
Applications
- Pre-screening for short-read assembly.
- Clustering large-scale short-read data.
- Constructing minimum spanning trees (MSTs) from all-pairs similarity information.
- An example of MSTs: large MSTs of 10,000,000 short reads obtained by SlideSort (visualized with Walrus).
- An MST construction tool (SSMST) is also available on the Download page.
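As a rough illustration of the MST application, the sketch below builds a spanning structure from a list of similar pairs using Kruskal's algorithm with a union-find. It is not the SSMST implementation. The function name `minimum_spanning_forest` and the `(i, j, distance)` tuple format are assumptions made for this example. Because only pairs within the threshold are available, the similarity graph may be disconnected, in which case the result is a minimum spanning forest rather than a single tree.

```python
def minimum_spanning_forest(n, pairs):
    """Kruskal's algorithm over n sequences and a list of (i, j, distance) pairs.

    Returns the selected edges in the order they were added.
    """
    parent = list(range(n))  # union-find parent pointers

    def find(x):
        # Find the component root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    # Consider the closest pairs first (sorted() is stable for ties).
    for i, j, dist in sorted(pairs, key=lambda p: p[2]):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            forest.append((i, j, dist))
    return forest
```

With five sequences and the pairs `[(0, 1, 1), (0, 3, 1), (1, 3, 1), (2, 4, 2)]`, the edge `(1, 3, 1)` is skipped because sequences 1 and 3 are already connected through sequence 0.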
Contact
- shimizu-kana (AT) aist.go.jp
- Please replace (AT) with the @ sign.
- (slidesort(AT)m.aist.go.jp is no longer valid because the AIST email system was replaced.)
Reference
- Kana Shimizu and Koji Tsuda, "SlideSort: All pairs similarity search for short reads", Bioinformatics (2011) 27 (4): 464-470. Open access article.
Funding
Grant-in-Aid for Young Scientists (22700319) from JSPS
Copyright © 2010-2011 Kana Shimizu