Introduction

SlideSort is a fast and exact method for finding all similar pairs in a pool of strings, where similarity is measured by edit distance.


The input to SlideSort is a set of sequences of almost equal length and an edit-distance threshold.

From the input sequences, SlideSort finds every pair whose edit distance is within the threshold, without missing any.

SlideSort also accepts a threshold on the maximum number of insertions/deletions.


SlideSort can find similar pairs

  • within edit distance d,
  • from sequences whose lengths range from a minimum length L to L+d,
  • with an arbitrary gap length g.
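
To make the problem statement concrete, the following is a minimal Python sketch of the task itself: a naive all-pairs search that compares every pair of sequences by edit distance. It is not SlideSort's algorithm and says nothing about how SlideSort achieves its speed; it only shows what "all pairs within edit distance d" means. The function names `edit_distance` and `similar_pairs_naive` and the sample reads are made up for illustration.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def similar_pairs_naive(seqs, d):
    """Return (i, j, distance) for every pair of sequences within edit distance d.

    Quadratic in the number of sequences: a reference baseline only.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(seqs), 2):
        # A length difference larger than d already implies distance > d.
        if abs(len(a) - len(b)) > d:
            continue
        dist = edit_distance(a, b)
        if dist <= d:
            pairs.append((i, j, dist))
    return pairs


# Hypothetical sample reads, for illustration only.
reads = ["ACGTACGT", "ACGTACGA", "ACGTCGT", "TTTTTTTT"]
print(similar_pairs_naive(reads, 1))
```

A baseline like this is only practical for small inputs, but it can serve as a check when validating results on a small subset of the data.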

Availability

Go to the Download page.

How to use SlideSort

Go to the Usage page.

Application

  • Pre-screening for short-read assembly.
  • Clustering large-scale short-read data.
    • Constructing Minimum Spanning Trees (MSTs) from all-pairs similarity information.
      • An example: large MSTs of 10,000,000 short reads obtained by SlideSort. (Visualized with Walrus)
      • An MST construction tool (SSMST) is also available on the Download page.
[Figure: mst.PNG]
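
As an illustration of the MST application above, here is a minimal Python sketch that builds a minimum spanning forest from a list of similar pairs using Kruskal's algorithm with a union-find structure. This is an assumption about one reasonable way to do it, not a description of how SSMST works. The function name `minimum_spanning_forest` and the `(i, j, distance)` input format are hypothetical.

```python
def minimum_spanning_forest(n, pairs):
    """Kruskal's algorithm over n sequences and (i, j, distance) edges.

    The similarity graph may be disconnected, so the result is a forest:
    one spanning tree per connected component.
    """
    parent = list(range(n))

    def find(x):
        # Find the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    # Consider the closest pairs first (sorted() is stable for ties).
    for i, j, dist in sorted(pairs, key=lambda p: p[2]):
        ri, rj = find(i), find(j)
        if ri != rj:          # the edge joins two different components
            parent[ri] = rj
            forest.append((i, j, dist))
    return forest


# Hypothetical pairs: indices into a read list plus edit distances.
pairs = [(0, 1, 1), (0, 2, 1), (1, 2, 2), (3, 4, 3)]
print(minimum_spanning_forest(5, pairs))
```

Sequences with no similar partner within the threshold stay isolated, so the number of trees in the forest equals the number of connected components of the similarity graph.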

Contact

  • shimizu-kana (AT) aist.go.jp
  • Please replace (AT) with the @ sign.
  • (The address slidesort(AT)m.aist.go.jp is no longer valid because the AIST email system was replaced.)

Reference

  • Kana Shimizu and Koji Tsuda, "SlideSort: all pairs similarity search for short reads", Bioinformatics (2011) 27 (4): 464-470. (Open access article)

Funding

Grant-in-Aid for Young Scientists (22700319) by JSPS


Copyright © 2010-2011 Kana Shimizu



Last-modified: 2015-04-21 (Tue) 05:22:09. Site admin: Kana Shimizu