SATFIND specifications

Information


Unique identifier OMICS_11576
Name SATFIND
Interface Web user interface
Restrictions to use None
Input data DNA sequence(s)
Input format FASTA
Output data The output of the program includes:
  (i) the 10-base sequence that was found repeated at least 10 times in a 2 Kb region;
  (ii) the number of tandem repeats found;
  (iii) the genome coordinates of the region that contains the satellite;
  (iv) the length of genome covered by the satellite, which may be longer than the initial 2 Kb because the program continues searching when repeats are found beyond the end of the 2 Kb region. The detected length is occasionally longer than the actual satellite; this happens when the repeated 10-base motif is also found embedded in unrelated sequences in the neighborhood of the satellite;
  (v) the most frequent size of the motif repeated in tandem in the satellite;
  (vi) a second output file giving the sequences of the repeated motifs in all satellites.
If the repeated motifs show a large variation in size, the satellite is eliminated. In this work, only satellites in which 40% of the motifs have a similar size were accepted, so some very irregular satellites are excluded from the output.
Computer skills Basic
Stability Stable
Maintained Yes
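
The "Output data" description above can be illustrated with a small Python sketch. This is not the SATFIND source code, only a toy reading of the published description: it looks for a 10-base seed repeated at least 10 times within 2 Kb, takes the distance between consecutive seed occurrences as the motif size, and drops satellites in which fewer than 40% of the motifs are close to the most frequent size. The function names, the dictionary keys, the `size_tolerance` parameter, and the simplification of treating the whole input as one candidate region are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

K = 10             # seed length in bases (from the description)
MIN_REPEATS = 10   # minimum number of seed occurrences (from the description)
WINDOW = 2000      # region size in bases, "2 Kb" (from the description)
MIN_REGULAR = 0.4  # fraction of motifs that must have a similar size


def seed_positions(seq, k=K):
    """Map every k-base substring to the sorted list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def find_satellites(seq, k=K, min_repeats=MIN_REPEATS, window=WINDOW,
                    min_regular=MIN_REGULAR, size_tolerance=1):
    """Toy satellite report; assumes seq is one short candidate region."""
    results = []
    for seed, pos in seed_positions(seq, k).items():
        # Do min_repeats consecutive occurrences of the seed fit in one window?
        dense = any(pos[j + min_repeats - 1] + k - pos[j] <= window
                    for j in range(len(pos) - min_repeats + 1))
        if not dense:
            continue
        # Spacing between consecutive seeds is used as the motif size.
        sizes = [b - a for a, b in zip(pos, pos[1:])]
        if not sizes:
            continue
        modal_size = Counter(sizes).most_common(1)[0][0]
        # "Similar size" is an assumption: within size_tolerance of the mode.
        similar = sum(abs(s - modal_size) <= size_tolerance for s in sizes)
        if similar / len(sizes) < min_regular:
            continue  # too irregular, excluded from the output
        results.append({
            "seed": seed,                     # item (i)
            "repeats": len(pos),              # item (ii)
            "start": pos[0],                  # item (iii)
            "end": pos[-1] + k,               # item (iii)
            "length": pos[-1] + k - pos[0],   # item (iv)
            "motif_size": modal_size,         # item (v)
        })
    return results
```

Because every 10-base seed inside a repeated motif qualifies, the sketch reports one entry per seed; a real tool would presumably merge overlapping entries into a single satellite, which is left out here.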

Maintainer


  • Juan A. Subira

Publication for SATFIND

SATFIND citations (2)

De novo assembly of the complex genome of Nippostrongylus brasiliensis using MinION long reads

2018
BMC Biol
PMCID: 5765664
PMID: 29325570
DOI: 10.1186/s12915-017-0473-4

[…] The final assembly contained a much greater diversity of repeat sequences than seen in the WTSI reference sequence (Fig. ). The repeat with the longest unit length (535 bp) as determined by Satfind [], corresponded to a region with ten tandem copies (Fig. ) of an 5S rRNA gene interspersed with an snRNA gene, the source of the spliced leader RNA that is added to many transcripts. The gene […]

Satellite DNA: An Evolving Topic

2017
Genes
PMCID: 5615363
PMID: 28926993
DOI: 10.3390/genes8090230

[…] enomes, which were not intended for processing the billions of short reads generated by Illumina or 454 sequencing in an operative time [,]. Even so, Subirana and Messeguer [] have recently developed SATFIND and used it for the identification and analysis of the satellite families in Caenorhabditis []. Also, Pavlek et al. [] have recovered the use of the Tandem Repeat Finder (TRF) algorithm [] thr […]

SATFIND institution(s)
Departament d’Enginyeria Química, Universitat Politècnica de Catalunya, Barcelona, Spain; Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, Spain
SATFIND funding source(s)
This work was supported in part by grants BFU2009-10380 and TIN2010-21062-C02-01 from the Ministerio de Innovación y Ciencia, Spain.
