CD-HIT specifications

Information


Unique identifier: OMICS_05157
Name: CD-HIT
Software type: Package/Module
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
License: GNU General Public License version 2.0
Computer skills: Advanced
Stability: Stable
Maintained: Yes

Information


Unique identifier: OMICS_05157
Name: CD-HIT
Interface: Web user interface
Restrictions to use: None
Computer skills: Basic
Stability: Stable
Maintained: Yes

CD-HIT articles

CD-HIT citations (31)

2018
PMCID: 5902700

[…] using Trinity's insilico_read_normalization.pl script, with 50x adopted as the minimum coverage value. For the assembly, 300 bp was defined as the minimum size of the assembled contigs. The CD-HIT package (Li and Godzik, 2006) and DustMasker (Morgulis et al., 2006) were used to remove redundant contigs with more than 95% similarity and low-complexity sequences, respectively. […]

2018
PMCID: 5924544

[…] using the following parameters: word size = 54, bubble size = 50, length fraction = 0.8 and similarity fraction = 0.8. To remove redundancy from assemblies, generated contigs were analyzed with CD-HIT-EST (version 4.6) [18,19], using the following parameters: -c 0.85 -n 8. De novo assembled contigs were blasted (using nucleotide Basic Local Alignment Search Tool, BLASTN, […]
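
The excerpt above quotes the parameters -c 0.85 -n 8 for CD-HIT-EST. As a rough illustration only, the following Python sketch assembles such a command line. The helper name build_cd_hit_est_command and the file names contigs.fasta and contigs_nr.fasta are hypothetical; only the -c and -n values come from the excerpt, and -i/-o are assumed here to be the usual input and output options of the tool. The sketch prints the command rather than running it.

```python
import shlex


def build_cd_hit_est_command(input_fasta, output_fasta, identity=0.85, word_size=8):
    """Return an argument list for a cd-hit-est run (hypothetical helper).

    identity  -> value passed to -c (sequence identity threshold)
    word_size -> value passed to -n (word length)
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in the interval (0, 1]")
    return [
        "cd-hit-est",
        "-i", str(input_fasta),   # input FASTA (file name is an assumption)
        "-o", str(output_fasta),  # output FASTA (file name is an assumption)
        "-c", str(identity),
        "-n", str(word_size),
    ]


# Build the command with the parameters quoted in the excerpt.
cmd = build_cd_hit_est_command("contigs.fasta", "contigs_nr.fasta")
print(shlex.join(cmd))
```

Passing the returned list to subprocess.run(cmd, check=True) would execute it on a system where cd-hit-est is installed; that step is left out so the sketch stays runnable anywhere.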

2018
PMCID: 5855608

[…] Subsequently, the retrieved data was organised and curated to verify the authenticity of the experimentally validated anticancer AMPs. Thereafter, Cluster Database at High Identity with Tolerance (CD-HIT) (http://www.bioinformatics.org/cd-hit) was used to remove the experimentally validated anticancer AMPs which are in duplicate. The list of the generated plants experimentally validated […]

2018
PMCID: 5759245

[…] analyses we performed comparing the orthogroup distribution size of the current salmon assembly (at all four filtering steps: unfiltered, after TransDecoder single-best ORF prediction, after CD-HIT clustering at 100% identity and after Trinity full-length transcript analysis (e.g. final version)) against the NCBI Atlantic salmon RefSeq proteins. Given the high quality of the recently […]

2017
PMCID: 5684102

[…] al., 2011) resulted in 61,563 transcript contigs with N50 value of 2.4 kb, with average length of 1.9 kb. The transcripts were further analyzed using Cluster Database at High Identity with -EST, i.e. CD-HIT-EST software (Nakasugi et al., 2013). A total of 61,563 transcripts were clustered into 39,330 unigenes using CD-HIT-EST (Li and Godzik, 2006; Nakasugi et al., 2013). The clustered unigenes […]
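
To make the clustering figures in the excerpt above concrete, here is a small Python sketch that counts clusters and members in cluster-report text and computes the fraction of sequences removed. It assumes the report uses ">Cluster N" header lines followed by one line per member; the sample text and the function names summarize_clusters and reduction_fraction are hypothetical. Only the counts 61,563 and 39,330 come from the excerpt.

```python
def summarize_clusters(clstr_text):
    """Count clusters and member sequences in cluster-report text.

    Assumes each cluster starts with a line beginning '>Cluster' and that
    every other non-empty line describes one member sequence.
    """
    clusters = 0
    members = 0
    for raw_line in clstr_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters += 1
        else:
            members += 1
    return clusters, members


def reduction_fraction(n_input, n_clusters):
    """Fraction of input sequences removed when one representative is kept per cluster."""
    return 1.0 - n_clusters / n_input


# Hypothetical sample report: two clusters holding three sequences in total.
sample = """>Cluster 0
0\t1500nt, >contig_1... *
1\t1200nt, >contig_2... at +/97.50%
>Cluster 1
0\t900nt, >contig_3... *
"""
n_clusters, n_members = summarize_clusters(sample)

# Counts quoted in the excerpt: 61,563 transcripts clustered into 39,330 unigenes.
removed = reduction_fraction(61563, 39330)
print(n_clusters, n_members, round(removed, 3))
```

With the quoted counts, about 36% of the transcripts are absorbed into a cluster represented by another sequence.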

CD-HIT institution(s)
Center for Research in Biological Systems, University of California San Diego, La Jolla, CA, USA
