CD-HIT statistics

CD-HIT specifications

Information


Unique identifier: OMICS_05157
Name: CD-HIT
Alternative names: CD-HIT-EST, cdhit-est
Software type: Package/Module
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
License: GNU General Public License version 2.0
Computer skills: Advanced
Stability: Stable
Maintained: No

Information


Unique identifier: OMICS_05157
Name: CD-HIT
Alternative names: CD-HIT-EST, cdhit-est
Interface: Web user interface
Restrictions to use: None
Computer skills: Basic
Stability: Stable
Maintained: No

Publications for CD-HIT

CD-HIT in pipelines (345)
2018
PMCID: 5749766
PMID: 29293601
DOI: 10.1371/journal.pone.0190266

[…] in the microbiome by using FragGeneScan. The predicted proteins in each sample were clustered with the proteins in 13 complete reference Synechococcus. In order to analyze the various aspects, CD-HIT clustering was conducted with two different options of protein sequence similarity and shorter sequence coverage options: 70 over 70 (i.e. 70% sequence similarity over 70% of the shorter […]
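
The "70 over 70" criterion in the excerpt above can be read as two independent thresholds on a pairwise alignment: identity and coverage of the shorter sequence. The following is a minimal Python sketch of that reading, not CD-HIT's own implementation. The function name is hypothetical, and computing identity over the aligned length is an assumption made for illustration.

```python
def passes_threshold(identical, aligned_len, len_a, len_b,
                     min_identity=0.70, min_coverage=0.70):
    """Check a pairwise alignment against identity and coverage cutoffs.

    identical   -- number of identical positions in the alignment
    aligned_len -- number of aligned positions
    len_a/len_b -- full lengths of the two sequences
    Assumption: identity is measured over the aligned length, and
    coverage is the aligned fraction of the shorter sequence.
    """
    shorter = min(len_a, len_b)
    if aligned_len <= 0 or shorter <= 0:
        return False
    identity = identical / aligned_len
    coverage = aligned_len / shorter
    return identity >= min_identity and coverage >= min_coverage
```

Under these assumptions, an alignment of 100 positions with 80 identities between sequences of length 120 and 300 passes (identity 0.80, coverage about 0.83), while an alignment covering only 80 of the 120 positions of the shorter sequence fails the coverage cutoff.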

2018
PMCID: 5759245
PMID: 29310597
DOI: 10.1186/s12864-017-4379-x

[…] species, resulting in sets of 60,856 transcripts for Atlantic salmon, 60,943 for brown trout, 55,674 for Arctic charr and 57,734 for European whitefish. We clustered the remaining sequences with CD-HIT (100% amino acid identity), which collapsed around 12% of the transcripts. The resulting non-redundant assemblies consisted of 53,547 transcripts for Atlantic salmon, 53,804 for brown trout, […]

2018
PMCID: 5759245
PMID: 29310597
DOI: 10.1186/s12864-017-4379-x

[…] analyses we performed comparing the orthogroup distribution size of the current salmon assembly (at all four filtering steps: unfiltered, after TransDecoder single-best ORF prediction, after CD-HIT clustering at 100% identity and after Trinity full-length transcript analysis (e.g. final version)) against the NCBI Atlantic salmon RefSeq proteins. Given the high quality of the recently […]

2018
PMCID: 5766104
PMID: 29329292
DOI: 10.1371/journal.pone.0189898

[…] universal single-copy orthologs version 1.1 was used and the RSEM-EVAL package distributed with DETONATE represented our reference-free evaluation method to calculate assembly scores. Because CD-HIT reduced our BUSCO scores, we finally filtered the raw assembly by applying RSEM-EVAL’s contig impact score. Contigs with impact scores less or equal than zero were removed […]

2018
PMCID: 5769438
PMID: 29335005
DOI: 10.1186/s40168-017-0387-y

[…] ResFinder, and BacMet. These antibiotic resistance databases were combined within a non-redundant set. Proteins were clustered in protein families by homology, using CD-HIT with parameters of 80% identity and 80% coverage. Each protein family was aligned by MUSCLE v. 3.7 with default parameters and a hidden Markov model (HMM) was built for each family […]



CD-HIT in publications (1547)
PMCID: 5945678
PMID: 29748645
DOI: 10.1038/s41598-018-25826-6

[…] from 19–71 based on good quality read sequences and corresponding contigs/scaffolds were produced for each k-mer. Tool GapCloser was used to close the gaps emerged during the scaffolding process. CD-HIT-EST version 4.6, a clustering program was used to search similar sequences with minimum similarity cut-off of 90% (http://weizhongli-lab.org/cd-hit). Another clustering step was performed […]

PMCID: 5954267
PMID: 29764375
DOI: 10.1186/s12864-018-4463-x

[…] without draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length […]

PMCID: 5943448
PMID: 29743490
DOI: 10.1038/s41598-018-25368-x

[…] were retained for further analysis. High quality reads were assembled via the Trinity program, which generated 224,631 contigs (Fig.  and Table ). To reduce the number of redundant sequences, the CD-HIT-EST program was used to cluster similar sequences at a threshold set at 95% nucleotide similarity. The output from the CD-HIT-EST program (205,243 contigs) was used for filtering […]

PMCID: 5930961
PMID: 29720106
DOI: 10.1186/s12864-018-4684-z

[…] out sequences with unknown nucleotides or low quality (quality scores < 20), de novo assembly of the reads was performed using Trinity with default settings. The resultant data were processed with CD-HIT-EST to eliminate redundancy. The protein sequences and the corresponding coding DNA sequences (CDS) of the following fish genomes were downloaded from Ensembl database: G. morhua […]

PMCID: 5931083
PMID: 29718005
DOI: 10.1038/sdata.2018.79

[…] genome using QUAST v2.3, and those unaligned contigs were then selected. All the unaligned contigs with length >500 bp from each rice accession were merged into a single sequence set. Next, CD-HIT v4.6.123 was used to remove redundant sequences at an identity cut-off of 90% with command “cd-hit-est -i input.fa -o output.fa -c 0.9 -T 16 -M 50000”. This process was carried out for 3 times […]
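
The command quoted in the excerpt above can be assembled programmatically. The sketch below builds the argument list in Python without running anything. The helper name is hypothetical, and the thread and memory options are spelled `-T` and `-M` as in the CD-HIT user guide, since CD-HIT options are case-sensitive. The defaults simply mirror the values in the excerpt.

```python
import shlex


def build_cd_hit_est_command(input_fasta, output_fasta,
                             identity=0.9, threads=16, memory_mb=50000):
    """Return a cd-hit-est argument list (hypothetical helper).

    -i / -o : input and output FASTA files
    -c      : sequence identity threshold (fraction, e.g. 0.9 for 90%)
    -T      : number of threads
    -M      : memory limit in MB
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "cd-hit-est",
        "-i", input_fasta,
        "-o", output_fasta,
        "-c", str(identity),
        "-T", str(threads),
        "-M", str(memory_mb),
    ]


if __name__ == "__main__":
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(build_cd_hit_est_command("input.fa", "output.fa")))
```

The returned list can be passed to `subprocess.run` on a machine where `cd-hit-est` is installed; keeping the arguments as a list avoids shell-quoting problems with unusual file names.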



CD-HIT institution(s)
Center for Research in Biological Systems, University of California San Diego, La Jolla, CA, USA
