CD-HIT statistics

Citations per year

Number of citations per year for the bioinformatics software tool CD-HIT

Tool usage distribution map

This map represents all the scientific publications referring to CD-HIT per scientific context

Associated diseases

This word cloud represents CD-HIT usage per disease context

Popular tool citations

Gene expression clustering, Read clustering, Redundancy reduction, Protein clustering

CD-HIT specifications

Information


Unique identifier OMICS_05157
Name CD-HIT
Alternative names CD-HIT-EST, cdhit-est
Software type Package/Module
Interface Command line interface
Restrictions to use None
Operating system Unix/Linux
License GNU General Public License version 2.0
Computer skills Advanced
Stability Stable
Maintained No

Download


Debian
Conda

Versioning


No version available

Information


Unique identifier OMICS_05157
Name CD-HIT
Alternative names CD-HIT-EST, cdhit-est
Interface Web user interface
Restrictions to use None
Computer skills Basic
Stability Stable
Maintained No

Publications for CD-HIT

CD-HIT citations (1291)

Transcriptome and Co-expression Network Analyses Identify Key Genes Regulating Nitrogen Use Efficiency in Brassica juncea L.

2018
Sci Rep
PMCID: 5945678
PMID: 29748645
DOI: 10.1038/s41598-018-25826-6

[…] ing from 19–71 based on good quality read sequences and corresponding contigs/scaffolds were produced for each k-mer. Tool GapCloser was used to close the gaps emerged during the scaffolding process. CD-HIT-EST version 4.6, a clustering program was used to search similar sequences with minimum similarity cut-off of 90% (http://weizhongli-lab.org/cd-hit). Another clustering step was performed using […]

The aquatic animals’ transcriptome resource for comparative functional analysis

2018
BMC Genomics
PMCID: 5954267
PMID: 29764375
DOI: 10.1186/s12864-018-4463-x

[…] Date: 2007-10-15 with an overlap length cutoff of 200 and overlap percent identity cutoff of 99 (-o 200 -p 99). To obtain comprehensive transcriptomes from various assemblers, we used the CD-HIT-EST v4.6 cluster tool with a sequence identity cutoff of 90% to merge results from Oases, SOAPdenovo-Trans, and Trinity. […]

Transcriptomic analysis of crustacean molting gland (Y organ) regulation via the mTOR signaling pathway

2018
Sci Rep
PMCID: 5943448
PMID: 29743490
DOI: 10.1038/s41598-018-25368-x

[…] tware with default settings (version number: 2.0.1). The minimum contig length was set at 201 bp. Following assembly, the contigs were clustered based on a 95% sequence similarity threshold using the CD-HIT-EST program (version number: 4.6.1). The assembled transcriptome is available in fasta format in the Cyverse Discovery environment at: https://de.cyverse.org/dl/d/915A4462-D13E-443C-88F […]

Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping

2018
Front Microbiol
PMCID: 5945981
PMID: 29780368
DOI: 10.3389/fmicb.2018.00836

[…] PHASTER were extracted from each genome and every sequence was renamed by adding the sample name to the sequence header. All identified prophage regions from the strains analyzed were clustered using CD-HIT-EST (Fu et al.) based on 90% identity and 90% prophage sequence length coverage. Increasing the stringency to identity and sequence coverage of 95, 99, and 100% did not improve the clustering […]
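
The header-renaming step mentioned in this excerpt can be sketched in Python as follows. This is a minimal illustration and not the authors' script: the function name `prefix_fasta_headers`, the underscore separator, and the sample name used in the demo are all assumptions.

```python
def prefix_fasta_headers(fasta_text: str, sample: str, sep: str = "_") -> str:
    """Return FASTA text with `sample` prepended to every header line.

    Hypothetical helper: the separator and naming scheme are assumptions,
    not taken from the cited publication.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Keep the original identifier and description after the prefix.
            out.append(f">{sample}{sep}{line[1:]}")
        else:
            # Sequence lines are passed through unchanged.
            out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if fasta_text.endswith("\n") else "")


if __name__ == "__main__":
    example = ">prophage_1\nACGT\n>prophage_2\nTTGA\n"
    print(prefix_fasta_headers(example, "sampleA"), end="")
```

Prefixing headers this way keeps sequences from different samples distinguishable after they are pooled into one file for clustering.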

Novel sequences, structural variations and gene presence variations of Asian cultivated rice

2018
Sci Data
PMCID: 5931083
PMID: 29718005
DOI: 10.1038/sdata.2018.79

[…] reference genome using QUAST v2.3, and those unaligned contigs were then selected. All the unaligned contigs with length >500 bp from each rice accession were merged into a single sequence set. Next, CD-HIT v4.6.1 was used to remove redundant sequences at an identity cut-off of 90% with command “cd-hit-est -i input.fa -o output.fa -c 0.9 -T 16 -M 50000”. This process was carried out for 3 times […]
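
For readers who want to script the command quoted in this excerpt, the following Python sketch assembles the same argument list. The helper names `build_cd_hit_est_cmd` and `run_if_available` are hypothetical, and the range check on the identity value is only a loose sanity check added here, not a rule taken from CD-HIT. The flags themselves (`-i` input, `-o` output, `-c` identity threshold, `-T` threads, `-M` memory limit in MB) are those shown in the quoted command.

```python
import shutil
import subprocess


def build_cd_hit_est_cmd(input_fa, output_fa, identity=0.9, threads=16, memory_mb=50000):
    """Assemble the argument list for the cd-hit-est call quoted above.

    Hypothetical helper; defaults mirror the quoted command.
    """
    if not 0.0 < identity <= 1.0:
        # Loose sanity check (an assumption of this sketch).
        raise ValueError("identity must be in (0, 1]")
    return [
        "cd-hit-est",
        "-i", str(input_fa),
        "-o", str(output_fa),
        "-c", str(identity),
        "-T", str(threads),
        "-M", str(memory_mb),
    ]


def run_if_available(cmd):
    """Run the command only if its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True
```

Calling `" ".join(build_cd_hit_est_cmd("input.fa", "output.fa"))` reproduces the command string from the excerpt. Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names.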

MIKCC-type MADS-box genes in Rosa chinensis: the remarkable expansion of ABCDE model genes and their roles in floral organogenesis

2018
PMCID: 5928068
PMID: 29736250
DOI: 10.1038/s41438-018-0031-4

[…] luding SMART (http://smart.embl-heidelberg.de/) and Pfam to confirm the integrity of the MADS-box domains. Finally, the corresponding nucleotide sequences of all candidate genes were submitted to the CD-HIT Suite server, with 90% sequence identity cut-off (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=Server%20home) to remove redundant sequences along with manual checking (Supplem […]


CD-HIT institution(s)
Center for Research in Biological Systems, University of California San Diego, La Jolla, CA, USA
