Tools in the Bioinformatics Toolbox

Tool	Description
R	The R Project for Statistical Computing
BioPerl	The Bioperl Project is an international association of users & developers of open source Perl tools for bioinformatics, genomics and life science
Biopython	The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology
NCBI BLAST+	Basic Local Alignment Search Tool
DIAMOND	Fast and sensitive protein alignment using DIAMOND
HMMER	Accelerated Profile HMM Searches
CD-HIT	CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences
VSEARCH	VSEARCH: a versatile open source tool for metagenomics
MUSCLE	MUSCLE: multiple sequence alignment with high accuracy and high throughput
MAFFT	MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
JAligner	Open-source Java implementation of the Smith-Waterman algorithm for biological pairwise sequence alignment
BWA	Fast and accurate short read alignment with Burrows–Wheeler transform
HISAT2	Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype
Bowtie2	Fast gapped-read alignment with Bowtie 2
STAR	STAR: ultrafast universal RNA-seq aligner
Salmon	Alignment and mapping methodology influence transcript abundance estimation
kallisto	Near-optimal probabilistic RNA-seq quantification
BBMap	BBMerge – Accurate paired shotgun read merging via overlap
FASTX	Collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing
Trimmomatic	Trimmomatic: a flexible trimmer for Illumina sequence data
SeqKit	SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
seqtk	Toolkit for processing sequences in FASTA/Q formats
fastp	fastp: an ultra-fast all-in-one FASTQ preprocessor
HTStream	A toolset for high throughput sequence analysis using a streaming approach facilitated by Linux pipes
fqtrim	Trimming and filtering of next-gen reads
TreeTime	TreeTime: Maximum-likelihood phylodynamic analysis
FastTree	FastTree 2 – approximately maximum-likelihood trees for large alignments
RAxML	RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
RAxML-NG	RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference
PhyML	Estimating maximum likelihood phylogenies with PhyML
Pplacer	pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree
SAMtools	The Sequence Alignment/Map format and SAMtools
BCFtools	BCFtools/csq: haplotype-aware variant consequences (https://www.ncbi.nlm.nih.gov/pubmed/28205675)
Bamtools	BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files
VCFtools	The Variant Call Format and VCFtools
BEDTools	BEDTools: a flexible suite of utilities for comparing genomic features
deepTools	deepTools2: a next generation web server for deep-sequencing data analysis
BEDOPS	BEDOPS: high-performance genomic feature operations
Sambamba	Sambamba: fast processing of NGS alignment formats
SPAdes	SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing
ABySS	ABySS: a parallel assembler for short read sequence data
Velvet	Velvet: algorithms for de novo short read assembly using de Bruijn graphs
MEGAHIT	MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
MetaVelvet	MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads
Prodigal	Prodigal: prokaryotic gene recognition and translation initiation site identification
Infernal	Infernal: inference of RNA alignments
antiSMASH	antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters
DeepBGC	A deep learning genome-mining strategy for biosynthetic gene cluster prediction
GECCO	Accurate de novo identification of biosynthetic gene clusters with GECCO
Miniconda	Package, dependency and environment management for any language
Nextflow	Nextflow enables reproducible computational workflows
GATK	The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data
Centrifuge	Centrifuge: rapid and sensitive classification of metagenomic sequences
Pavian	Pavian: interactive analysis of metagenomics data for microbiome studies and pathogen identification
Kraken2	Metagenome analysis using the Kraken software suite
Bracken	Bracken: estimating species abundance in metagenomics data
MetaPhlAn	Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4
HUMAnN	Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3
mothur	Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
UCHIME	UCHIME improves sensitivity and speed of chimera detection