bioinformatics-toolbox
Docker container for bioinformatics tools and pipelines
Project maintained by ahmedmoustafa
Tools in the Bioinformatics Toolbox
| Tool | Description |
| --- | --- |
| R | The R Project for Statistical Computing |
| BioPerl | The Bioperl Project is an international association of users & developers of open source Perl tools for bioinformatics, genomics and life science |
| Biopython | The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology |
| NCBI BLAST+ | Basic Local Alignment Search Tool |
| DIAMOND | Fast and sensitive protein alignment using DIAMOND |
| HMMER | Accelerated Profile HMM Searches |
| CD-HIT | CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences |
| VSEARCH | VSEARCH: a versatile open source tool for metagenomics |
| MUSCLE | MUSCLE: multiple sequence alignment with high accuracy and high throughput |
| MAFFT | MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform |
| JAligner | Open-source Java implementation of the Smith-Waterman algorithm for biological pairwise sequence alignment |
| BWA | Fast and accurate short read alignment with Burrows–Wheeler transform |
| HISAT2 | Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype |
| Bowtie2 | Fast gapped-read alignment with Bowtie 2 |
| STAR | STAR: ultrafast universal RNA-seq aligner |
| Salmon | Alignment and mapping methodology influence transcript abundance estimation |
| kallisto | Near-optimal probabilistic RNA-seq quantification |
| BBMap | BBMerge – Accurate paired shotgun read merging via overlap |
| FASTX | Collection of command-line tools for preprocessing short-read FASTA/FASTQ files |
| Trimmomatic | Trimmomatic: a flexible trimmer for Illumina sequence data |
| SeqKit | SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation |
| seqtk | Toolkit for processing sequences in FASTA/Q formats |
| fastp | fastp: an ultra-fast all-in-one FASTQ preprocessor |
| HTStream | A toolset for high throughput sequence analysis using a streaming approach facilitated by Linux pipes |
| fqtrim | Trimming and filtering of next-generation sequencing reads |
| TreeTime | TreeTime: Maximum-likelihood phylodynamic analysis |
| FastTree | FastTree 2: approximately maximum-likelihood trees for large alignments |
| RAxML | RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies |
| RAxML-NG | RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference |
| PhyML | Estimating maximum likelihood phylogenies with PhyML |
| Pplacer | pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree |
| SAMtools | The Sequence Alignment/Map format and SAMtools |
| BCFtools | Utilities for variant calling and manipulating VCF and BCF files (https://www.ncbi.nlm.nih.gov/pubmed/28205675) |
| Bamtools | BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files |
| VCFtools | The Variant Call Format and VCFtools |
| BEDTools | BEDTools: a flexible suite of utilities for comparing genomic features |
| deepTools | deepTools2: a next generation web server for deep-sequencing data analysis |
| BEDOPS | BEDOPS: high-performance genomic feature operations |
| Sambamba | Sambamba: fast processing of NGS alignment formats |
| SPAdes | SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing |
| ABySS | ABySS: a parallel assembler for short read sequence data |
| Velvet | Velvet: algorithms for de novo short read assembly using de Bruijn graphs |
| MEGAHIT | MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph |
| MetaVelvet | MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads |
| Prodigal | Prodigal: prokaryotic gene recognition and translation initiation site identification |
| Infernal | Inference of RNA alignments |
| antiSMASH | antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters |
| DeepBGC | A deep learning genome-mining strategy for biosynthetic gene cluster prediction |
| GECCO | Accurate de novo identification of biosynthetic gene clusters with GECCO |
| Miniconda | Package, dependency and environment management for any language |
| Nextflow | Nextflow enables reproducible computational workflows |
| GATK | The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data |
| Centrifuge | Centrifuge: rapid and sensitive classification of metagenomic sequences |
| Pavian | Pavian: interactive analysis of metagenomics data for microbiome studies and pathogen identification |
| Kraken2 | Metagenome analysis using the Kraken software suite |
| Bracken | Bracken: estimating species abundance in metagenomics data |
| MetaPhlAn | Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4 |
| HUMAnN | Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3 |
| mothur | Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities |
| UCHIME | UCHIME improves sensitivity and speed of chimera detection |