bioinformatics-toolbox

Docker container for bioinformatics tools and pipelines


Project maintained by ahmedmoustafa

Tools in the Bioinformatics Toolbox

Tool Description
R The R Project for Statistical Computing
BioPerl The Bioperl Project is an international association of users & developers of open source Perl tools for bioinformatics, genomics and life science
Biopython The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology
NCBI BLAST+ Basic Local Alignment Search Tool
DIAMOND Fast and sensitive protein alignment using DIAMOND
HMMER Accelerated Profile HMM Searches
CD-HIT CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences
VSEARCH VSEARCH: a versatile open source tool for metagenomics
MUSCLE MUSCLE: multiple sequence alignment with high accuracy and high throughput
MAFFT MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
JAligner Open-source Java implementation of the Smith-Waterman algorithm for biological pairwise sequence alignment
BWA Fast and accurate short read alignment with Burrows–Wheeler transform
HISAT2 Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype
Bowtie2 Fast gapped-read alignment with Bowtie 2
STAR STAR: ultrafast universal RNA-seq aligner
Salmon Alignment and mapping methodology influence transcript abundance estimation
kallisto Near-optimal probabilistic RNA-seq quantification
BBMap BBMerge: accurate paired shotgun read merging via overlap
FASTX Collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing
Trimmomatic Trimmomatic: a flexible trimmer for Illumina sequence data
SeqKit SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
seqtk Toolkit for processing sequences in FASTA/Q formats
fastp fastp: an ultra-fast all-in-one FASTQ preprocessor
HTStream A toolset for high throughput sequence analysis using a streaming approach facilitated by Linux pipes
fqtrim Trimming and filtering of next-gen reads
TreeTime TreeTime: Maximum-likelihood phylodynamic analysis
FastTree FastTree 2: approximately maximum-likelihood trees for large alignments
RAxML RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
RAxML-NG RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference
PhyML Estimating maximum likelihood phylogenies with PhyML
Pplacer pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree
SAMtools The Sequence Alignment/Map format and SAMtools
BCFtools Utilities for variant calling and for manipulating VCF and BCF files (https://www.ncbi.nlm.nih.gov/pubmed/28205675)
Bamtools BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files
VCFtools The Variant Call Format and VCFtools
BEDTools BEDTools: a flexible suite of utilities for comparing genomic features
deepTools deepTools2: a next generation web server for deep-sequencing data analysis
BEDOPS BEDOPS: high-performance genomic feature operations
Sambamba Sambamba: fast processing of NGS alignment formats
SPAdes SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing
ABySS ABySS: a parallel assembler for short read sequence data
Velvet Velvet: algorithms for de novo short read assembly using de Bruijn graphs
MEGAHIT MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
MetaVelvet MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads
Prodigal Prodigal: prokaryotic gene recognition and translation initiation site identification
Infernal Inference of RNA alignments
antiSMASH antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters
DeepBGC A deep learning genome-mining strategy for biosynthetic gene cluster prediction
GECCO Accurate de novo identification of biosynthetic gene clusters with GECCO
Miniconda Package, dependency and environment management for any language
Nextflow Nextflow enables reproducible computational workflows
GATK The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data
Centrifuge Centrifuge: rapid and sensitive classification of metagenomic sequences
Pavian Pavian: interactive analysis of metagenomics data for microbiome studies and pathogen identification
Kraken2 Metagenome analysis using the Kraken software suite
Bracken Bracken: estimating species abundance in metagenomics data
MetaPhlAn Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4
HUMAnN Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3
mothur Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
UCHIME UCHIME improves sensitivity and speed of chimera detection
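
With this many tools in one image, a quick check of which executables are actually on the PATH can be useful after starting the container. The following is a minimal sketch, not part of the project itself. It assumes Python 3 is available inside the container and that each tool is installed under its usual executable name. The mapping below is illustrative, covers only a subset of the table, and may need adjusting to match the image.

```python
import shutil

# Assumed mapping from tool names in the table above to the executable
# each tool conventionally installs. This is an illustration only; the
# image may install them under different names or locations.
EXPECTED_EXECUTABLES = {
    "NCBI BLAST+": "blastn",
    "DIAMOND": "diamond",
    "HMMER": "hmmsearch",
    "VSEARCH": "vsearch",
    "MAFFT": "mafft",
    "BWA": "bwa",
    "HISAT2": "hisat2",
    "Bowtie2": "bowtie2",
    "Salmon": "salmon",
    "kallisto": "kallisto",
    "SeqKit": "seqkit",
    "seqtk": "seqtk",
    "fastp": "fastp",
    "SAMtools": "samtools",
    "BCFtools": "bcftools",
    "BEDTools": "bedtools",
    "MEGAHIT": "megahit",
    "Prodigal": "prodigal",
    "Nextflow": "nextflow",
}


def find_tools(executables=EXPECTED_EXECUTABLES):
    """Return {tool name: resolved path, or None if not on PATH}."""
    return {tool: shutil.which(exe) for tool, exe in executables.items()}


def report(found):
    """Format the lookup result as one aligned line per tool."""
    lines = []
    for tool, path in sorted(found.items()):
        status = path if path else "NOT FOUND"
        lines.append(f"{tool:<12} {status}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(find_tools()))
```

Run inside the container, the script prints one line per tool with either the resolved path or NOT FOUND. It only confirms that an executable is on the PATH and does not check versions or that the tool runs correctly.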