Research

Research themes, representative work, and software tools

Our group applies computational, statistical, and machine-learning methods to large-scale genomic, metagenomic, and transcriptomic data. We are broadly interested in precision and population genomics and in how microbial communities shape the health of their hosts and environments.

Precision and Population Genomics

We contribute to the bioinformatics effort for the Egypt Genome Project, building and applying pipelines for variant calling, population structure analysis, ancestry inference, and multi-omics integration at national scale. A central question is how population-specific genetic variation should inform precision medicine: for example, whether polygenic risk models developed in other populations transfer to Egyptian and North African genomes.

Representative work: the Egypt Genome Project (Nature Genetics, 2024) and EGP1K, a whole-genome analysis of 1,024 Egyptians characterizing population structure and genetic diversity (bioRxiv, 2026).

Machine Learning for Genomics

We develop deep-learning and statistical models for problems in genomics and diagnostics, including taxonomic classification, pathogen detection, and predictive modeling from multi-omics data. Much of the challenge is making these models not just accurate but interpretable and easy to run in everyday analysis pipelines.

Representative work: DeepTaxa, a hybrid CNN–BERT framework for 16S rRNA taxonomic classification (Bioinformatics Advances, 2026; code).

Microbiome, Metagenomics, and Viromes

We study how microbial and viral communities relate to health and disease across a range of systems — from the human gut, oral, and urinary microbiomes to environmental and marine communities — using metagenomic and transcriptomic approaches. Our goal is to understand how these communities differ across populations, diseases, and environments, and how those differences can be translated into reliable biomarkers.

Representative work: the blood DNA virome in 8,000 humans (PLOS Pathogens, 2017), the microbial metagenome of urinary tract infection (Scientific Reports, 2018), and host genome and gut microbiome in inflammatory bowel disease (Clinical and Translational Gastroenterology, 2018).

Comparative and Evolutionary Genomics

A long-running thread of our work examines genome evolution, endosymbiosis, and horizontal gene transfer across the tree of life, and the phylogenomic methods needed to study them.

Representative work: genomic footprints of a cryptic plastid endosymbiosis in diatoms (Science, 2009) and an aerobic eukaryotic parasite with functional mitochondria that likely lacks a mitochondrial genome (Science Advances, 2019).

Tools and Software

We build open bioinformatics tools and pipelines that other researchers can pick up and reuse, including DeepTaxa (16S taxonomy) and earlier tools such as JAligner and PhyloSort. See the Software & Tools section of the CV for the full list.


For the lab and its alumni, see the Systems Genomics Lab; for the full publication record, see Publications.