EMerAld (EMA)

Massachusetts Institute of Technology (MIT)
Computer Science and Artificial Intelligence Laboratory (CSAIL)
Theory of Computation Group (TOC)
Computation and Biology Group (CompBio)

Email queries: bab@mit.edu


EMA is an alignment tool for barcoded short-read sequencing data, such as those produced by 10x Genomics' Chromium platform. EMA is faster and more accurate than current aligners, and it produces not only the final alignments but also interpretable per-alignment probabilities.

EMA takes a set of barcoded FASTQs as input, preprocesses them into a series of barcode buckets that can be processed in parallel, and produces a standard SAM/BAM file as output.
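
The bucketing idea can be illustrated with a minimal Python sketch. This is not EMA's implementation: the function names, the CRC32-based bucket assignment, the bucket count, and the placeholder "alignment" step (which only counts reads) are all assumptions made for illustration. The point it shows is that every read sharing a barcode lands in the same bucket, so buckets can be processed independently and in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib


def bucket_reads(reads, n_buckets):
    """Group (barcode, sequence) pairs so that all reads sharing a
    barcode end up in the same bucket (hypothetical scheme)."""
    buckets = [defaultdict(list) for _ in range(n_buckets)]
    for barcode, seq in reads:
        # Stable hash of the barcode picks the bucket (an assumption,
        # not necessarily how EMA assigns barcodes to buckets).
        idx = zlib.crc32(barcode.encode("ascii")) % n_buckets
        buckets[idx][barcode].append(seq)
    return buckets


def process_bucket(bucket):
    """Placeholder for per-bucket alignment: returns the number of
    reads seen for each barcode in this bucket."""
    return {barcode: len(seqs) for barcode, seqs in bucket.items()}


def run(reads, n_buckets=4):
    """Bucket the reads, process each bucket in parallel, merge results."""
    buckets = bucket_reads(reads, n_buckets)
    with ThreadPoolExecutor(max_workers=n_buckets) as pool:
        results = list(pool.map(process_bucket, buckets))
    merged = {}
    for result in results:
        # Barcodes never span buckets, so keys cannot collide here.
        merged.update(result)
    return merged


# Toy input: (barcode, read sequence) pairs.
reads = [("AAAC", "ACGT"), ("AAAC", "TTGA"), ("GGTA", "CCAT")]
counts = run(reads)
```

Because a barcode is never split across buckets, the per-bucket results can be merged without any reconciliation step, which is what makes the buckets safe to process in parallel.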

Source code and documentation are available on GitHub at https://github.com/arshajii/ema. Detailed guidelines and resources for reproducing our results are available here.

Ariya Shajii, Ibrahim Numanagić, Bonnie Berger; Latent variable model for aligning barcoded short-reads improves downstream analyses.


With brew 🍺
brew install brewsci/bio/ema
With conda 🐍
conda install -c bioconda ema
From source 🛠
git clone --recursive https://github.com/arshajii/ema
cd ema

Test Datasets

10x data
Other data