Hopper is a mathematically motivated sketching algorithm for single-cell data, which preserves transcriptional diversity using a farthest-first traversal on the data. The algorithm will be presented at ISMB 2020, will appear in an upcoming issue of Bioinformatics, and is available as a pre-print on bioRxiv.
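To make the idea of a farthest-first traversal concrete, here is a minimal generic sketch in pure Python. It is not the Hopper implementation: the function name `farthest_first_traversal`, the Euclidean distance, the random starting cell, and the toy data are all assumptions made for illustration. Each step adds the point that lies farthest from everything selected so far, which is why an isolated (rare) point tends to be picked early.

```python
import math
import random


def farthest_first_traversal(points, k, seed=0):
    """Return indices of k points chosen by a generic farthest-first traversal.

    Illustrative only: assumes Euclidean distance and a random starting point.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    chosen = [first]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Pick the point that is farthest from the current sketch.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        # Update nearest-chosen distances to account for the new point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


# Toy data (hypothetical): 50 "cells" in a dense cluster plus one distant rare cell.
points = [(i * 0.01, 0.0) for i in range(50)] + [(10.0, 10.0)]
sketch = farthest_first_traversal(points, k=3)
print(sketch)
```

In this toy example the rare cell at index 50 ends up in the sketch even though it makes up about 2% of the data, whereas a uniform random sample of three cells would usually miss it. This naive version costs on the order of n·k distance computations, and it assumes the points are distinct.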

Here we provide sketches of ten of the largest publicly available single-cell datasets, produced with the help of Brian Hie. Each download contains a directory with two files: a sketched dataset of 20,000 cells and a table of gene labels from the corresponding study. The sketches allow rapid analysis of these datasets without the heavy computational burden of processing the full data, and without losing rare cell populations.

| Original Study | Original Data Accession | Sketched Data |
| --- | --- | --- |
| Cao Mouse Embryo | GSE119945 | Download |
| Moffitt Mouse Brain | GSE113576 | Download |
| Saunders Mouse Brain | GSE116470 | Download |
| Zeisel Mouse Brain | SRP135960 | Download |
| Han Mouse Organs | GSE108097 | Download |
| Smillie Human Colon | SCP259 | Download |
| Kanton Cerebral Organoids | E-MTAB-7552 | Download |
| Popescu Human Organs | E-MTAB-7407 | Download |
| Guo Mouse Culture | GSE103221 | Download |
| Pijuan-Sala Mouse Embryo | E-MTAB-6967 | Download |