Hopper is a mathematically motivated sketching algorithm for single-cell data that preserves transcriptional diversity by performing a farthest-first traversal of the data. The algorithm will be presented at ISMB 2020, will appear in an upcoming issue of Bioinformatics, and is available as a pre-print on bioRxiv.
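To illustrate the idea behind a farthest-first traversal, here is a minimal Python sketch. It is not the Hopper implementation: the function name `farthest_first_traversal`, the use of Euclidean distance, the random starting point, and the toy data are all assumptions made for illustration, and the published algorithm may differ in its distance measure, initialization, and speed-ups.

```python
import math
import random


def farthest_first_traversal(points, k, seed=0):
    """Return the indices of k points chosen by greedy farthest-first traversal.

    Hypothetical helper for illustration only; assumes Euclidean distance.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    first = rng.randrange(len(points))  # assumed: arbitrary starting point
    chosen = [first]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Add the point that is farthest from everything chosen so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        # Update nearest-chosen distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


# Toy data: a dense cluster of 50 points plus two isolated "rare" points.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(50)]
points += [(100.0, 100.0), (-100.0, 50.0)]  # indices 50 and 51

sketch = farthest_first_traversal(points, k=3)
```

In this toy example a sketch of only three points still contains both isolated points, because each step selects whichever point is farthest from the current sketch. Uniform random sampling of three points would most likely return only points from the dense cluster.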
Here we provide sketches of ten of the largest publicly available single-cell datasets, produced with the help of Brian Hie. Each download contains a directory with two files: a sketched dataset of 20,000 cells and a table of gene labels from the original study. The sketches allow rapid analysis of these datasets without the heavy computational burden of processing the full data, and without losing rare cell populations.
Original Study Link | Original Data Accession | Sketched Data |
Cao Mouse Embryo | GSE119945 | Download |
Moffitt Mouse Brain | GSE113576 | Download |
Saunders Mouse Brain | GSE116470 | Download |
Zeisel Mouse Brain | SRP135960 | Download |
Han Mouse Organs | GSE108097 | Download |
Smillie Human Colon | SCP259 | Download |
Kanton Cerebral Organoids | E-MTAB-7552 | Download |
Popescu Human Organs | E-MTAB-7407 | Download |
Guo Mouse Culture | GSE103221 | Download |
Pijuan-Sala Mouse Embryo | E-MTAB-6967 | Download |