
Single Cell Sequencing Analysis Project

Jennefer, Claudea; Kim, Wendy; Tsai, Gordon; Villouta, Catalina

Project Scope

In this project we work directly with publicly available single-cell RNA-seq data, with the aim of classifying Mus musculus (house mouse) cells by the organ they came from. Given limited computational resources, we decided to work with kidney and liver cells only. However, the approach presented here generalizes to as many organs as needed.

Project Goals

  • Implement an autoencoder to find a low-dimensional latent representation of the cells.

  • Show that this latent representation is more useful than one obtained with PCA.

  • Implement a model built on top of the encoder for classifying cells into kidney or liver.

  • Obtain high out-of-sample performance for the classifier.
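
The goals above can be sketched end to end on toy data. This is only a minimal illustration, not the project's actual model: the synthetic "cells", the single tanh encoder layer, the Adam settings, the logistic-regression classifier, and every function name are assumptions made for the example. It uses NumPy only, so it runs without a deep learning framework.

```python
import numpy as np

def make_toy_cells(n_per_class=200, n_genes=50, seed=0):
    """Synthetic stand-in for normalized expression (NOT Tabula Muris data)."""
    rng = np.random.default_rng(seed)
    shift = rng.normal(0.0, 1.0, n_genes)
    kidney = rng.normal(0.0, 1.0, (n_per_class, n_genes)) + shift
    liver = rng.normal(0.0, 1.0, (n_per_class, n_genes)) - shift
    X = np.vstack([kidney, liver])
    y = np.repeat([0.0, 1.0], n_per_class)  # 0 = kidney, 1 = liver
    return X, y

def encode(X, p):
    """Encoder: one tanh layer mapping genes to the latent space."""
    return np.tanh(X @ p["W1"] + p["b1"])

def train_autoencoder(X, latent_dim=2, steps=500, lr=0.01, seed=0):
    """Fit a tanh encoder and linear decoder on squared reconstruction error (Adam)."""
    rng = np.random.default_rng(seed)
    n, g = X.shape
    p = {"W1": rng.normal(0, 0.1, (g, latent_dim)), "b1": np.zeros(latent_dim),
         "W2": rng.normal(0, 0.1, (latent_dim, g)), "b2": np.zeros(g)}
    m = {k: np.zeros_like(w) for k, w in p.items()}
    v = {k: np.zeros_like(w) for k, w in p.items()}
    losses = []
    for t in range(1, steps + 1):
        z = encode(X, p)
        err = z @ p["W2"] + p["b2"] - X           # reconstruction minus input
        losses.append(0.5 * np.sum(err ** 2) / n)
        d_out = err / n                            # gradient w.r.t. reconstruction
        d_pre = (d_out @ p["W2"].T) * (1.0 - z ** 2)  # backprop through tanh
        grads = {"W1": X.T @ d_pre, "b1": d_pre.sum(axis=0),
                 "W2": z.T @ d_out, "b2": d_out.sum(axis=0)}
        for k in p:                                # Adam update with bias correction
            m[k] = 0.9 * m[k] + 0.1 * grads[k]
            v[k] = 0.999 * v[k] + 0.001 * grads[k] ** 2
            step = (m[k] / (1 - 0.9 ** t)) / (np.sqrt(v[k] / (1 - 0.999 ** t)) + 1e-8)
            p[k] -= lr * step
    return p, losses

def pca_project(X_train, X_test, n_components=2):
    """PCA baseline: project onto the top principal axes of the training set."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    comps = Vt[:n_components].T
    return (X_train - mean) @ comps, (X_test - mean) @ comps

def fit_logistic(Z, y, steps=500, lr=0.5):
    """Logistic regression on standardized features; returns a predict function."""
    mu, sd = Z.mean(axis=0), Z.std(axis=0) + 1e-8
    Zs = (Z - mu) / sd
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        prob = 1.0 / (1.0 + np.exp(-(Zs @ w + b)))
        w -= lr * Zs.T @ (prob - y) / len(y)
        b -= lr * np.mean(prob - y)
    return lambda Z_new: ((((Z_new - mu) / sd) @ w + b) > 0).astype(float)

# Hold out 100 of the 400 toy cells for out-of-sample evaluation.
X, y = make_toy_cells()
idx = np.random.default_rng(1).permutation(len(y))
train, test = idx[:300], idx[300:]

params, losses = train_autoencoder(X[train])
predict_ae = fit_logistic(encode(X[train], params), y[train])
acc_ae = np.mean(predict_ae(encode(X[test], params)) == y[test])

P_train, P_test = pca_project(X[train], X[test])
acc_pca = np.mean(fit_logistic(P_train, y[train])(P_test) == y[test])

print(f"reconstruction loss: {losses[0]:.2f} -> {losses[-1]:.2f}")
print(f"held-out accuracy  autoencoder: {acc_ae:.2f}  PCA: {acc_pca:.2f}")
```

Because the two toy classes are well separated, this comparison says nothing about how the autoencoder and PCA compare on real expression data. It only shows the shape of the pipeline: learn a latent code, reuse the encoder as the feature extractor for a classifier, and score both representations on held-out cells.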

Reference

We obtained the data from the Tabula Muris project, released in 2017 by the Chan Zuckerberg Biohub. All gene-cell count matrices and metadata are available as CSV files on Figshare. Specifically, we used the kidney and liver data from the FACS-based full-length transcript analysis released in 2018.

  • Consortium, Tabula Muris; Webber, James; Batson, Joshua; Pisco, Angela (2018): Single-cell RNA-seq data from Smart-seq2 sequencing of FACS sorted cells (v2). figshare. Dataset. DOI
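
As a rough sketch of how a count matrix like these could be read and prepared, the snippet below parses a small in-memory CSV. It assumes a layout with genes in rows and cells in columns, which should be checked against the downloaded files. The log-normalization step, the scale factor, and the function names are illustrative assumptions, not a description of this project's preprocessing.

```python
import csv
import io
import numpy as np

def load_counts(handle):
    """Read a genes-by-cells count CSV; return (genes, cells, cells-by-genes matrix)."""
    reader = csv.reader(handle)
    header = next(reader)
    cells = header[1:]              # first header field labels the gene column
    genes, rows = [], []
    for row in reader:
        genes.append(row[0])
        rows.append([float(x) for x in row[1:]])
    counts = np.array(rows).T       # transpose so that each row is one cell
    return genes, cells, counts

def log_normalize(counts, scale=1e4):
    """Scale each cell to a common total count, then apply log(1 + x)."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0       # avoid dividing by zero for empty cells
    return np.log1p(counts / totals * scale)

# Toy stand-in for a real file; with a download one would pass an open file
# handle instead (the local file name is up to the user).
toy_csv = io.StringIO(
    ",cell_A,cell_B,cell_C\n"
    "Gene1,0,3,1\n"
    "Gene2,5,0,1\n"
)
genes, cells, counts = load_counts(toy_csv)
X = log_normalize(counts)
print(counts.shape, X.shape)
```

For the full-size matrices, a dataframe library would likely be faster than the `csv` module; the standard-library reader is used here only to keep the example dependency-light.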