Converting Parse Biosciences Evercode™ data to be compatible with Cellenics® analysis software
Update: January 2024
Good news!! Parse Biosciences Evercode WT data can now be uploaded directly to Cellenics® through the data upload feature in the Data Management module - just select ‘Parse Evercode WT’ from the dropdown menu and drag and drop your files into the platform!
Happy analyzing :)
Introduction
At the time of writing, the output format of Parse Biosciences Evercode™ data was not directly compatible with Cellenics®, the open-source tool for secondary analysis and visualization of single-cell RNA-seq data.
The Parse Biosciences output typically consists of three files:
all_genes.csv. CSV file containing gene IDs and gene names.
cell_metadata.csv. CSV file containing barcodes and the associated metadata.
DGE.mtx. MatrixMarket file containing the count matrix.
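Before converting, it can help to confirm that the three files are present and agree with each other. The sketch below is one way to do that, not part of the Parse Biosciences or Cellenics® tooling. The function name `check_parse_dir` is hypothetical. It assumes each CSV has exactly one header line and ends with a newline, and it accepts the matrix in either orientation (cells by genes or genes by cells).

```shell
# Hypothetical helper: check that a Parse output directory holds the three
# expected files and that their dimensions are consistent.
# Usage: check_parse_dir path/to/parse/data/directory
check_parse_dir() {
  dir=$1
  for f in all_genes.csv cell_metadata.csv DGE.mtx; do
    [ -f "$dir/$f" ] || { echo "missing: $f" >&2; return 1; }
  done

  # Data rows in each CSV: total lines minus the single header line.
  genes=$(( $(wc -l < "$dir/all_genes.csv") - 1 ))
  cells=$(( $(wc -l < "$dir/cell_metadata.csv") - 1 ))

  # The first non-comment line of a MatrixMarket file holds
  # "rows columns non-zero-entries".
  rows=$(awk '!/^%/ { print $1; exit }' "$dir/DGE.mtx")
  cols=$(awk '!/^%/ { print $2; exit }' "$dir/DGE.mtx")

  # Accept either orientation of the matrix.
  if { [ "$rows" -eq "$cells" ] && [ "$cols" -eq "$genes" ]; } ||
     { [ "$rows" -eq "$genes" ] && [ "$cols" -eq "$cells" ]; }; then
    echo "ok: $cells cells, $genes genes"
  else
    echo "mismatch: matrix is $rows x $cols, CSVs list $cells cells and $genes genes" >&2
    return 1
  fi
}
```

A mismatch usually points to a truncated download or to files taken from different pipeline runs.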
Can you use Cellenics® for downstream analysis of this data type? Yes, you can! But first the files need to be converted into a compatible format…
In this blog post we explain how to convert the above three files into a format that can be directly uploaded to the Biomage-hosted community instance of Cellenics® that’s available at https://scp.biomage.net/.
Checking the file
Converting outputs of the Evercode™ platform into a Cellenics®-compatible format is quite straightforward if the data contains only a single sample. However, it involves a bit more work if the data contains multiple samples (multiplexed). You can check whether the data is multiplexed by running the following command in your terminal:
cat cell_metadata.csv | tail -n +2 | cut -d, -f 3 | sort | uniq
The command prints the names of the samples in the data, one per line, taking them from the third column of cell_metadata.csv. A single-sample dataset prints one name, and a dataset with two samples prints two.
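The command above relies on the sample name being in the third column. If your file orders its columns differently, a header-based lookup may be more robust. The following is a minimal sketch, assuming the column header is literally `sample` and that no field contains a quoted comma. The function name `count_cells_per_sample` and the toy file contents are made up for illustration.

```shell
# Count cells per sample, locating the sample column by its header name
# instead of by position. Prints nothing if no "sample" header is found.
count_cells_per_sample() {
  awk -F, '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "sample") col = i; next }
    col     { n[$col]++ }
    END     { for (s in n) print s "," n[s] }
  ' "$1" | sort
}

# Toy metadata file for illustration only (made-up barcodes and samples).
cat > toy_cell_metadata.csv <<'EOF'
bc_wells,species,sample,gene_count
01_01_01,hg38,sample_A,1200
01_01_02,hg38,sample_B,950
01_02_07,hg38,sample_A,1810
EOF

count_cells_per_sample toy_cell_metadata.csv
# sample_A,2
# sample_B,1
```

More than one line of output means the data is multiplexed. The per-sample counts also give a quick check that no sample is unexpectedly small.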
Converting single-sample data
If the dataset contains a single sample, you can use a combination of the Seurat::ReadParseBio() and DropletUtils::write10xCounts() functions, as illustrated below.
The script writes files in the ‘10x file format’ to `output_dir`. These ‘10x files’ can then be directly uploaded to the Biomage-hosted community instance of Cellenics®.
library(Seurat)
library(DropletUtils)
input_dir <- "path/to/parse/data/directory"
output_dir <- "path/to/output/dir"
# Read in the count matrix as a dgCMatrix
counts <- Seurat::ReadParseBio(input_dir)
# Read in cell metadata, if available
cell_meta <- read.csv(file.path(input_dir, "cell_metadata.csv"), row.names = 1)
# Create Seurat object.
# If you do not have metadata, remove the `meta.data` option from the function call.
seurat_obj <- CreateSeuratObject(counts, meta.data = cell_meta)
# Write to 10x file format (output_dir must not already exist).
# GetAssayData() is used instead of direct slot access, which differs between Seurat versions.
DropletUtils::write10xCounts(output_dir, GetAssayData(seurat_obj, assay = "RNA", slot = "counts"), version = "3")
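After the R script above has run, a quick look at the output directory can catch problems before uploading. The sketch below assumes that write10xCounts() with version="3" produces the gzipped files `barcodes.tsv.gz`, `features.tsv.gz` and `matrix.mtx.gz`. The function name `check_10x_dir` is hypothetical.

```shell
# Hypothetical helper: confirm the three 10x files exist and report how many
# barcodes (cells) and features (genes) they list.
# Usage: check_10x_dir path/to/output/dir
check_10x_dir() {
  dir=$1
  for f in barcodes.tsv.gz features.tsv.gz matrix.mtx.gz; do
    [ -f "$dir/$f" ] || { echo "missing: $f" >&2; return 1; }
  done
  # These files have no header, so each line is one barcode or one feature.
  cells=$(gzip -dc "$dir/barcodes.tsv.gz" | wc -l)
  genes=$(gzip -dc "$dir/features.tsv.gz" | wc -l)
  echo "ok: $((cells)) barcodes, $((genes)) features"
}
```

The barcode count should match the number of data rows in cell_metadata.csv, and the feature count should match all_genes.csv.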
Converting multi-sample/multiplexed data
If the data is multiplexed, the conversion process is a little bit more involved. You can read the Parse Biosciences output data into a Seurat object using the Seurat::ReadParseBio function, and then follow the instructions in the blog post: How to demultiplex a Seurat object and convert it to 10X files.
library(Seurat)
input_dir <- "path/to/parse/data/directory"
# Read in the count matrix as a dgCMatrix
counts <- Seurat::ReadParseBio(input_dir)
# Read in cell metadata, if available
cell_meta <- read.csv(file.path(input_dir, "cell_metadata.csv"), row.names = 1)
# Create Seurat object
# If you do not have metadata, remove the `meta.data` option from the function call.
seurat_obj <- CreateSeuratObject(counts, meta.data = cell_meta)
## Continue with the steps in the blog post
Conclusion
Converting Parse Biosciences output to ‘10x files’ is straightforward, and the converted files can be uploaded to the Biomage-hosted community instance of Cellenics® for secondary data analysis and visualization.
If you face difficulties in converting your data, reach out to us via the community forum (https://community.biomage.net/) to discuss our bioinformatics services.
If you’d like to learn more about scRNAseq data analysis, check out this blog post about our online course: Unlock the Secrets of Single-Cell RNA-Seq Data Analysis with Our Comprehensive Course