COMPOSITE data preparation tutorial
library(Seurat)
library(Signac)
library(Matrix)

This tutorial demonstrates how to prepare each modality of input data for COMPOSITE, starting from a Seurat object. The provided ‘demo_data.rdata’ is a downsampled Seurat object that contains single-cell multiomics data with three modalities: RNA, ADT, and ATAC (peaks).

Note that the ATAC data is not provided in the demo data, because the “GeneActivity” function requires fragment files, which are too large to be included in the demo data. The ATAC-related code below will therefore not run on the demo dataset, but it should work on your own Seurat object if it contains a “peaks” assay generated by following the instructions in the Signac tutorial.

# Load a downsampled peripheral blood DOGMA-seq dataset
load("demo_data.rdata")

The RNA and ADT raw counts are extracted directly from the Seurat object and saved as .mtx files, which are ready to use as input for COMPOSITE. The ATAC peaks require an additional step: gene activity is first inferred from the ATAC data using the GeneActivity function from the Signac package, and the inferred gene activity matrix is then saved as a .mtx file for input into COMPOSITE.
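The export step described above can be tried on a toy matrix before running it on real data. The sketch below is illustrative only: the matrix, the feature and cell names, and the `toy_*` file names are made up for this example. It shows that `writeMM` stores only the dimensions and the non-zero entries, so row and column names have to be saved separately if they are needed later.

```r
library(Matrix)

# Toy sparse count matrix: 3 features x 4 cells (hypothetical names)
counts <- sparseMatrix(
  i = c(1, 2, 3, 1),
  j = c(1, 2, 3, 4),
  x = c(5, 2, 7, 1),
  dims = c(3, 4),
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

out_dir <- tempdir()
mtx_path <- file.path(out_dir, "toy.mtx")

# writeMM keeps the dimensions and non-zero values, but not the dimnames
writeMM(counts, file = mtx_path)

# Save feature and cell names separately so they can be restored later
writeLines(rownames(counts), file.path(out_dir, "toy_features.txt"))
writeLines(colnames(counts), file.path(out_dir, "toy_barcodes.txt"))

# Read the matrix back and reattach the names
restored <- readMM(mtx_path)
dimnames(restored) <- list(
  readLines(file.path(out_dir, "toy_features.txt")),
  readLines(file.path(out_dir, "toy_barcodes.txt"))
)
```

Whether the feature and barcode files are needed depends on what you plan to do with the results; they are not part of the COMPOSITE input described in this tutorial.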

Preparing data for the Shiny app

To prepare the data for the Shiny app, we recommend that users pre-filter the RNA and ATAC features locally using the following code.

# directly extract RNA and ADT
rna = demo_data@assays$RNA@counts
adt = demo_data@assays$ADT@counts
# infer gene activity from ATAC peaks
DefaultAssay(demo_data) <- "peaks"
inferred_gene_activity = GeneActivity(demo_data)

# keep the 500 RNA features with the highest mean counts as candidate stable features
# (head() avoids NA indices when fewer than 500 features are available)
row_means <- rowMeans(rna)
top_rows <- head(order(row_means, decreasing = TRUE), 500)
rna <- rna[top_rows, ]
# keep the 500 ATAC gene activity features with the highest mean values as candidate stable features
row_means <- rowMeans(inferred_gene_activity)
top_rows <- head(order(row_means, decreasing = TRUE), 500)
inferred_gene_activity <- inferred_gene_activity[top_rows, ]
# write the filtered matrices to Matrix Market (.mtx) files
writeMM(rna, file = 'RNA.mtx')
writeMM(adt, file = 'ADT.mtx')
writeMM(inferred_gene_activity, file = 'ATAC.mtx')
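The pre-filtering above applies the same three steps to RNA and to gene activity, so it can be wrapped in a small helper. The sketch below is one possible way to do that: `top_mean_features` is a hypothetical name, not part of COMPOSITE, Seurat, or Signac, and the toy matrix is made up for illustration.

```r
library(Matrix)

# Hypothetical helper: keep the n features (rows) with the highest mean value
top_mean_features <- function(mat, n = 500) {
  n <- min(n, nrow(mat))  # guard against matrices with fewer than n rows
  keep <- order(Matrix::rowMeans(mat), decreasing = TRUE)[seq_len(n)]
  mat[keep, , drop = FALSE]
}

# Toy matrix: 4 features x 3 cells, with row means 1/3, 5, 1 and 3
toy <- Matrix(c(0, 0, 1,
                4, 5, 6,
                1, 1, 1,
                9, 0, 0),
              nrow = 4, byrow = TRUE, sparse = TRUE,
              dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

filtered <- top_mean_features(toy, n = 2)
rownames(filtered)  # "gene2" "gene4"
```

With such a helper, the two filtering steps would reduce to `rna <- top_mean_features(rna)` and `inferred_gene_activity <- top_mean_features(inferred_gene_activity)`.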

Preparing data for the Python package

# directly extract RNA and ADT
rna = demo_data@assays$RNA@counts
adt = demo_data@assays$ADT@counts

# infer gene activity from ATAC peaks
DefaultAssay(demo_data) <- "peaks"
inferred_gene_activity = GeneActivity(demo_data)

writeMM(rna, file = 'RNA.mtx')
writeMM(adt, file = 'ADT.mtx')
writeMM(inferred_gene_activity, file = 'ATAC.mtx')
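Because .mtx files do not carry cell barcodes, cells can only be matched across the exported files by column position. A quick check before calling `writeMM` can catch mismatched or reordered cells. The sketch below assumes that all modalities are expected to contain the same cells in the same order (the tutorial does not state this explicitly); `check_same_cells` is a hypothetical helper and the toy matrices are made up for illustration.

```r
library(Matrix)

# Hypothetical helper: stop if the matrices do not share identical cell barcodes
check_same_cells <- function(...) {
  mats <- list(...)
  ref <- colnames(mats[[1]])
  same <- vapply(mats, function(m) identical(colnames(m), ref), logical(1))
  if (!all(same)) stop("Cell barcodes differ between modalities")
  invisible(TRUE)
}

cells <- c("cell1", "cell2", "cell3")
toy_rna <- Matrix(c(1, 0, 2, 0, 3, 4), nrow = 2, sparse = TRUE,
                  dimnames = list(c("g1", "g2"), cells))
toy_adt <- Matrix(c(7, 0, 9), nrow = 1, sparse = TRUE,
                  dimnames = list("CD3", cells))

# Passes silently when the barcodes match in content and order
check_same_cells(toy_rna, toy_adt)
```

On a real object, the call would be `check_same_cells(rna, adt, inferred_gene_activity)`, placed just before the `writeMM` lines.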

We provide readily available demo datasets that can be used directly as input.

RNA.mtx ADT.mtx ATAC.mtx