Single-cell sequencing (scSeq) was named “method of the year” in 2013, and since then continuous innovation and diverse applications have drawn further interest in the technology. Biocompare recently hosted a Bench Tips webinar where scientists shared their experiences and best practices for sample preparation, experimental set-up, and data analysis in scSeq, to help researchers achieve accurate and reproducible results. Here are some key questions the panelists addressed during the webinar.
Which single-cell sequencing technique should you use?
“After determining that scSeq is indeed what you need to answer the biological question at hand, one needs to figure out what type of single-cell assay to use and how to analyze the data obtained,” says Maria Monberg, Senior Doctoral Student in the Laboratory of Dr. Anirban Maitra at The University of Texas MD Anderson Cancer Center. “Figuring out the proper study design and how to analyze the data are critical.” Sometimes this requires taking into account the resources you have on hand in terms of budget, personnel, and equipment. There are also many QC requirements to be aware of before embarking on an scSeq experiment, as the experiments tend to be costly as well as time- and labor-intensive. “Some people think of single-cell experiments as a fishing tool,” says Monberg. “However, it can be used to investigate and validate hypotheses.”
Do you need to use complementary techniques to validate single-cell data?
scSeq typically refers to RNA sequencing. However, there are other multi-modal and integrative methods for single-cell analyses that measure DNA methylation, histone modifications, chromatin accessibility, spatial mapping, and more. “Just because you see something at the scRNAseq level doesn’t mean that it will hold true at other levels of regulation,” says Monberg. “Hence, you always need some secondary or tertiary dataset to validate your RNA-based findings.” She uses scRNAseq and multiplex immunofluorescence (mIF) to spatially correlate what is seen at the RNA and protein levels. “For some of my experiments, using different dissociation protocols from the same cell population, I will perform whole genome scRNAseq and use the isolated nuclei for ATAC sequencing. Then I will merge the different datasets to help with orthogonal validation of the sequencing experiments.”
Harm Wessels, Ph.D., Post-doctoral Fellow in the Laboratory of Dr. Neville Sanjana and Dr. Rahul Satija in the Center for Genomics and Systems Biology at New York University, uses pooled CRISPR screening along with scRNAseq to enable high-content phenotyping for functional genomic studies. “Using single-cell sequencing allows the capture of cells that contribute to a certain phenotype that is observed,” says Wessels. Their laboratory has developed a new technique (CaRPoolSeq) that uses RNA-targeting Cas13 to perform Perturb-seq.
How do you work with challenging samples?
Monberg works with pancreatic cancer samples that have very low viability and are notorious for auto-digesting when taken out of the human body. “Despite having challenging samples, thoughtful sample selection and sample prep optimization can make a big difference on data quality,” she says. Depending on the scientific question that needs to be addressed, the source of the sample also becomes important. Sample-prep protocols vary depending on whether the sample is from a fresh tissue, frozen tissue, organoid, cell line, or animal model. Some samples that are flash frozen or formalin-fixed paraffin-embedded (FFPE) cannot be used for scSeq experiments or the protocols have to be modified to analyze these samples. Hence, having information about the sample source and storage is critical for experimental planning and downstream analysis. “We have done studies that show the quality of data from flash frozen samples being much lower than from fresh tissues,” says Monberg. “Some of this poor quality can be addressed by using many replicates of the sample for the study.”
“Knowing the biological source of the sample is also very important for pooling samples for multiplexed experiments,” says Francisco Galdos, Senior M.D.-Ph.D. Candidate in the Laboratory of Dr. Sean Wu at Stanford University School of Medicine. Although it may seem counterintuitive, running a simple pilot experiment with a few samples can help save costs later. “Oftentimes, people try to do too much, too early,” says Galdos. “It’s important to get familiar with the technology you are using and get good at the protocols before moving to more complex designs. Many times, the preparation of the sample takes longer than the capture of the cells.” Incorrect labeling, cross-contamination during pooling, insufficient time for staining, not using species-specific antibodies, and inadequate sample washing are all factors that could lead to inaccuracies, irrespective of the sample type or source.
How many cells per sample and how many sample replicates do you need to capture biological and technical variability?
Finding the right number of sample replicates to run and knowing how many cells or nuclei to capture per sample is important to minimize data variability. According to Monberg, as long as you have good cell viability and good sample prep optimization going into the experiment, you can run anywhere from 2000–7000 cells per sample and anywhere from 4–6 libraries for a simple scSeq experiment. “You also need to know the minimum sequencing depth needed to have ‘usable’ data.” Wessels uses about 20,000 cells per lane and 100 cells per perturbation. This works out to approximately 200 perturbed genes per lane, with 3–4 single guide RNAs (sgRNAs) per gene, which contributes to the number of replicates needed. Negative controls make up 2–5% of the library. Their lab website has a calculator that people can use to estimate the cost of sequencing.
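The arithmetic behind Wessels' numbers can be sketched in a few lines of Python. This is only an illustration, not the lab's calculator: the function names are hypothetical, and it assumes that one "perturbation" corresponds to one targeted gene and that the 2–5% negative-control share is taken over the total number of sgRNAs in the library.

```python
def screen_capacity(cells_per_lane, cells_per_perturbation, sgrnas_per_gene):
    """Return (perturbed genes per lane, average cells per sgRNA).

    Assumption: one perturbation == one targeted gene.
    """
    genes_per_lane = cells_per_lane // cells_per_perturbation
    cells_per_sgrna = cells_per_perturbation / sgrnas_per_gene
    return genes_per_lane, cells_per_sgrna


def negative_control_range(library_size):
    """Return (low, high) sgRNA counts for a 2-5% negative-control share."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return library_size * 2 // 100, library_size * 5 // 100


genes, cells_per_guide = screen_capacity(20_000, 100, 4)
library_size = genes * 4  # total sgRNAs at 4 guides per gene
print(genes, cells_per_guide, negative_control_range(library_size))
```

With 20,000 cells per lane, 100 cells per perturbation, and 4 sgRNAs per gene, each guide is represented by about 25 cells on average, which is why the number of guides per gene bears on how many replicates are needed.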
How do you know if there are any batch effects in your data?
Ideally, all the experiments should be done at the same time, with the same reagents, and processed in the same way. Analyzing samples independently can add variability to the analysis. “Running sequencing experiments for independent samples can be very expensive and can range from $2000–$5000 for each run,” says Galdos. Operational and personnel-related variation can also arise during sample handling and preparation and lead to higher technical noise.
Galdos warns that you have to constantly think about where the batch effect will come from. “Different cell lines or individuals will generate batch effects that cannot be avoided.” However, he says that while multiplexing can help minimize some of the batch effects, it cannot avoid the ones that are intrinsic to the sample. Hence, if you are running time-course experiments it may be better to pool samples across various time points from one individual, as you cannot avoid batch effects from different individuals.
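As a minimal sketch of the pooling strategy Galdos describes, the snippet below groups time-course samples so that each multiplexed pool holds all time points from a single individual. The sample labels and the function name are hypothetical and serve only as an illustration of the design, not as a prescribed workflow.

```python
from collections import defaultdict


def pools_by_individual(samples):
    """Group (individual, time_point) pairs into one pool per individual."""
    pools = defaultdict(list)
    for individual, time_point in samples:
        pools[individual].append(time_point)
    # Sort time points so each pool lists them in a stable order.
    return {individual: sorted(points) for individual, points in pools.items()}


# Hypothetical time-course samples from two donors.
samples = [
    ("donor_A", "day0"),
    ("donor_B", "day0"),
    ("donor_A", "day7"),
    ("donor_B", "day7"),
]
pools = pools_by_individual(samples)
print(pools)
```

Keeping each individual's time points in the same pool means comparisons across time are made within one run, while differences between individuals remain as the batch effect that cannot be designed away.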
As Monberg says, you can’t always design the perfect single-cell experiment. However, being aware of where things can go wrong and taking advantage of the many advancements in sample and library preparation, as well as in analytical tools and bioinformatic algorithms, can certainly help avoid or minimize errors.
Resources
HTAN publication
Single Cell Analysis Review
scRNAseq QC guide
Practical guide to scRNAseq experimental design
Seurat
CITE-Seq
Music
scMC
Slide-seq
Perturb-seq
scRNA-seq in PDAC
Single Cell isolation techniques for scRNA
MULTI-Seq Paper
CITE-Seq Paper
Cell hashing informatic pipeline