Benchmarking doublet detection methods
Simulation datasets
To simulate doublets for benchmarking, we randomly selected pairs of cells that singletCode had identified as true singlets and averaged the gene expression counts of the two cells in each pair to generate a simulated doublet. We used these simulated doublets to create datasets with a range of doublet percentages for benchmarking.
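The averaging step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `simulate_doublets` and `n_doublets_for_rate` are hypothetical, the counts are made-up toy values, and the sketch assumes that the doublet percentage is measured against the final dataset (singlets plus simulated doublets) and that the source singlets are kept. The actual pipeline may handle these details differently.

```python
import random

def simulate_doublets(singlet_counts, n_doublets, seed=0):
    """Average the counts of two randomly chosen singlets per simulated doublet."""
    rng = random.Random(seed)
    doublets = []
    for _ in range(n_doublets):
        # Pick two distinct singlets (by row index) at random.
        i, j = rng.sample(range(len(singlet_counts)), 2)
        a, b = singlet_counts[i], singlet_counts[j]
        doublets.append([(x + y) / 2 for x, y in zip(a, b)])
    return doublets

def n_doublets_for_rate(n_singlets, doublet_rate):
    """Doublets to add so they make up `doublet_rate` of the final dataset (assumption)."""
    return round(doublet_rate * n_singlets / (1 - doublet_rate))

# Toy data: 4 singlets x 3 genes (made-up counts, not from the study).
singlets = [[10, 0, 2], [0, 8, 4], [6, 6, 0], [2, 2, 2]]
n_new = n_doublets_for_rate(len(singlets), 0.2)
doublets = simulate_doublets(singlets, n_new)
dataset = singlets + doublets
```

Fixing the seed keeps the simulated datasets reproducible, which matters when several methods are compared on the same doublets.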
Benchmarking
We used these datasets to benchmark four doublet detection methods.
For each of the four methods, we evaluated the area under the precision-recall curve (AUPRC), the area under the ROC curve (AUROC), the true negative rate (TNR), and the doublet scores and calls. All four methods performed worse than expected. A plot of the AUPRC values is shown below.
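For readers unfamiliar with the two score-based metrics, the following is a minimal sketch of how AUROC and AUPRC can be computed from ground-truth labels and per-cell doublet scores. It is not the evaluation code used for the benchmark: the function names are hypothetical, the labels and scores are toy values, AUROC is computed with the pairwise rank formulation, and AUPRC is estimated as average precision, which is one common estimator among several.

```python
def auroc(labels, scores):
    """Probability that a random doublet scores higher than a random singlet."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a doublet and a singlet count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision: mean of the precision at each true doublet in the ranking."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy example: 1 = true doublet, 0 = true singlet (made-up scores).
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
print(auroc(labels, scores))              # 0.875
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2
```

Because doublets are usually a small minority of cells, AUPRC tends to be the more informative of the two: a method can reach a high AUROC while still ranking many singlets above true doublets.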
We examined how consistently different doublet detection methods label doublets by introducing a ‘similarity score’, a measure of the fraction of doublets identically classified by two methods. The average similarity score across all datasets and methods was 0.66, indicating variability in doublet calls between methods.
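One possible reading of the similarity score is sketched below. It assumes that "doublets" refers to the ground-truth doublets in a dataset and that two methods classify a doublet identically when both call it a doublet or both miss it. The exact definition used in the paper may differ, and the function name and cell identifiers are hypothetical.

```python
def similarity_score(true_doublets, calls_a, calls_b):
    """Fraction of ground-truth doublets that two methods classify the same way.

    `calls_a` and `calls_b` are the sets of cells each method called doublets.
    """
    same = sum(
        1 for cell in true_doublets
        if (cell in calls_a) == (cell in calls_b)
    )
    return same / len(true_doublets)

# Toy example with made-up cell identifiers.
true_doublets = {"d1", "d2", "d3", "d4"}
method_a = {"d1", "d2", "s1"}   # s1 is a singlet wrongly called a doublet
method_b = {"d1", "d3"}

print(similarity_score(true_doublets, method_a, method_b))  # 0.5
```

Here the two methods agree on d1 (both detect it) and d4 (both miss it) but disagree on d2 and d3, so half of the doublets are identically classified.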
We further evaluated ensemble doublet detection methods (hybrid, Chord) and compared doublet detection across sequencing technologies (10X Genomics, Smart-seq3). For a more detailed evaluation of our results, refer to our paper.
Heterogeneity effects
We asked whether the heterogeneity of a dataset affects the performance of doublet detection methods. Because heterogeneity can be affected by many properties of a dataset, such as experimental design and data processing, we based our conclusions on heterogeneity within a sample. To do this, we subsampled singlets and doublets either within a single PC cluster of a sample (less heterogeneous, lower Euclidean distance) or across all clusters of the sample (more heterogeneous, higher Euclidean distance).
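The contrast between the two subsampling schemes can be illustrated with a small sketch. It assumes that heterogeneity is summarized as the mean pairwise Euclidean distance between cells in PC space, which is an assumption made for this example and not necessarily the measure used in the analysis. The helper names, cluster labels, and PC coordinates are all hypothetical toy values.

```python
import math
import random
from itertools import combinations

def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of points (assumed heterogeneity proxy)."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def subsample(cells, n, cluster=None, seed=0):
    """Draw n cells at random, optionally restricted to a single cluster."""
    pool = [c for c in cells if cluster is None or c["cluster"] == cluster]
    return random.Random(seed).sample(pool, n)

# Toy sample: two well-separated clusters in a 2-D PC space (made-up values).
cells = [
    {"pcs": (0.0, 0.0), "cluster": 0},
    {"pcs": (0.2, 0.1), "cluster": 0},
    {"pcs": (0.1, 0.3), "cluster": 0},
    {"pcs": (5.0, 5.0), "cluster": 1},
    {"pcs": (5.2, 4.9), "cluster": 1},
    {"pcs": (4.9, 5.3), "cluster": 1},
]

within = subsample(cells, 3, cluster=0)   # less heterogeneous subsample
across = subsample(cells, 4)              # drawn across all clusters

low = mean_pairwise_distance([c["pcs"] for c in within])
high = mean_pairwise_distance([c["pcs"] for c in across])
print(low, high)
```

In this toy sample, any four cells drawn across clusters must include cells from both clusters, so the across-cluster subsample always has the larger mean distance.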
singletCode for scATAC-seq doublet detection
As a proof of concept, we used AMULET to test whether singletCode can assess doublet detection in scATAC-seq datasets. We generated Watermelon-barcoded 10X Genomics Multiome datasets and applied AMULET to the scATAC-seq fragments to classify cells as singlets or doublets. In parallel, we identified true singlets in a barcoded scRNA-seq library from the same cells. By comparing AMULET’s calls with the singletCode true singlets across six datasets, we counted true negatives and false positives and calculated an average true negative rate (TNR) of 0.924 for AMULET. Our results demonstrate that singletCode can be used to benchmark doublet detection in modalities other than scRNA-seq.
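Because singletCode provides ground-truth singlets only, the TNR can be computed directly from barcode sets: a true negative is a true singlet that the method did not call a doublet, and a false positive is a true singlet that it did. The sketch below illustrates this under the assumption that cell barcodes are shared between the two modalities. The function name, dataset names, and barcodes are hypothetical toy values, not the study's data.

```python
def tnr_from_barcodes(true_singlets, called_doublets):
    """TNR = true negatives / (true negatives + false positives) over true singlets."""
    false_pos = len(true_singlets & called_doublets)  # singlets called doublets
    true_neg = len(true_singlets) - false_pos
    return true_neg / len(true_singlets)

# Toy datasets: (true singlet barcodes, barcodes the method called doublets).
datasets = {
    "toy_1": ({"AAAC", "AAAG", "AACT", "AAGT"}, {"AAAG", "TTTT"}),
    "toy_2": ({"CCCA", "CCGA"}, set()),
}

per_dataset = {name: tnr_from_barcodes(s, d) for name, (s, d) in datasets.items()}
mean_tnr = sum(per_dataset.values()) / len(per_dataset)
print(per_dataset, mean_tnr)
```

Barcodes called doublets that are not in the true singlet set (such as `TTTT` above) do not enter the calculation, since their ground-truth status is unknown in this setup.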