Could dropping a few cells change the takeaways from differential expression?
arXiv (2023)
Abstract
Differential expression (DE) plays a fundamental role in illuminating the
molecular mechanisms driving a difference between groups (e.g., due to
treatment or disease). While any analysis is run on particular cells/samples,
the intent is to generalize to future occurrences of the treatment or disease.
Implicitly, this step is justified by assuming that present and future samples
are independent and identically distributed from the same population. Though
this assumption is always false, we hope that any deviation from the assumption
is small enough that A) conclusions of the analysis still hold and B) standard
tools like standard error, significance, and power still reflect
generalizability. Conversely, we might worry about these deviations, and
reliance on standard tools, if conclusions could be substantively changed by
dropping a very small fraction of data. While checking every small fraction is
computationally intractable, recent work develops an approximation to identify
when such an influential subset exists. Building on this work, we develop a
metric for dropping-data robustness of DE; namely, we cast the analysis in a
form suitable to the approximation, extend the approximation to models with
data-dependent hyperparameters, and extend the notion of a data point from a
single cell to a pseudobulk observation. We then overcome the inherent
non-differentiability of gene set enrichment analysis to develop an additional
approximation for the robustness of top gene sets. We assess robustness of DE
for published single-cell RNA-seq data and discover that 1000s of genes can
have their results flipped by dropping <1% of the data, including 100s that are
sensitive to dropping a single cell (0.07%). Surprisingly, this non-robustness
extends to high-level takeaways; half of the top 10 gene sets can be changed by
dropping 1-2% of cells, and 2/10 can be changed by dropping a single cell.
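
To make the idea of dropping-data robustness concrete, here is a minimal sketch on a toy problem. It assumes a plain difference in group means rather than the DE models analyzed in the paper, and it uses a first-order (influence-style) approximation to rank which points to drop. The function names `min_drops_to_flip_sign` and `refit_without` are hypothetical and chosen for illustration only.

```python
import numpy as np


def min_drops_to_flip_sign(y_a, y_b):
    """Approximate the fewest points whose removal flips sign(mean(y_a) - mean(y_b)).

    Toy illustration only: a first-order approximation for a difference in
    means, not the DE pipeline described in the abstract.
    Returns (k, indices), where indices refer to the concatenation [y_a, y_b],
    or (None, empty array) if the approximation predicts no flip.
    """
    y_a = np.asarray(y_a, dtype=float)
    y_b = np.asarray(y_b, dtype=float)
    theta = y_a.mean() - y_b.mean()

    # First-order change in theta from dropping each single point:
    # derivative of the weighted mean with respect to that point's weight,
    # multiplied by the weight change (1 -> 0).
    delta_a = -(y_a - y_a.mean()) / y_a.size
    delta_b = (y_b - y_b.mean()) / y_b.size
    delta = np.concatenate([delta_a, delta_b])

    # "push" > 0 means dropping the point moves theta toward zero.
    push = -np.sign(theta) * delta
    order = np.argsort(-push)          # most influential points first
    cumulative = np.cumsum(push[order])

    hits = np.nonzero(cumulative > abs(theta))[0]
    if hits.size == 0:
        return None, np.array([], dtype=int)
    k = int(hits[0]) + 1
    return k, order[:k]


def refit_without(y_a, y_b, dropped):
    """Recompute the difference in means exactly after removing `dropped`."""
    y_a = np.asarray(y_a, dtype=float)
    y_b = np.asarray(y_b, dtype=float)
    keep = np.ones(y_a.size + y_b.size, dtype=bool)
    keep[np.asarray(dropped, dtype=int)] = False
    return y_a[keep[: y_a.size]].mean() - y_b[keep[y_a.size:]].mean()


if __name__ == "__main__":
    # One extreme cell drives the apparent up-regulation in group A.
    group_a = [0.0] * 9 + [10.0]
    group_b = [0.5] * 10
    k, dropped = min_drops_to_flip_sign(group_a, group_b)
    print("points to drop:", k, "indices:", dropped)
    print("estimate after exact refit:", refit_without(group_a, group_b, dropped))
```

The approximation only proposes a candidate subset. Refitting once with those points removed, as `refit_without` does, checks whether the conclusion actually changes without searching over every possible subset.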