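Performs pseudobulk differential expression (DE) analysis for genes/markers in one of two modes: design mode (DE across experimental conditions) or marker mode (one cell population versus another).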
Usage
get.pbDE(x, ...)
# S3 method for class 'TDRObj'
get.pbDE(
  x,
  .source = NULL,
  .mode = NULL,
  .design = NULL,
  .contrasts = NULL,
  .block = NULL,
  .geneset.ls = NULL,
  .id = NULL,
  .id.idx = NULL,
  .id2 = "..all.other.landmarks..",
  .id2.idx = NULL,
  .id.from = NULL,
  .model.name = "default",
  .result.name = NULL,
  .population.name = NULL,
  .comparison.name = NULL,
  .force.recalc = FALSE,
  .verbose = TRUE,
  .label.confidence = 0.5,
  ...
)
Arguments
- x
A TDRObj, Seurat, SingleCellExperiment, or HDF5AnnData (anndataR) object processed through get.map().
- ...
Additional arguments passed to methods.
- .source
The raw data object for non-file backends. NULL (default) for the files backend; otherwise a Seurat, SingleCellExperiment, or anndataR AnnData object. Used by .get_sample_matrix() to retrieve per-sample expression matrices.
- .mode
Character: analysis mode. One of "design" or "marker". If NULL (default), auto-detected from arguments: .design provided \(\Rightarrow\) "design"; .id/.id.idx provided without .design \(\Rightarrow\) "marker".
- .design
Design matrix specifying experimental design (design mode only). Rows = samples, columns = coefficients.
- .contrasts
Optional contrast matrix for specific comparisons (design mode only). Create with limma::makeContrasts(). If NULL, tests all .design coefficients.
- .block
Optional character: column name in .tdr.obj$metadata for blocking factor (design mode only, e.g., "Donor"). Accounts for within-block correlation.
- .geneset.ls
Optional named list of character vectors defining gene sets for GSVA enrichment analysis. Only for RNA data. Example: list("Tcell" = c("CD3D", "CD3E"), "Bcell" = c("CD19", "MS4A1")).
- .id
Optional character vector of cluster/celltype IDs. In design mode, restricts analysis to cells matching these IDs. In marker mode, defines group 1 (test group).
- .id.idx
Optional integer vector specifying landmark indices. In design mode, restricts analysis to cells confidently assigned to these landmarks. In marker mode, defines group 1 landmark indices.
- .id2
Character vector of cluster/celltype IDs for group 2 (marker mode only). The default "..all.other.landmarks.." compares group 1 to all other cells. Supply specific IDs for a pairwise comparison.
- .id2.idx
Optional integer vector specifying landmark indices for group 2 (marker mode only). When provided, takes priority over .id2.
- .id.from
Character: "clustering" or "celltyping". Source of IDs in .id and .id2. Default NULL (resolved to "clustering" when needed).
- .model.name
Character string naming this model fit (default "default").
- .result.name
Character string naming this result. In design mode defaults to "all"; in marker mode auto-generated from .id and .id2. Used as the storage key: $pbDE[[.model.name]][[.result.name]] (design) or $markerDE[[.model.name]][[.result.name]] (marker).
- .population.name
NULL. Deprecated alias for .result.name (design mode).
- .comparison.name
NULL. Deprecated alias for .result.name (marker mode).
- .force.recalc
Logical: if TRUE, overwrite existing results in the specified slot (default FALSE).
- .verbose
Logical: print progress messages? Default TRUE.
- .label.confidence
Numeric scalar in [0,1] controlling the minimum posterior confidence required to assign a cell to a set of target landmarks (used when .id.idx or .id2.idx is provided). Default 0.5.
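As an illustration of how a threshold like .label.confidence could act, the sketch below sums each cell's posterior membership over a set of target landmarks and keeps cells whose summed confidence reaches the threshold. The posterior matrix, the variable names, and the use of >= are assumptions made for this example; the package's internal rule may differ.

```r
# Hypothetical posterior membership matrix: rows = cells, columns = landmarks.
# Each row sums to 1. These values are made up for illustration.
post <- rbind(
  cell1 = c(0.7, 0.2, 0.1, 0.0),
  cell2 = c(0.3, 0.3, 0.2, 0.2),
  cell3 = c(0.1, 0.1, 0.4, 0.4),
  cell4 = c(0.2, 0.2, 0.3, 0.3)
)

target.idx <- c(1L, 2L)    # plays the role of .id.idx
label.confidence <- 0.5    # plays the role of .label.confidence

# Confidence that a cell belongs to ANY of the target landmarks
conf <- rowSums(post[, target.idx, drop = FALSE])

# Assumed rule: keep cells whose confidence meets the threshold
keep <- conf >= label.confidence
selected.cells <- names(keep)[keep]
```

Raising the threshold toward 1 keeps only cells that sit almost entirely on the target landmarks; lowering it admits more ambiguous cells.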
Value
The modified .tdr.obj with results stored depending on mode:
Design mode
Results in .tdr.obj$pbDE[[.model.name]][[.result.name]]:
- coefficients: Log fold change matrix (features x coefficients)
- p.value: P-values (features x coefficients)
- adj.p: FDR-adjusted p-values (features x coefficients)
- smpl.outlier: Logical vector indicating outlier samples
- id.idx: Per-sample list of cell indices used
- n.pseudo: Integer vector of pseudobulk cell counts per sample
- geneset: (RNA + .geneset.ls) GSVA results
Marker mode
Results in .tdr.obj$markerDE[[.model.name]][[.result.name]]:
- coefficients: Log fold changes (features x coefficients)
- p.value: P-values (features x coefficients)
- adj.p: FDR-adjusted p-values (features x coefficients)
- smpl.outlier.1: Logical: samples excluded from group 1
- smpl.outlier.2: Logical: samples excluded from group 2
- id1.idx: Per-sample cell indices for group 1
- id2.idx: Per-sample cell indices for group 2
- n.pseudo1: Pseudobulk cell counts per sample for group 1
- n.pseudo2: Pseudobulk cell counts per sample for group 2
- geneset: (RNA + .geneset.ls) GSVA results
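Given the result layout described above (feature-by-coefficient matrices named coefficients and adj.p), significant features can be pulled out with plain matrix indexing. The sketch below uses a small mock result list in place of a real fit; the feature names, the values, and the 0.05 cutoff are illustrative assumptions.

```r
# Mock result with the same shape as a stored result entry (values invented)
features <- c("CD3D", "MS4A1", "CD19")
res <- list(
  coefficients = matrix(c(1.5, -0.2, -2.1), ncol = 1,
                        dimnames = list(features, "ConditionB")),
  adj.p        = matrix(c(0.01, 0.80, 0.03), ncol = 1,
                        dimnames = list(features, "ConditionB"))
)

coef.name <- "ConditionB"
sig <- res$adj.p[, coef.name] < 0.05   # illustrative FDR cutoff

# Tidy table of significant features, most significant first
top <- data.frame(
  feature = rownames(res$coefficients)[sig],
  logFC   = res$coefficients[sig, coef.name],
  adj.p   = res$adj.p[sig, coef.name],
  row.names = NULL
)
top <- top[order(top$adj.p), ]
```

With a real object, res would be replaced by the stored entry, for example the design-mode result under the model and result names used in the call.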
Details
Design mode (.mode = "design"): Aggregates cells into pseudobulk samples
using fuzzy landmark membership, then uses limma-voom (RNA) or limma (cytometry) to test for
DE across experimental conditions. Uses a user-supplied design matrix with samples as replicates.
Marker mode (.mode = "marker"): Identifies marker genes/proteins distinguishing
one cell population from another (or all others) via a within-sample paired comparison. Unlike
design mode which tests experimental conditions, marker mode compares cell populations to find
defining features.
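The auto-detection rule for .mode given in the Arguments section can be written as a small helper. This is a minimal sketch, not the package's code: resolve_mode() is a hypothetical name, and raising an error when nothing can be inferred is an assumption.

```r
# Hypothetical helper mirroring the documented rule:
#   .design provided                        => "design"
#   .id / .id.idx provided without .design  => "marker"
resolve_mode <- function(.mode = NULL, .design = NULL,
                         .id = NULL, .id.idx = NULL) {
  # An explicit .mode always wins
  if (!is.null(.mode)) {
    return(match.arg(.mode, c("design", "marker")))
  }
  if (!is.null(.design)) {
    return("design")
  }
  if (!is.null(.id) || !is.null(.id.idx)) {
    return("marker")
  }
  # Assumed behaviour when no mode can be inferred
  stop("Cannot infer .mode: supply .design, .id, or .id.idx")
}

mode.a <- resolve_mode(.design = diag(2))               # design matrix given
mode.b <- resolve_mode(.id = "3")                       # IDs only
mode.c <- resolve_mode(.design = diag(2), .id = "3")    # .design takes precedence
```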
Design mode
Tests for DE across experimental conditions using a user-supplied design matrix:
Cell selection: If .id specified, select matching cells. If .id.idx specified, use fuzzy confidence thresholding. Otherwise use all cells.
Pseudobulk aggregation: Fuzzy-weighted expression per sample.
Outlier removal: (RNA only) Exclude samples with <10\
Normalization: TMM + voom (RNA) or as-is (cytometry).
Linear modeling: limma::lmFit with optional blocking.
Empirical Bayes: limma::eBayes(robust = TRUE).
GSVA (optional, RNA only): Gene set variation analysis.
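The normalization and modeling steps for RNA correspond to the standard edgeR/limma recipe, sketched below on simulated pseudobulk counts. This is an illustration under stated assumptions, not the function's internals: the count matrix and metadata are simulated, blocking and outlier removal are omitted, and Benjamini-Hochberg is assumed as the FDR adjustment. It requires the limma and edgeR packages.

```r
library(limma)
library(edgeR)

set.seed(1)
# Simulated pseudobulk counts: 200 features x 6 samples (illustrative only)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
meta <- data.frame(Condition = factor(rep(c("A", "B"), each = 3)),
                   row.names = colnames(counts))

# Design matrix: rows = samples, columns = coefficients
design <- model.matrix(~ Condition, data = meta)

# TMM normalization followed by voom precision weights
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge, method = "TMM")
v   <- voom(dge, design, plot = FALSE)

# Linear model and robust empirical Bayes moderation
fit <- lmFit(v, design)
fit <- eBayes(fit, robust = TRUE)

# Per-coefficient FDR adjustment (BH assumed here)
adj.p <- apply(fit$p.value, 2, p.adjust, method = "BH")
```

The resulting fit$coefficients, fit$p.value, and adj.p are all features x coefficients matrices, matching the shapes listed under Value.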
Marker mode
Compares cell populations via a within-sample paired design (~ .ids + .pairs):
Cell selection: Extract cells for group 1 (.id) and group 2 (.id2).
Pseudobulk aggregation: Independent aggregation per group per sample.
Outlier removal: (RNA only) Cross-group outlier flagging.
Paired comparison: limma::lmFit with sample-as-pair blocking.
Positive logFC = higher in group 1, negative = higher in group 2.
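To make the paired design ~ .ids + .pairs and the sign convention concrete, the sketch below builds the design matrix for three samples that each contribute one pseudobulk profile per group, then fits a single noise-free feature with base R's lm(). The sample names, the group labels, and the choice of group 2 as the reference level are assumptions made for illustration.

```r
samples <- c("s1", "s2", "s3")

# One pseudobulk profile per group per sample; group 2 is the reference level,
# so the .ids coefficient reads as group 1 minus group 2
pb <- data.frame(
  .ids   = factor(rep(c("group2", "group1"), each = 3),
                  levels = c("group2", "group1")),
  .pairs = factor(rep(samples, times = 2))
)

# Paired design: group effect plus one term per sample (the pairing factor)
design <- model.matrix(~ .ids + .pairs, data = pb)

# Toy feature: baseline 5, sample-specific shifts, and +2 in group 1
sample.shift <- c(s1 = 0, s2 = 1, s3 = -1)
y <- 5 + 2 * (pb$.ids == "group1") + sample.shift[as.character(pb$.pairs)]

# The group coefficient recovers the within-sample difference
fit   <- lm(y ~ .ids + .pairs, data = pb)
logfc <- unname(coef(fit)[".idsgroup1"])
```

Because the sample terms absorb between-sample shifts, the group coefficient reflects only the within-sample difference, and it is positive when the feature is higher in group 1.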
See also
get.map (required), plotPbDE, plotMarkerDE,
get.plsD
Examples
if (FALSE) { # \dontrun{
# After mapping
lm.cells <- setup.tdr.obj(.cells = .cells, .meta = .meta) |>
  get.landmarks() |> get.graph() |> get.map()
# --- Design mode (default when .design is provided) ---
design <- model.matrix(~ Condition, data = .meta)
lm.cells <- get.pbDE(lm.cells, .design = design)
plotPbDE(lm.cells, .coefs = "ConditionB")
# DE within specific cell type
lm.cells <- get.pbDE(lm.cells, .design = design,
                     .id = c("1", "2", "3"),
                     .id.from = "clustering",
                     .result.name = "tcells")
# --- Marker mode (auto-detected when .id is provided without .design) ---
lm.cells <- get.pbDE(lm.cells, .mode = "marker",
                     .id = "cluster.3",
                     .result.name = "cluster3_markers")
plotMarkerDE(lm.cells, .comparison.name = "cluster3_markers")
# Pairwise comparison
lm.cells <- get.pbDE(lm.cells, .mode = "marker",
                     .id = c("cluster.1", "cluster.3"),
                     .id2 = c("cluster.2", "cluster.4"),
                     .result.name = "cd4_vs_cd8")
# Access results
lm.cells$pbDE$default$all$adj.p
lm.cells$markerDE$default$cluster3_markers$coefficients
} # }
