Creates and validates the main tinydenseR object structure that will hold expression data, metadata, and analysis results throughout the workflow. This is the required first step before landmark identification.
setup.lm.obj() is deprecated; use setup.tdr.obj() instead.
Usage
setup.tdr.obj(
  .cells,
  .meta,
  .markers = NULL,
  .harmony.var = NULL,
  .assay.type = "cyto",
  .celltype.vec = NULL,
  .verbose = TRUE,
  .seed = 123,
  .prop.landmarks = 0.1,
  .n.threads = if (is.hpc()) {
    max(RhpcBLASctl::blas_get_num_procs(),
        RhpcBLASctl::omp_get_num_procs(),
        RhpcBLASctl::omp_get_max_threads(), na.rm = TRUE)
  } else {
    parallel::detectCores(logical = TRUE)
  }
)
setup.lm.obj(...)
Arguments
- .cells
A named list of file paths (character strings) pointing to RDS files, each containing one expression matrix per sample. For RNA: sparse matrix (dgCMatrix) with genes as rows and cells as columns. For cytometry: matrix with cells as rows and markers as columns. List names become sample identifiers.
- .meta
A data.frame with sample-level metadata. Rownames must match names in .cells exactly. Used for batch correction and downstream modeling.
- .markers
Character vector of marker names to use for landmark identification. Only applicable for .assay.type = "cyto". If NULL, uses all markers from the first sample. Ignored for RNA data (uses HVG selection instead).
- .harmony.var
Character vector of column names from .meta to use for Harmony batch correction. Supported for both "RNA" and "cyto" assay types. For cytometry, Harmony operates on the SVD embedding of the (centered, scaled) marker matrix. If NULL, no batch correction is performed.
- .assay.type
Character string: "cyto" for cytometry (default) or "RNA" for scRNA-seq. Determines normalization strategy and feature selection approach.
- .celltype.vec
Optional named character vector mapping cell IDs to cell type labels.
- .verbose
Logical, whether to print progress messages. Default TRUE.
- .seed
Integer for random seed to ensure reproducibility. Default 123.
- .prop.landmarks
Numeric between 0 and 1 specifying proportion of cells to sample as landmarks. Default 0.1 (10%). Total landmarks capped at about 5000 regardless.
- .n.threads
Integer number of threads for parallel processing. By default, the maximum number of available threads is detected automatically (from BLAS/OpenMP settings on HPC, or detectCores() locally).
- ...
Arguments passed to setup.tdr.obj().
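The input requirements above can be sketched with base R alone. This is a minimal illustration for the cytometry layout (cells as rows, markers as columns). The sample names, marker names, and the Condition column are invented for the example, and the final check only mirrors the documented rule that rownames of .meta must match the names of .cells; it is not the package's internal validation.

```r
# Hypothetical toy inputs: sample names, markers, and metadata are invented.
set.seed(1)
sample.ids <- c("sampleA", "sampleB")
markers <- c("CD3", "CD4", "CD8", "CD19")

# Write one cytometry-style matrix (cells x markers) per sample to an RDS file
# and collect the file paths.
.cells <- lapply(sample.ids, function(id) {
  m <- matrix(rexp(50 * length(markers)), nrow = 50,
              dimnames = list(NULL, markers))
  path <- tempfile(pattern = id, fileext = ".rds")
  saveRDS(m, path)
  path
})
names(.cells) <- sample.ids  # list names become sample identifiers

# Sample-level metadata; rownames must match names(.cells).
.meta <- data.frame(
  Condition = c("control", "treated"),
  row.names = sample.ids
)

# Strict sanity check (same names, same order, files present) before
# handing the inputs to setup.tdr.obj().
inputs.ok <- identical(rownames(.meta), names(.cells)) &&
  all(file.exists(unlist(.cells)))
```

If inputs.ok is TRUE, .cells and .meta have the documented shape and could be passed on with .assay.type = "cyto" and .markers = markers.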
Value
A list object with an initialized structure containing:
- $cells: Input file paths
- $metadata: Sample metadata with added cell count columns
- $config: Configuration parameters:
  - $key: Named vector mapping future landmarks to samples
  - $sampling: Sampling parameters, including n.cells and n.perSample
  - $assay.type: Assay type ("cyto" or "RNA")
  - $markers: Marker list (cytometry only)
  - $n.threads: Number of threads for parallel processing
- $integration: Integration/batch correction results:
  - $harmony.var: Batch variables (if provided)
  - $harmony.obj: Symphony reference object (populated by get.landmarks)
- Empty slots: landmarks, scaled.landmarks, raw.landmarks, pca, graph, map, etc., populated by downstream functions
Details
This function:
- Validates input data structure and compatibility
- Calculates the number of landmarks to draw from each sample (max 5000 total)
- Creates a "key" vector mapping landmarks to samples
- Initializes empty slots for downstream analyses (PCA, graph, mapping)
- Performs quality checks (warns if sample sizes vary >10-fold)
The landmark sampling strategy aims for proportional representation across samples while capping total landmarks at 5000 for computational efficiency. The contribution of large samples is capped so that they do not dominate the landmark set and smaller samples remain adequately represented.
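The allocation described above can be illustrated with a short sketch. This is not the tinydenseR source: the function name sketch.landmark.alloc, the rounding rule, and the uniform rescaling used to enforce the cap are all assumptions made for illustration, and the package may cap large samples differently.

```r
# Hypothetical sketch of proportional landmark allocation with a global cap.
# NOT the package implementation; rounding and rescaling rules are assumptions.
sketch.landmark.alloc <- function(n.cells, prop = 0.1, cap = 5000) {
  # Proportional target for each sample
  n.per.sample <- ceiling(n.cells * prop)
  total <- sum(n.per.sample)

  # If the total exceeds the cap, shrink every sample by the same factor
  if (total > cap) {
    n.per.sample <- floor(n.per.sample * cap / total)
  }

  # Keep at least one landmark per sample, and never more than its cell count
  n.per.sample <- pmax(pmin(n.per.sample, n.cells), 1)

  # Flag strongly imbalanced inputs, mirroring the documented >10-fold check
  if (max(n.cells) / min(n.cells) > 10) {
    warning("Sample sizes vary more than 10-fold")
  }
  n.per.sample
}

# Three toy samples; 10% of 120000 cells would exceed the cap,
# so the allocation is scaled down.
n.cells <- c(s1 = 20000, s2 = 40000, s3 = 60000)
alloc <- sketch.landmark.alloc(n.cells)
```

Under these assumptions, the per-sample counts keep their relative proportions while their sum stays at or below the cap.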
Examples
if (FALSE) { # \dontrun{
# Prepare data (from README workflow)
.meta <- get.meta(.obj = sim_trajectory, .sample.var = "Sample")
.cells <- get.cells(.exprs = sim_trajectory,
                    .meta = .meta,
                    .sample.var = "Sample")

# Basic setup for RNA data
tdr.obj <- setup.tdr.obj(
  .cells = .cells,
  .meta = .meta,
  .assay.type = "RNA",
  .prop.landmarks = 0.15
)

# Cytometry workflow with marker selection
tdr.obj <- setup.tdr.obj(
  .cells = .cells,
  .meta = .meta,
  .markers = c("CD3", "CD4", "CD8", "CD19"),
  .assay.type = "cyto"
)

# RNA workflow with batch correction
tdr.obj <- setup.tdr.obj(
  .cells = .cells,
  .meta = .meta,
  .harmony.var = "batch",
  .assay.type = "RNA"
)
} # }
