quadsv.detectors.irregular
Classes
DetectorIrregular | Detect spatial patterns on irregular samples (AnnData spots / cells).
Module Contents
- class quadsv.detectors.irregular.DetectorIrregular(kernel_method='matern', backend='matrix', **kernel_params)
Bases: quadsv.detectors.base.Detector
Detect spatial patterns on irregular samples (AnnData spots / cells).
Univariate (Q-test) and bivariate (R-test) kernel-based spatial statistics. Supports two backends:
backend='matrix': MatrixKernel (dense or implicit sparse-precision, auto-selected by n). Good up to ~10⁴ spots.
backend='nufft': NUFFTKernel, O(n log n) quadratic forms on arbitrary point sets. Recommended for ≥ 10⁴ spots.
The core test statistics are:
Univariate: \(Q = \mathbf{x}^T \mathbf{K} \mathbf{x}\)
Bivariate: \(R = \mathbf{x}^T \mathbf{K} \mathbf{y}\)
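As a rough illustration of the two quadratic forms, the sketch below evaluates Q and R with NumPy on a small random point set. The Gaussian kernel construction, the `bandwidth` value, and the helper names `gaussian_kernel` and `standardize` are assumptions made for this example; they are not quadsv's internal implementation.

```python
import numpy as np


def gaussian_kernel(coords, bandwidth=1.0):
    """Dense Gaussian (RBF) kernel; an illustrative stand-in, not quadsv's own kernel."""
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def standardize(v):
    """Center and scale a feature vector to zero mean and unit variance."""
    return (v - v.mean()) / v.std()


rng = np.random.default_rng(0)
coords = rng.standard_normal((50, 2))      # 50 spots in 2-D
x = standardize(rng.standard_normal(50))   # first feature
y = standardize(rng.standard_normal(50))   # second feature

K = gaussian_kernel(coords, bandwidth=0.5)

Q = x @ K @ x   # univariate statistic, x^T K x
R = x @ K @ y   # bivariate statistic, x^T K y
```

Because this K is symmetric, swapping x and y leaves R unchanged.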
Workflow
1. Construct with kernel method + backend + kernel hyperparameters.
2. Set up with setup_data(), passing the anndata.AnnData plus a spatial source (obsm_key in obsm, or obsp_key for a precomputed adjacency / distance matrix).
3. Compute with compute_qstat() / compute_rstat().
- param kernel_method: One of 'gaussian', 'matern', 'moran', 'graph_laplacian', 'car'.
- type kernel_method: str, default 'matern'
- param backend: Kernel backend.
- type backend: {'matrix', 'nufft'}, default 'matrix'
- param **kernel_params: Method- and backend-specific kernel hyperparameters. Matrix backend: bandwidth, nu, rho, k_neighbors, standardize. NUFFT backend: bandwidth, nu, rho, neighbor_degree, plus grid controls grid_shape, spacing, unit_scale, oversample, eps.
- ivar backend_: Which backend was selected at construction.
- vartype backend_: {'matrix', 'nufft'}
- ivar adata: Input container set by setup_data().
- vartype adata: anndata.AnnData or None
- ivar min_cells: Minimum non-zero count per feature; set by setup_data().
- vartype min_cells: int or None
- ivar kernel_: The built kernel; populated by setup_data().
- vartype kernel_: Kernel or None
- ivar kernel_method_, kernel_params_, n: See Detector.
Examples
>>> import anndata as ad, numpy as np
>>> from quadsv import DetectorIrregular
>>> rng = np.random.default_rng(0)
>>> adata = ad.AnnData(X=rng.standard_normal((200, 5)))
>>> adata.obsm["spatial"] = rng.standard_normal((200, 2))
>>> det = DetectorIrregular(kernel_method="car", rho=0.9, k_neighbors=8)
>>> det.setup_data(adata, min_cells=5)
<DetectorIrregular ...>
>>> # q = det.compute_qstat()
- compute_qstat(source='var', features=None, n_jobs=-1, layer=None, return_pval=True, chunk_size='auto', show_progress=True)
Compute univariate spatial Q-statistic for selected features.
Tests each feature for significant spatial clustering or dispersion using the pre-built kernel. Parallelizes across features and applies Benjamini-Hochberg multiple testing correction.
- Parameters:
source (str, default 'var') – Feature source: ‘var’ (genes) or ‘obs’ (metadata columns).
features (Optional[List[str]]) – Feature names to test. If None, tests all features in source.
n_jobs (int, default -1) – Number of parallel jobs. -1 uses all available cores; 1 for sequential.
layer (Optional[str]) – If source='var', which layer to use (e.g., 'raw', 'log1p'). If None, uses .X.
return_pval (bool, default True) – If True, returns p-values and BH-corrected p-values. If False, returns Q only.
chunk_size (int or 'auto', default 'auto') – Number of features each worker densifies at once (inner batch). 'auto' targets ~256 MB per batch using _auto_chunk_size(), yielding chunk_size ≈ clip(16, 512, 256 MB / (4 · n · 8 B)). Override with an integer when memory is tight or you want deterministic batching.
show_progress (bool, default True) – Show a tqdm progress bar over worker chunks.
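The 'auto' formula quoted for chunk_size can be worked through by hand. The sketch below is only a reading of that formula, not the body of `_auto_chunk_size()`: the function name `auto_chunk_size_sketch`, the choice of 1 MB = 2**20 bytes, and the use of integer division are all assumptions.

```python
def auto_chunk_size_sketch(n_obs, target_bytes=256 * 2**20, lo=16, hi=512):
    """Evaluate 256 MB / (4 * n * 8 B) and clip the result to [lo, hi].

    Hypothetical helper: the factor 4 and the 8-byte item size come from the
    documented formula; the MB convention and rounding are assumptions.
    """
    bytes_per_feature = 4 * max(int(n_obs), 1) * 8
    raw = target_bytes // bytes_per_feature
    return max(lo, min(hi, raw))


small = auto_chunk_size_sketch(1_000)       # many features fit, capped at hi
medium = auto_chunk_size_sketch(20_000)     # falls between the two bounds
large = auto_chunk_size_sketch(1_000_000)   # few features fit, raised to lo
```

Under these assumptions the heuristic only changes the batch size for roughly 16,000 to 524,000 observations; outside that range it sits at one of the two bounds.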
- Returns:
df – Results sorted by Q (descending). Columns:
Feature: feature name
Q: test statistic (univariate spatial variability)
Z_score: standardized Q by null mean/std
P_value: tail probability under null (if return_pval=True)
P_adj: Benjamini-Hochberg adjusted p-value (if return_pval=True)
- Return type:
pd.DataFrame
- Raises:
ValueError – If the kernel is not initialized or source is invalid.
Notes
Under H₀: feature has no spatial structure. Under H₁: significant spatial signal (clustering or dispersion).
Zero-variance features are assigned Q=0, P_value=1.0.
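For readers who want to see how a P_adj column relates to P_value, the following is a minimal sketch of the standard Benjamini-Hochberg step-up adjustment in NumPy. The helper name `bh_adjust` is hypothetical and the input p-values are made up for illustration; whether quadsv computes the adjustment this way internally is not stated here.

```python
import numpy as np


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper, standard procedure)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                           # ascending p-values
    scaled = p[order] * m / np.arange(1, m + 1)     # p_(i) * m / i
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)     # restore the input order
    return adjusted


p_values = [0.001, 0.008, 0.039, 0.041, 0.6]        # illustrative inputs only
p_adj = bh_adjust(p_values)
```

A feature assigned P_value=1.0, as in the zero-variance case, keeps an adjusted value of 1.0 under this procedure.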
The null-distribution approximation is auto-selected from self.kernel_method_ ('clt' for Moran’s I, 'welch' for all other kernels) and cannot be overridden through this method. For full control over the null method (including 'liu'), call quadsv.statistics.spatial_q_test() directly.
Examples
>>> detector.setup_data(adata)
>>> results = detector.compute_qstat(source='var', features=['Gene1', 'Gene2'], n_jobs=-1)
>>> top_genes = results.iloc[:10]
- compute_rstat(features_x=None, features_y=None, source='var', n_jobs=-1, layer=None, return_pval=True, chunk_size='auto', show_progress=True)
Compute bivariate spatial R-statistic (cross-spatial correlation) for feature pairs.
Tests for significant spatial co-variation between pairs of features using the pre-built kernel. Supports symmetric (all pairs within one set) or bipartite (all X vs Y pairs) modes. Parallelizes computation and applies multiple testing correction.
- Parameters:
features_x (Optional[List[str]]) – Feature names for the first set. If None and features_y is None, uses all features (symmetric mode).
features_y (Optional[List[str]]) – Feature names for the second set. If None, computes all pairwise within features_x. If provided, computes all X vs Y pairs (bipartite mode).
source (str, default 'var') – Feature source: ‘var’ (genes) or ‘obs’ (metadata columns).
n_jobs (int, default -1) – Number of parallel jobs. -1 uses all available cores; 1 for sequential.
layer (Optional[str]) – If source='var', which layer to use (e.g., 'raw', 'log1p'). If None, uses .X.
return_pval (bool, default True) – If True, returns p-values and BH-corrected p-values. If False, returns R only.
chunk_size (int or 'auto', default 'auto') – Number of Y features to batch together when pre-computing K @ Y_chunk. 'auto' uses _auto_chunk_size() (~256 MB per batch target); integer values override the heuristic.
show_progress (bool, default True) – Show a tqdm progress bar over the Y-chunk loop.
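The difference between the symmetric and bipartite modes selected by features_x and features_y can be pictured as two ways of enumerating pairs. The sketch below uses a hypothetical helper, `enumerate_pairs`, that only mirrors the documented behavior (in symmetric mode every ordered pair, including self-pairs, is produced); it is not how quadsv builds its pair list, and the returned DataFrame is sorted by absolute Z_score rather than in this order.

```python
from itertools import product


def enumerate_pairs(features_x, features_y=None):
    """List (Feature_1, Feature_2) pairs for symmetric or bipartite mode (illustrative only)."""
    if features_y is None:
        # Symmetric mode: all ordered pairs within features_x, self-pairs included.
        features_y = features_x
    return list(product(features_x, features_y))


symmetric = enumerate_pairs(["A", "B", "C"])            # 3 x 3 ordered pairs
bipartite = enumerate_pairs(["A", "B"], ["C", "D"])     # every X against every Y
```

The symmetric call produces n² rows for n features, which is worth keeping in mind before passing a long feature list.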
- Returns:
df – Results sorted by absolute Z_score (descending). Columns:
Feature_1: name of first feature
Feature_2: name of second feature
R: test statistic (bivariate spatial correlation, range approximately [-1, 1])
Z_score: standardized R by null mean/std
P_value: two-tailed p-value under null (if return_pval=True)
P_adj: Benjamini-Hochberg adjusted p-value (if return_pval=True)
- Return type:
pd.DataFrame
- Raises:
ValueError – If the kernel is not initialized, features_x is None while features_y is provided, or no valid pairs are generated.
Notes
Under H₀: features are spatially independent. Under H₁: significant spatial co-clustering or co-dispersion.
Unlike quadsv.statistics.spatial_r_test(), this method always returns R-statistics for all requested feature pairs in the symmetric mode (features_y=None). For features_x=[A, B, C], the output contains (A, A), (A, B), (A, C), (B, A), (B, B), (B, C), (C, A), (C, B), (C, C).
P-value calculation uses a normal approximation based on Tr(K²) and is not configurable through this method. For finer control over the null model, call quadsv.statistics.spatial_r_test() directly.
Zero-variance features are handled gracefully (assigned R=0, P=1).
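One way to read the Tr(K²) normal approximation: if x and y are independent with zero-mean, unit-variance entries and K is symmetric, then R = xᵀKy has mean 0 and variance Tr(K²). The sketch below builds a Z-score and a two-tailed p-value on that basis. The helper name `r_normal_approx` is hypothetical, and the standardization and null variance are assumptions of this example; quadsv may center or scale the kernel and features differently.

```python
import math

import numpy as np


def r_normal_approx(x, y, K):
    """Return (R, Z, two-tailed p) for R = x^T K y under an assumed N(0, Tr(K^2)) null."""
    r = float(x @ K @ y)
    tr_k2 = float(np.sum(K * K))            # equals Tr(K^2) when K is symmetric
    z = r / math.sqrt(tr_k2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal tail probability
    return r, z, p


# Tiny worked case: K = I, so R is the plain inner product and Tr(K^2) = 2.
x = np.array([1.0, -1.0])
y = np.array([1.0, -1.0])
r, z, p = r_normal_approx(x, y, np.eye(2))
```

In this worked case R = 2 and Z = √2, so the p-value is erfc(1), roughly 0.157.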
Examples
>>> detector.setup_data(adata)
>>> # All pairwise correlations within gene set
>>> results = detector.compute_rstat(features_x=['Gene1', 'Gene2', 'Gene3'], n_jobs=-1)
>>> # Cross-correlation between two gene sets
>>> results = detector.compute_rstat(
...     features_x=['Gene1', 'Gene2'],
...     features_y=['Gene3', 'Gene4'],
...     n_jobs=-1
... )
- setup_data(adata, *, obsm_key='spatial', obsp_key=None, is_distance=False, min_cells=1, min_cells_frac=None)
Attach adata, apply feature filters, build the kernel.
- Parameters:
adata (anndata.AnnData) – Input container. Must have adata.obsm[obsm_key] (unless obsp_key is provided instead).
obsm_key (str, default 'spatial') – Key in adata.obsm holding (n_obs, 2) spatial coordinates. Used when obsp_key is None.
obsp_key (str, optional) – If provided, build the kernel from adata.obsp[obsp_key] instead of from coordinates. Not compatible with backend='nufft'.
is_distance (bool, default False) – When obsp_key is given: treat the matrix as pairwise distances (True) or adjacency / connectivity (False).
min_cells (int, default 1) – Minimum number of cells with non-zero value for a feature to be tested. Clamped to [1, n_obs].
min_cells_frac (float, optional) – If provided, overrides min_cells with max(1, int(min_cells_frac * n_obs)).
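The interaction between min_cells and min_cells_frac can be sketched as below. Both helper names (`resolve_min_cells`, `keep_features`) are hypothetical, the filter is shown for a dense array only, and applying the [1, n_obs] clamp after the fractional override is an assumption; the sketch follows the parameter descriptions rather than quadsv's source.

```python
import numpy as np


def resolve_min_cells(n_obs, min_cells=1, min_cells_frac=None):
    """Effective threshold: the fraction overrides the count, then clamp to [1, n_obs]."""
    if min_cells_frac is not None:
        min_cells = max(1, int(min_cells_frac * n_obs))
    return max(1, min(int(min_cells), n_obs))


def keep_features(X, min_cells):
    """Boolean mask of columns with at least min_cells non-zero entries (dense X only)."""
    nonzero_counts = np.count_nonzero(X, axis=0)
    return nonzero_counts >= min_cells


X = np.array([[0, 1, 2],
              [0, 0, 3],
              [0, 4, 5],
              [1, 0, 6]])   # 4 cells x 3 features, made-up counts

threshold = resolve_min_cells(n_obs=X.shape[0], min_cells=2)
mask = keep_features(X, threshold)   # first feature has a single non-zero cell
```

With a fraction instead of a count, `resolve_min_cells(200, min_cells_frac=0.25)` gives a threshold of 50 cells.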
- Returns:
self
- Return type:
DetectorIrregular
- adata: Any | None = None
Reference to the input anndata.AnnData, set by setup_data().
- min_cells: int | None = None
Minimum non-zero-count threshold applied in setup_data().
- Parameters:
kernel_method (str)
backend (str)
kernel_params (Any)