Quick Start#
A 5-minute tour. quadsv does three things:
Score one feature with
spatial_q_test(). Does its expression depend on space?Score every feature in a tissue with
Detector(). Which genes are spatially variable?Compare slides with
Comparator(). Do two groups of samples differ in spatial pattern?
The kernel you pass to the test decides what kind of spatial structure earns a high score. CAR and Matérn kernels reward smooth gradients; a graph-Laplacian kernel rewards sharp differences between neighbouring spots. See Kernel Design.
This page covers (1) and (2). For (3), see Cross-sample Comparison.
The four layers#
Every name listed below is importable from the top-level package
with from quadsv import ....
Layer |
What it does |
Public names |
|---|---|---|
Kernels |
Encode the spatial structure to look for. |
|
Tests |
Compute the test statistic and a p-value on a feature vector or batch. |
|
Detectors |
Genome-wide pattern screening on one sample. |
|
Comparators |
Cross-sample pattern comparison between groups of slides. |
|
Test one feature#
Score whether a gene’s expression depends on space, given a kernel.
import numpy as np
from quadsv import NUFFTKernel, spatial_q_test
rng = np.random.default_rng(0)
coords = rng.uniform(0, 20, size=(500, 2))
gene = rng.standard_normal(500)
kernel = NUFFTKernel(coords, method="matern", bandwidth=2.0, nu=1.5)
Q, pval = spatial_q_test(gene, kernel)
print(f"Q = {Q:.4f}, p-value = {pval:.4e}")
Reading the result:
High Q with low p-value. The gene’s expression depends on location, in the way this kernel looks for. Here the kernel is Matérn, which looks for smooth large-scale gradients.
Low Q with high p-value. The gene looks spatially independent under this kernel.
The kernel choice matters. Swap the Matérn for a graph-Laplacian kernel and a gene that scored low above can score high if its expression changes sharply between neighbouring spots. See Kernel Design for picking a kernel.
The same spatial_q_test() call works with any kernel
type. Pass a MatrixKernel for an arbitrary
coordinate cloud or graph, an FFTKernel for a
regular 2-D grid, or NUFFTKernel for irregular 2-D
coordinates. The companion spatial_r_test() tests two
features at a time for spatial co-expression.
Reuse the null fit across many features
When you test many features against the same kernel, precompute
the null distribution once with
compute_null_params() and pass the result back into
the test:
from quadsv import compute_null_params, spatial_q_test
null = compute_null_params(kernel, method="liu") # one-time cost
for gene in gene_matrix.T:
Q, pval = spatial_q_test(gene, kernel, null_params=null)
spatial_q_test() and spatial_r_test()
also accept a chunk_size keyword. The default "auto"
dispatches to auto_chunk_size() to size each batch
for the kernel’s cache sweet spot. See Scalable Computation for
the cost model.
Test every feature in an AnnData#
The Detector() factory picks the right detector class
from the input type. An anndata.AnnData returns a
DetectorIrregular; a
spatialdata.SpatialData returns a
DetectorGrid.
Expected adata layout:
adata.X(or a layer inadata.layers) is the(n_obs, n_vars)count or expression matrix. Sparse formats are fine. The detector consumes one column at a time, so you do not need to densify up front.adata.obsm[obsm_key]is an(n_obs, 2)or(n_obs, 3)array of spatial coordinates in some physical unit. Thebandwidthargument should be in the same unit. Required for the"nufft"backend and for any distance-based kernel ("gaussian","matern").adata.obsp[obsp_key]is an(n_obs, n_obs)adjacency, affinity, or distance matrix. Used by the"matrix"backend when you want to feed a precomputed graph instead of building one from coordinates.
You need at least one of obsm_key or obsp_key. If you pass
both, obsp_key wins.
Build the detector, attach the data with
setup_data(), then run
compute_qstat():
import anndata as ad
from quadsv import Detector
adata = ad.read_h5ad("spatial_tissue.h5ad")
print(f"Data: {adata.n_obs} spots × {adata.n_vars} genes")
detector = Detector(
adata,
kernel_method="matern",
backend="nufft",
bandwidth=25.0, # same units as adata.obsm["spatial"]
nu=1.5,
).setup_data(adata, obsm_key="spatial", min_cells_frac=0.05)
results = detector.compute_qstat(n_jobs=4, return_pval=True)
svgs = results[results["P_adj"] < 0.05]
print(f"Found {len(svgs)} SVGs at FDR < 5%")
The same detector handles spatial co-expression through
compute_rstat():
top_genes = results.nlargest(100, "Q").index.tolist()
coexp = detector.compute_rstat(
features_x=top_genes,
features_y=None, # all pairs within ``features_x``
n_jobs=4,
return_pval=True,
)
Picking a backend (matrix vs nufft)
DetectorIrregular ships two backends, selected
with the backend keyword.
backend="nufft" builds a NUFFTKernel. It
runs at O(n log n) per feature and never materialises an
(n, n) matrix, so it scales to large n. Use it with
smooth kernels (Gaussian, Matérn).
backend="matrix" builds a MatrixKernel,
which picks dense, sparse, or sparse-precision storage based on
n. Use it for graph kernels (car, moran,
graph_laplacian) or when you have a precomputed adjacency in
adata.obsp[obsp_key].
Example with the matrix backend and a CAR kernel:
detector = Detector(
adata,
kernel_method="car",
backend="matrix",
rho=0.9,
k_neighbors=15,
).setup_data(adata, obsm_key="spatial", min_cells_frac=0.05)
Large regular grids (Visium HD)#
For rasterised grids in spatialdata.SpatialData containers,
the same Detector() factory returns a
DetectorGrid. Kernel hyper-parameters go to the
constructor. The bin / table / coordinate layout goes to
setup_data().
Expected sdata layout:
A bin element
sdata[bins](typically ageopandas.GeoDataFrameof bin polygons) that defines the rasterisation grid. For Visium HD this is one of thesquare_002um/square_008um/square_016umshape collections; for imaging data, any shape collection whose footprint covers the rectangular grid you want to rasterise against.A table
sdata.tables[table_name]whoseXis the(n_bins, n_vars)expression matrix and whoseobscarries the integer column / row indices of each bin (col_keyandrow_key).The pair
(col_key, row_key)must yield a contiguous rectangular layout. Missing bins are filled with zeros.
Code:
import spatialdata as sd
from quadsv import Detector
sdata = sd.read_zarr("visium_hd.zarr")
detector = Detector(
sdata,
kernel_method="car",
rho=0.9,
neighbor_degree=1,
topology="square",
).setup_data(
sdata,
bins="square_008um", # name of the bin element in sdata
table_name="square_008um", # name of the table in sdata.tables
col_key="array_col", # integer column index in table.obs
row_key="array_row", # integer row index in table.obs
min_count=10,
)
results = detector.compute_qstat(n_jobs=4, return_pval=True)
Next steps#
Kernel Design. Pick a kernel for your data.
Cross-sample Comparison. Compare spatial patterns across groups of slides.
Scalable Computation. Performance and complexity reference.
Theoretical Results. Mathematical background.
quadsv. Full API reference.