Scalable Computation#
Evaluating \(Q_n = \mathbf{z}^\top \tilde{\mathbf{K}} \mathbf{z}\) and its p-value naively requires the dense \(n \times n\) matrix \(\tilde{\mathbf{K}}\) and its full spectrum, which costs \(\mathcal{O}(n^3)\) time and \(\mathcal{O}(n^2)\) memory. Modern spatial-omics datasets (with \(n\) from \(10^3\) to \(10^7\) spots and \(m\) from \(10^3\) to \(10^4\) features) make that intractable.
quadsv replaces the dense baseline with scalable operators that
never materialise \(\tilde{\mathbf{K}}\) and never run a full
eigendecomposition. This page summarises the three orthogonal axes
of the design: null-distribution approximations, structured or
sparse kernel operators, and Fourier-accelerated primitives. It
also tabulates their combined complexity.
Notation: \(c_p := \operatorname{tr}(\tilde{\mathbf{K}}^p)\) is the \(p\)-th spectral power sum. Under the null, \(\mathbb{E}[Q_n] = c_1\) and \(\operatorname{Var}[Q_n] = 2 c_2 + \epsilon_{\text{kurtosis}}\).
Null-distribution approximations#
Under \(H_0\), \(Q_n \xrightarrow{d} \sum_{i=1}^n \lambda_i^{(n)} \chi_1^2\). An exact p-value can be obtained by Davies’ inversion of
but only after paying \(\mathcal{O}(n^3)\) for the full spectrum
plus \(\mathcal{O}(Kn)\) per feature, where \(K\) is the
number of quadrature nodes. To avoid the eigendecomposition
entirely, quadsv ships two moment-matching fits that evaluate in
\(\mathcal{O}(1)\) per feature once the cumulants are cached:
Welch-Satterthwaite and CLT match \(c_1, c_2\) to a scaled central \(\chi^2\) (Welch, PSD kernels only) or to a Normal (CLT, valid for indefinite \(\tilde{\mathbf{K}}\) too). Both need only \(\operatorname{tr}(\tilde{\mathbf{K}})\) and \(\operatorname{tr}(\tilde{\mathbf{K}}^2)\).
Liu’s approximation matches the first four cumulants \(c_1 \ldots c_4\) to a shifted non-central \(\chi^2\). The higher cumulants are estimated via Hutchinson’s trace estimator,
\[c_p \;=\; \mathbb{E}\bigl[\mathbf{v}^\top \tilde{\mathbf{K}}^p \mathbf{v}\bigr] \;\approx\; \tfrac{1}{V}\,\textstyle\sum_{s=1}^{V} \mathbf{v}_s^\top \tilde{\mathbf{K}}^p \mathbf{v}_s,\]with iid Rademacher probes \(\{\mathbf{v}_s\} \stackrel{\text{iid}}{\sim} \{\pm 1\}^n\). Two matvecs per probe deliver all four cumulants from the same probe pool.
Finite-:math:`n` Dirichlet correction. Because \(\mathbf{z} = \mathbf{H}\mathbf{x} / \hat{\sigma}\) uses the sample variance \(\hat{\sigma}^2 = \mathbf{x}^\top \mathbf{H} \mathbf{x} / (n-1)\) in the denominator, and that sample variance is correlated with the numerator, the exact null variance follows a Dirichlet(1/2) ratio:
Welch and Liu both use this corrected variance. Without it, Liu’s right tail collapses to zero on broad-spectrum kernels where \(c_1^{2} \approx (n-1)\,c_2\). CAR on a dense grid is the prototypical failure mode.
Note
Moment-matching’s right tail decays faster than the true null on heavy-tailed spectra, which can over-reject in the extreme tails. Liu is the most tail-accurate of the three, Welch is tighter in the bulk, and CLT is the coarsest. When in doubt, run Liu.
Structured and sparse kernel operators#
For an arbitrary spatial coordinate cloud \(\{\mathbf{s}_i\}\), three strategies let matvecs scale sub-quadratically:
Sparse kernels (\(k\)-NN, thresholded distance). \(\mathbf{K}\) has \(\mathcal{O}(nk)\) non-zeros. Matvecs and Hutchinson cumulants cost \(\mathcal{O}(nk)\). Caveat: many ad-hoc sparse kernels (notably Moran’s I) are indefinite, which induces the cancellation pathology and costs power on composite spatial patterns.
Low-rank kernels \(\mathbf{K} \approx \mathbf{U}\mathbf{U}^\top\) with \(\mathbf{U} \in \mathbb{R}^{n \times r}\), \(r \ll n\). Matvecs cost \(\mathcal{O}(nr)\). Spectra fall out of an \(\mathcal{O}(r^3)\) eigen-problem on \(\mathbf{U}^\top\mathbf{U}\). Factorisation cost is \(\mathcal{O}(nr^2)\) via Nyström or Random Fourier Features. Caveat: finite rank creates an infinite-dimensional null space (blind spots), so power vanishes on high-frequency patterns.
Sparse precision (CAR). The precision matrix \(\mathbf{\Omega} = \mathbf{I} - \rho \widetilde{\mathbf{W}}\) is sparse with \(\mathcal{O}(nk)\) non-zeros even though \(\mathbf{K} = \mathbf{\Omega}^{-1}\) is dense. On typical 2-D graphs a sparse LU (or Cholesky) factorisation of \(\mathbf{\Omega}\) costs \(\mathcal{O}(n^{3/2})\) time with \(\mathcal{O}(n \log n)\) fill-in. Each subsequent \(\mathbf{K}\mathbf{v}\) is an \(\mathcal{O}(n \log n)\) triangular solve. Preserves full rank and positive definiteness.
Fourier-accelerated methods#
Under periodic boundary conditions, any translation-invariant kernel on a regular 2-D or 3-D grid is diagonalised by the Fourier basis:
The grid spectrum \(\boldsymbol{\mu}\) is analytic in the kernel hyper-parameters (bandwidth, \(\nu\), \(\rho\), neighbour structure), so no eigendecomposition is needed. Matvecs become element-wise products at the cost of one \(\mathcal{O}(n \log n)\) FFT:
Irregular coordinates (NUFFT). For non-uniform \(\{\mathbf{s}_i\}\), we approximate \(\mathbf{K}\) with an oversampled \(n'\)-grid (\(n' > n\)):
Matvecs are a type-1 / type-2 NUFFT round-trip at cost \(\mathcal{O}(n' \log n')\):
Because \(\mathbf{U}\) is non-unitary, eigenvalues of \(\mathbf{K}\) do not strictly follow \(\boldsymbol{\mu}\). The first two cumulants still have a closed form via Toeplitz FFT convolutions. With \(\mathbf{G} := \mathbf{U}^{\mathsf H}\mathbf{U}\), \(G_{k,k'} = n\,\phi(k'-k)\), and \(\phi(\boldsymbol{\Delta}) = \tfrac{1}{n}\sum_i e^{\mathrm{i}\boldsymbol{\Delta}\cdot\mathbf{s}_i}\):
\(c_2\) is evaluated by two 2-D FFTs (one on \(|\phi|^2\), one on \(\boldsymbol{\mu}\)) at \(\mathcal{O}(n' \log n')\).
Note
Graph kernels under NUFFT define a different test. NUFFT implicitly builds a grid-stencil kernel on the auxiliary grid, which is topologically different from an exact \(k\)-NN graph on \(\{\mathbf{s}_i\}\) with open boundaries. So \(\mathbf{K}^{\text{NUFFT}}\) is not an approximation of \(\mathbf{K}^{\text{sparse}}\), and their \(Q_n\) statistics are not expected to match numerically.
This does not invalidate the NUFFT Q-test. It is just a different Q-test, evaluated on the grid-stencil operator rather than on the \(k\)-NN graph. It is:
well-defined (the NUFFT operator is real, symmetric, and translation-invariant by construction);
consistent whenever its effective spectrum is non-negative (e.g.
"car","graph_laplacian", and"moran"under CLT through the null approximations above);powerful, at \(\mathcal{O}((n + n') \log n')\) per feature with no \(k\)-NN construction step, so it scales to \(n\) where the sparse-graph path cannot.
Distance-based kernels (Gaussian, Matérn) are unaffected. Their NUFFT and Matrix forms converge to the same translation-invariant operator as the oversampling factor grows.
Complexity summary#
Backend |
Precompute |
Memory |
Time per \(Q\) |
Key limitation |
|---|---|---|---|---|
Dense + Davies (baseline) |
\(\mathcal{O}(n^3)\) |
\(\mathcal{O}(n^2)\) |
\(\mathcal{O}(n^2) + \mathcal{O}(Kn)\) |
Intractable for large \(n\). |
Dense + Liu / Welch / CLT [1] |
\(\mathcal{O}(n^2)\) |
\(\mathcal{O}(n^2)\) |
\(\mathcal{O}(n^2)\) |
Inaccurate right tail. |
Sparse kernel (\(k\)-NN) |
\(\mathcal{O}(nk)\) |
\(\mathcal{O}(nk)\) |
\(\mathcal{O}(nk)\) |
Power loss (cancellation). |
Low-rank kernel (rank-\(r\)) |
\(\mathcal{O}(nr + r^3)\) [2] |
\(\mathcal{O}(nr)\) |
\(\mathcal{O}(nr)\) |
Power loss (blind spots). |
Sparse precision (CAR) |
\(\mathcal{O}(n^{3/2})\) |
\(\mathcal{O}(n \log n)\) |
\(\mathcal{O}(n \log n)\) |
Triangular solves hard to parallelise. |
FFT (full spectrum) |
\(\mathcal{O}(n \log n)\) |
\(\mathcal{O}(n)\) |
\(\mathcal{O}(n \log n)\) |
Requires periodic grid. |
NUFFT |
\(\mathcal{O}((n + n') \log n')\) |
\(\mathcal{O}(n')\) |
\(\mathcal{O}((n + n') \log n')\) |
Mismatch with graph kernels [3]. |
The per-\(Q\) cost is dominated by the kernel matvec. The null evaluation on top is either Davies inversion (\(\mathcal{O}(Kn)\) per feature) or constant-time Liu / Welch once the cumulants are cached.
Default recommendation#
Continuous kernels on arbitrary coordinates (Gaussian, Matérn). Use
NUFFTKernelwith Liu. \(\mathcal{O}(n)\) memory, \(\mathcal{O}(n \log n)\) time per feature, no power loss.Regular rasterised grids (Visium HD, imaging). Use
FFTKernelwith Liu. Same asymptotic complexity as NUFFT, smaller constant.General graph kernels (phylogenetic trees, single-cell k-NN graphs). Use
MatrixKernelwithmethod="car"(sparse precision). \(\mathcal{O}(n \log n)\) per feature after a \(\mathcal{O}(n^{3/2})\) one-time factorisation.Small \(n\) (under ~5 000). Any backend will do. The dense matrix path is the simplest.
See also#
Theoretical Results for the derivation of the null distribution and the Dirichlet correction.
Kernel Design for kernel-selection practicalities.
Quick Start for end-to-end usage recipes.