liana.utils.spatial_pair_proximity

liana.utils.spatial_pair_proximity(adata, groupby, spatial_key='spatial', bandwidth=250, contact_bandwidth=None, min_cells_in_proximity=10, trim_fraction=0.1, kernel='gaussian', verbose=False)

Computes aggregated spatial statistics and proximity scores between cell types.

This function calculates pairwise proximity between cell types based on nearest neighbor distances in spatial coordinates. It returns a DataFrame with proximity scores that can be used to weight ligand-receptor interactions by spatial co-localization.
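To make the idea of nearest neighbor distances between cell types concrete, here is a minimal NumPy sketch. It is not liana's implementation: the helper name `nearest_distances` and the brute-force distance matrix are assumptions chosen for illustration only.

```python
import numpy as np

def nearest_distances(src_xy, tgt_xy, same_group=False):
    """Distance from each source cell to its nearest target cell (hypothetical helper)."""
    # Pairwise Euclidean distances, shape (n_src, n_tgt)
    diff = src_xy[:, None, :] - tgt_xy[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))
    if same_group:
        # A cell must not count as its own neighbor, otherwise every distance is 0
        np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1)

# Made-up coordinates for two small groups of cells
src = np.array([[0.0, 0.0], [10.0, 0.0]])
tgt = np.array([[3.0, 4.0], [10.0, 1.0]])

cross = nearest_distances(src, tgt)                     # source type vs. target type
within = nearest_distances(src, src, same_group=True)   # a cell type vs. itself
print(cross, within)
```

A real implementation would more likely use a spatial index (for example a KD-tree) than a dense distance matrix, since the dense version needs memory proportional to the number of source cells times the number of target cells.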

Parameters:
  • adata (AnnData) – Annotated data object.

  • groupby (str) – Column in adata.obs that defines the cell types (groups) to compare.

  • spatial_key (default: 'spatial') – Key in adata.obsm that contains the spatial coordinates.

  • bandwidth (default: 250) – Signaling length scale. Controls the maximum distance at which two spots/cells are considered to be in proximity. Expressed in the same units as the spatial coordinates.

  • contact_bandwidth (default: None) – Bandwidth and distance threshold used for contact interactions. If None, contact proximity is not calculated.

  • min_cells_in_proximity (int, optional) – Minimum number of cell pairs within range required to flag a pair of cell types as interacting (see the interacting column in the returned DataFrame). Default is 10.

  • trim_fraction (float, optional) – Fraction of outliers to trim from each tail when calculating mean distance (0-0.5). Default is 0.1 (trim 10% from each tail).

  • kernel (default: 'gaussian') – Kernel function that converts distances into proximity weights and controls how quickly the weights decay with distance. Available options: 'gaussian', 'exponential', 'linear', 'misty_rbf'.

  • verbose (default: False) – Verbosity flag.
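The effect of trim_fraction can be illustrated with a small trimmed-mean sketch. The helper `trimmed_mean` is hypothetical, and rounding the cut count down is an assumption; the function's own trimming rule may differ in such details.

```python
import numpy as np

def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping `trim_fraction` of the values from each tail (hypothetical helper)."""
    values = np.sort(np.asarray(values, dtype=float))
    k = int(trim_fraction * values.size)  # number of values cut from each end
    return values[k:values.size - k].mean()

# Made-up nearest neighbor distances with one extreme outlier
distances = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0]

plain = np.mean(distances)          # pulled far upward by the outlier
robust = trimmed_mean(distances)    # drops the smallest and the largest value
print(plain, robust)
```

With ten values and trim_fraction=0.1, one value is removed from each tail, so a single distant cell no longer dominates the mean distance for the pair.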

Returns:

pd.DataFrame – DataFrame with the following columns:

  • source – source cell type

  • target – target cell type

  • mean_distance – trimmed mean distance between cell types

  • interacting – binary flag (1 if >= min_cells_in_proximity pairs within bandwidth, else 0)

  • proximity – proximity score calculated by applying kernel to mean_distance with bandwidth

  • contact_interacting – (optional, if contact_bandwidth is not None) binary flag for contact interactions

  • contact_proximity – (optional, if contact_bandwidth is not None) proximity score using contact_bandwidth
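The sketch below shows how a binary flag and a kernel-based proximity score of this kind could be derived from distances. This page does not state the exact kernel formulas, so the ones used here are common textbook forms and should be read as assumptions; 'misty_rbf' is left out for that reason. The helper `kernel_weight` and all numbers are made up.

```python
import numpy as np

def kernel_weight(distance, bandwidth, kernel="gaussian"):
    """Map a distance to a weight in [0, 1]; the formulas are illustrative assumptions."""
    d = np.asarray(distance, dtype=float)
    if kernel == "gaussian":
        return np.exp(-(d ** 2) / (2 * bandwidth ** 2))
    if kernel == "exponential":
        return np.exp(-d / bandwidth)
    if kernel == "linear":
        return np.clip(1 - d / bandwidth, 0, None)
    raise ValueError(f"unsupported kernel: {kernel}")

bandwidth = 250
min_cells_in_proximity = 10
pair_distances = np.array([40.0, 90.0, 120.0, 300.0, 800.0])  # made-up distances

# Count pairs within range and turn the count into a 0/1 flag
n_within = int((pair_distances <= bandwidth).sum())
interacting = int(n_within >= min_cells_in_proximity)

# Proximity for a hypothetical mean distance equal to one bandwidth
proximity = float(kernel_weight(250.0, bandwidth))
print(n_within, interacting, proximity)
```

Here only three pairs fall within the bandwidth, so the pair of cell types would not be flagged as interacting even though its proximity score is well above zero. The two columns carry different information and can be used together.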

Notes

  • Runtime scales as O(n_cell_types² × n_cells), which is acceptable for typical datasets (5-30 cell types) but may become slow with 100+ cell types.

  • Self-interactions exclude the cell itself as its own neighbor to avoid zero distances.

  • Cell type pairs without a proximity value (e.g., cell types that never co-localize) are returned as NaN. Fill these with 0.0 when merging with interaction results.
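As a sketch of the merging step mentioned in the last note, the pandas snippet below joins a proximity table onto interaction results and fills the gaps with 0.0. Both DataFrames are made-up stand-ins, and the lr_score and weighted_score column names are hypothetical.

```python
import pandas as pd

# Hypothetical interaction results; only source/target match the documented columns
lr_res = pd.DataFrame({
    "source": ["B", "B", "T"],
    "target": ["T", "NK", "NK"],
    "lr_score": [0.8, 0.5, 0.9],
})

# Stand-in for the output of spatial_pair_proximity
proximity_df = pd.DataFrame({
    "source": ["B", "T"],
    "target": ["T", "NK"],
    "proximity": [0.6, float("nan")],
})

merged = lr_res.merge(
    proximity_df[["source", "target", "proximity"]],
    on=["source", "target"],
    how="left",  # keep every interaction, even without a proximity entry
)

# Pairs absent from proximity_df and pairs with NaN proximity both become 0.0
merged["proximity"] = merged["proximity"].fillna(0.0)
merged["weighted_score"] = merged["lr_score"] * merged["proximity"]
print(merged)
```

A left join keeps all interactions, so the fill step decides how unobserved pairs are treated. Using 0.0 means interactions between cell types without a proximity value receive no weight.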

Examples

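>>> import numpy as np
>>> import scanpy as sc
>>> import liana as li
>>> adata = sc.datasets.pbmc68k_reduced()
>>> # pbmc68k_reduced has no spatial coordinates, so random ones are simulated here
>>> adata.obsm['spatial'] = np.random.randn(adata.shape[0], 2) * 100
>>> proximity_df = li.utils.spatial_pair_proximity(adata, groupby='bulk_labels')
>>> proximity_df.head()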