liana.utils.zi_minmax

liana.utils.zi_minmax(X, cutoff=0.5)

Zero-inflated min-max scaling, adapted from CiteFuse (Kim et al., 2020; https://academic.oup.com/bioinformatics/article/36/14/4137/5827474).

This function scales the data to the range [0, 1] for each column of a two-dimensional array and sets values below a specified cutoff to 0 (after scaling).
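The description above can be illustrated with a minimal dense NumPy sketch. This is not liana's implementation: the helper name `zi_minmax_sketch` is hypothetical, it returns a dense array rather than a `csr_matrix`, and the handling of constant columns (mapped to 0) is an assumption made only to avoid division by zero.

```python
import numpy as np


def zi_minmax_sketch(X, cutoff=0.5):
    """Hypothetical dense illustration of zero-inflated min-max scaling."""
    X = np.asarray(X, dtype=float)

    # Column-wise min-max scaling to [0, 1].
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Assumption: constant columns get a dummy range of 1 and so scale to 0.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    scaled = (X - col_min) / safe_range

    # Zero-inflation step: scaled values below the cutoff become 0.
    scaled[scaled < cutoff] = 0.0
    return scaled


x = np.array([[0.1, 0.3],
              [2.0, 4.0],
              [5.5, 7.1]])
result = zi_minmax_sketch(x)
```

With the default cutoff, the middle value of the first column scales to roughly 0.35 and is therefore zeroed, while the middle value of the second column scales to roughly 0.54 and is kept.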

Parameters:
  • X (array-like) – Data to be scaled.

  • cutoff (float (default: 0.5)) – Cutoff for zero-inflation, applied after scaling: scaled values less than this are set to 0.

Return type:

csr_matrix

Returns:

X – The scaled data matrix.

Examples

>>> import numpy as np
>>> from liana.utils import zi_minmax
>>> x = np.array([[0.1, 0.3],
...               [2.0, 4.0],
...               [5.5, 7.1]])
>>> print(zi_minmax(x))
<Compressed Sparse Row sparse matrix of dtype 'float64'
        with 6 stored elements and shape (3, 2)>
Coords        Values
(0, 0)        0.0
(0, 1)        0.0
(1, 0)        0.0
(1, 1)        0.5441176470588236
(2, 0)        1.0
(2, 1)        1.0
>>> print(zi_minmax(x, cutoff=0.1))
<Compressed Sparse Row sparse matrix of dtype 'float64'
        with 6 stored elements and shape (3, 2)>
Coords        Values
(0, 0)        0.0
(0, 1)        0.0
(1, 0)        0.3518518518518518
(1, 1)        0.5441176470588236
(2, 0)        1.0
(2, 1)        1.0
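The outputs above report 6 stored elements even though several values are 0.0, meaning the zeros are stored explicitly. The sketch below rebuilds a matrix with the same values as the first output directly with SciPy (it does not call `zi_minmax`), then shows how such a result might be converted to a dense array or compacted. The variable names are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Rebuild the first example's output by hand, keeping the explicit zeros.
data = np.array([0.0, 0.0, 0.0, 0.5441176470588236, 1.0, 1.0])
indices = np.array([0, 1, 0, 1, 0, 1])  # column index of each stored value
indptr = np.array([0, 2, 4, 6])         # row boundaries into data/indices
scaled = csr_matrix((data, indices, indptr), shape=(3, 2))

stored_before = scaled.nnz   # counts stored values, explicit zeros included
dense = scaled.toarray()     # dense NumPy view of the same values

scaled.eliminate_zeros()     # drop explicitly stored zeros in place
stored_after = scaled.nnz
```

Dropping the explicit zeros leaves the dense values unchanged and only reduces what the sparse structure stores.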