The tool (tl) accessor

The tool accessor contains methods that use third-party tools for operations such as segmentation and cell type prediction.

class spatialproteomics.tl.tool.ToolAccessor(xarray_obj)

The tool accessor enables the application of external tools such as StarDist or Astir.

astir(marker_dict: dict, key: str = '_intensity', threshold: float = 0, seed: int = 42, learning_rate: float = 0.001, batch_size: float = 64, n_init: int = 5, n_init_epochs: int = 5, max_epochs: int = 500, cell_id_col: str = 'cell_id', cell_type_col: str = 'cell_type', **kwargs)

This method predicts cell types from an expression matrix using the Astir algorithm.

Parameters:
  • marker_dict (dict) – Dictionary mapping markers to cell types. Can also include cell states. Example: {"cell_type": {"B": ["PAX5"], "T": ["CD3"], "Myeloid": ["CD11b"]}}

  • key (str, optional) – Layer to use as expression matrix. Defaults to “_intensity”.

  • threshold (float, optional) – Certainty threshold for astir to assign a cell type. Defaults to 0.

  • seed (int, optional) – Random seed. Defaults to 42.

  • learning_rate (float, optional) – Learning rate. Defaults to 0.001.

  • batch_size (float, optional) – Batch size. Defaults to 64.

  • n_init (int, optional) – Number of initializations. Defaults to 5.

  • n_init_epochs (int, optional) – Number of epochs for each initialization. Defaults to 5.

  • max_epochs (int, optional) – Maximum number of epochs. Defaults to 500.

  • cell_id_col (str, optional) – Column name for cell IDs. Defaults to “cell_id”.

  • cell_type_col (str, optional) – Column name for cell types. Defaults to “cell_type”.

Raises:

ValueError – If no expression matrix was present or the image is not of type uint8.

Returns:

A DataArray with the assigned cell types.

Return type:

DataArray
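
The following is a minimal sketch of how a marker dictionary might be checked and passed to astir. It assumes the accessor is reached as ds.tl on an xarray Dataset once spatialproteomics has been imported, and that the marker names match channels in the expression matrix. The helpers check_marker_dict and predict_cell_types are hypothetical and not part of the library.

```python
def check_marker_dict(marker_dict):
    """Hypothetical helper: validate the nested layout shown in the example.

    Returns the sorted list of all markers referenced by the dictionary.
    """
    markers = set()
    for group_name, group in marker_dict.items():   # e.g. "cell_type"
        if not isinstance(group, dict):
            raise ValueError(f"{group_name!r} must map to a dict of label -> markers")
        for label, marker_list in group.items():
            if not isinstance(marker_list, list) or not marker_list:
                raise ValueError(f"{label!r} must map to a non-empty list of markers")
            markers.update(marker_list)
    return sorted(markers)


marker_dict = {"cell_type": {"B": ["PAX5"], "T": ["CD3"], "Myeloid": ["CD11b"]}}
required_markers = check_marker_dict(marker_dict)


def predict_cell_types(ds, marker_dict, threshold=0.5):
    """Assumes `ds` is a spatialproteomics-enabled xarray Dataset that already
    holds an expression matrix under the default '_intensity' key."""
    check_marker_dict(marker_dict)
    return ds.tl.astir(marker_dict, key="_intensity", threshold=threshold, seed=42)
```

Checking the dictionary before the call is a design choice of this sketch: a malformed entry fails early, before any model training starts.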

cellpose(channel: str | None = None, key_added: str | None = '_cellpose_segmentation', diameter: float = 0, channel_settings: list = [0, 0], num_iterations: int = 2000, cellprob_threshold: float = 0.0, flow_threshold: float = 0.4, batch_size: int = 8, gpu: bool = True, model_type: str = 'cyto3', postprocess_func: ~typing.Callable = <function ToolAccessor.<lambda>>, return_diameters: bool = False)

Segment cells using Cellpose. Adds a layer to the spatialproteomics object with dimensions (X, Y) or (C, X, Y), depending on whether the channel argument is specified.

Parameters:
  • channel (str, optional) – Channel to use for segmentation. If None, all channels are used.

  • key_added (str, optional) – Key to assign to the segmentation results.

  • diameter (float, optional) – Expected cell diameter in pixels.

  • channel_settings (List[int], optional) – Channels for Cellpose to use for segmentation. If [0, 0], independent segmentation is performed on all channels. For any other value (e.g. [1, 2]), joint segmentation is attempted.

  • num_iterations (int, optional) – Maximum number of iterations for segmentation.

  • cellprob_threshold (float, optional) – Threshold for cell probability.

  • flow_threshold (float, optional) – Threshold for flow.

  • batch_size (int, optional) – Batch size for segmentation.

  • gpu (bool, optional) – Whether to use GPU for segmentation.

  • model_type (str, optional) – Type of Cellpose model to use.

  • postprocess_func (Callable, optional) – Function to apply to the segmentation masks after prediction.

  • return_diameters (bool, optional) – Whether to return the cell diameters.

Returns:

Dataset containing original data and segmentation mask.

Return type:

xr.Dataset

Notes

This method requires the ‘cellpose’ package to be installed.
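
As a rough illustration of the channel_settings rule described above, the sketch below classifies a setting as independent or joint and wraps a Cellpose call. It assumes the accessor is reached as ds.tl on an xarray Dataset once spatialproteomics has been imported. The helpers cellpose_mode and segment_with_cellpose are hypothetical names introduced here.

```python
def cellpose_mode(channel_settings):
    """Mirror the documented rule: [0, 0] means every channel is segmented
    independently; any other pair requests joint segmentation."""
    return "independent" if list(channel_settings) == [0, 0] else "joint"


def segment_with_cellpose(ds, channel=None, diameter=0, use_gpu=False):
    """Assumes `ds` is a spatialproteomics-enabled xarray Dataset holding an
    image and that the optional 'cellpose' package is installed."""
    return ds.tl.cellpose(
        channel=channel,                      # None: use all channels
        key_added="_cellpose_segmentation",   # default layer name
        diameter=diameter,
        channel_settings=[0, 0],              # independent segmentation
        gpu=use_gpu,
        model_type="cyto3",
    )
```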

convert_to_anndata(expression_matrix_key: str = '_intensity', obs_key: str = '_obs', additional_layers: dict | None = None, additional_uns: dict | None = None)

Convert the spatialproteomics object to an anndata.AnnData object. The resulting AnnData object does not store the original image or segmentation mask.

Parameters:
  • expression_matrix_key (str, optional) – The key of the expression matrix in the spatialproteomics object. Default is ‘_intensity’.

  • obs_key (str, optional) – The key of the observation data in the spatialproteomics object. Default is ‘_obs’.

  • additional_layers (dict, optional) – Additional layers to include in the anndata.AnnData object. The keys are the names of the layers and the values are the corresponding keys in the spatialproteomics object.

  • additional_uns (dict, optional) – Additional uns data to include in the anndata.AnnData object. The keys are the names of the uns data and the values are the corresponding keys in the spatialproteomics object.

Returns:

adata – The converted anndata.AnnData object.

Return type:

anndata.AnnData

Raises:

AssertionError – If the expression matrix key or additional layers are not found in the spatialproteomics object.

Notes

  • The expression matrix is extracted from the spatialproteomics object using the provided expression matrix key.

  • If additional layers are specified, they are extracted from the spatialproteomics object and added to the anndata.AnnData object.

  • If obs_key is present in the spatialproteomics object, it is used to create the obs DataFrame of the anndata.AnnData object.

  • If additional_uns is specified, the corresponding uns data is extracted from the spatialproteomics object and added to the anndata.AnnData object.
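
The sketch below shows one way the key lookup behind convert_to_anndata could be checked before the call, so that missing layers are reported by name. It assumes the object is an xarray Dataset whose data variables are the layers. The layer name "_arcsinh" and the helpers missing_keys and to_anndata are hypothetical.

```python
def missing_keys(available_keys, expression_matrix_key="_intensity",
                 additional_layers=None):
    """Hypothetical pre-check: list requested keys that are absent, i.e. the
    situation in which convert_to_anndata is documented to raise AssertionError."""
    requested = [expression_matrix_key, *(additional_layers or {}).values()]
    return [key for key in requested if key not in available_keys]


def to_anndata(ds, additional_layers=None):
    """Assumes `ds` is a spatialproteomics-enabled xarray Dataset, so that
    iterating over it yields the names of its layers."""
    absent = missing_keys(set(ds), additional_layers=additional_layers)
    if absent:
        raise KeyError(f"layers not found: {absent}")
    return ds.tl.convert_to_anndata(
        expression_matrix_key="_intensity",
        obs_key="_obs",
        additional_layers=additional_layers,  # AnnData layer name -> object key
    )
```

With this layout, to_anndata(ds, additional_layers={"arcsinh": "_arcsinh"}) would request an AnnData layer named "arcsinh" filled from the hypothetical "_arcsinh" layer.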

convert_to_spatialdata(image_key: str = '_image', segmentation_key: str = '_segmentation', **kwargs)

Convert the spatialproteomics object to a spatialdata object.

Parameters:
  • image_key (str) – The key of the image data in the object. Defaults to Layers.IMAGE (‘_image’).

  • segmentation_key (str) – The key of the segmentation data in the object. Defaults to Layers.SEGMENTATION (‘_segmentation’).

  • **kwargs – Additional keyword arguments to be passed to the convert_to_anndata method.

Returns:

spatial_data_object – The converted spatialdata object.

Return type:

spatialdata.SpatialData
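
Because the extra keyword arguments are forwarded to convert_to_anndata, a misspelled name would only surface deep inside the call. The sketch below filters them against the parameters documented for convert_to_anndata. It assumes the accessor is reached as ds.tl. The names ANNDATA_KWARGS, unknown_kwargs and to_spatialdata are hypothetical.

```python
# Parameter names documented for convert_to_anndata in this reference.
ANNDATA_KWARGS = {
    "expression_matrix_key",
    "obs_key",
    "additional_layers",
    "additional_uns",
}


def unknown_kwargs(kwargs):
    """Return the sorted keyword names that convert_to_anndata does not document."""
    return sorted(set(kwargs) - ANNDATA_KWARGS)


def to_spatialdata(ds, **anndata_kwargs):
    """Assumes `ds` is a spatialproteomics-enabled xarray Dataset and that the
    optional 'spatialdata' package is installed."""
    unknown = unknown_kwargs(anndata_kwargs)
    if unknown:
        raise TypeError(f"unexpected keyword arguments: {unknown}")
    return ds.tl.convert_to_spatialdata(
        image_key="_image",
        segmentation_key="_segmentation",
        **anndata_kwargs,                     # forwarded to convert_to_anndata
    )
```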

mesmer(key_added: str | None = '_mesmer_segmentation', postprocess_func: ~typing.Callable = <function ToolAccessor.<lambda>>, **kwargs)

Segment cells using Mesmer. Adds a layer to the spatialproteomics object with dimension (C, X, Y). Assumes there are exactly two channels, in the order (nuclear, membrane).

Parameters:
  • key_added (str, optional) – Key to assign to the segmentation results.

  • postprocess_func (Callable, optional) – Function to apply to the segmentation masks after prediction.

Returns:

Dataset containing original data and segmentation mask.

Return type:

xr.Dataset

Notes

This method requires the ‘mesmer’ package to be installed.
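
Since Mesmer expects exactly two channels in (nuclear, membrane) order, it can help to fix that order explicitly before segmenting. The sketch below is one way to do so. The channel names are illustrative, the helpers mesmer_channel_order and segment_with_mesmer are hypothetical, and reducing the dataset to the two channels is assumed to happen elsewhere.

```python
def mesmer_channel_order(available_channels, nuclear, membrane):
    """Return the two channel names in the (nuclear, membrane) order that the
    mesmer method assumes, after checking that both exist."""
    for name in (nuclear, membrane):
        if name not in available_channels:
            raise ValueError(f"channel {name!r} not found")
    if nuclear == membrane:
        raise ValueError("nuclear and membrane channels must differ")
    return [nuclear, membrane]


def segment_with_mesmer(ds_two_channels):
    """Assumes the dataset was already reduced to exactly two channels ordered
    (nuclear, membrane) and that the Mesmer dependency is installed."""
    return ds_two_channels.tl.mesmer(key_added="_mesmer_segmentation")
```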

stardist(channel: str | None = None, key_added: str | None = '_stardist_segmentation', scale: float = 3, n_tiles: int = 12, normalize: bool = True, predict_big: bool = False, postprocess_func: ~typing.Callable = <function ToolAccessor.<lambda>>, **kwargs) → Dataset

Apply StarDist algorithm to perform instance segmentation on the nuclear image.

Parameters:
  • scale (float, optional) – Scaling factor for the StarDist model (default is 3).

  • n_tiles (int, optional) – Number of tiles to split the image into for prediction (default is 12).

  • normalize (bool, optional) – Flag indicating whether to normalize the nuclear image (default is True).

  • channel (str, optional) – Name of the nuclear channel in the image (default is None).

  • predict_big (bool, optional) – Flag indicating whether to use the ‘predict_instances_big’ method for large images (default is False).

  • postprocess_func (Callable, optional) – Function to apply to the segmentation masks after prediction (default is lambda x: x).

  • **kwargs (dict, optional) – Additional keyword arguments to be passed to the StarDist prediction method.

Returns:

obj – Xarray dataset containing the segmentation mask and centroids.

Return type:

xr.Dataset

Raises:

ValueError – If the object already contains a segmentation mask.
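
A short sketch of how the StarDist options might be assembled follows. It assumes the accessor is reached as ds.tl and that the nuclear channel is named "DAPI". The pixel threshold for switching to predict_big is an arbitrary value chosen for illustration, and stardist_options and segment_nuclei are hypothetical helpers.

```python
def stardist_options(height, width, big_image_pixels=10_000 * 10_000):
    """Hypothetical helper: start from the documented defaults and enable
    predict_big once the image reaches an arbitrary, illustrative size."""
    return {
        "scale": 3,
        "n_tiles": 12,
        "normalize": True,
        "predict_big": height * width >= big_image_pixels,
    }


def segment_nuclei(ds, height, width, channel="DAPI"):
    """Assumes `ds` is a spatialproteomics-enabled xarray Dataset without an
    existing segmentation mask; otherwise a ValueError is documented."""
    return ds.tl.stardist(
        channel=channel,
        key_added="_stardist_segmentation",
        **stardist_options(height, width),
    )
```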