Plotting¶

Spatialproteomics has a variety of plotting functions, enabling the plotting of intensities, segmentation masks, and predicted cell type labels.

[1]:

%reload_ext autoreload
%autoreload 2

import spatialproteomics
import matplotlib.pyplot as plt
import xarray as xr

[2]:

# loading in a data set and performing some formatting for convenience
ds = xr.open_zarr("../../data/BNHL_166_4_I2_LK_2.zarr")

[3]:

# having a look at the dataset
ds

[3]:

<xarray.Dataset>
Dimensions:         (channels: 56, y: 3000, x: 3000, labels: 8, la_props: 2,
                     cells: 12560, features: 3)
Coordinates:
  * cells           (cells) int64 1 2 3 4 5 6 ... 12556 12557 12558 12559 12560
  * channels        (channels) <U11 'DAPI' 'Helios' 'CD10' ... 'CD79a' 'Ki-67'
  * features        (features) <U10 '_labels' 'centroid-0' 'centroid-1'
  * la_props        (la_props) <U6 '_color' '_name'
  * labels          (labels) int64 1 2 3 4 5 6 7 8
  * x               (x) int64 0 1 2 3 4 5 6 ... 2994 2995 2996 2997 2998 2999
  * y               (y) int64 0 1 2 3 4 5 6 ... 2994 2995 2996 2997 2998 2999
Data variables:
    _image          (channels, y, x) uint8 dask.array<chunksize=(7, 375, 750), meta=np.ndarray>
    _la_properties  (labels, la_props) <U20 dask.array<chunksize=(8, 2), meta=np.ndarray>
    _obs            (cells, features) float64 dask.array<chunksize=(6280, 3), meta=np.ndarray>
    _segmentation   (y, x) int64 dask.array<chunksize=(375, 375), meta=np.ndarray>

Plotting Channel Intensities¶

The slicing capabilities of xarray make data visualization very easy. The general pattern is here to first select the channels, image regions and labels of interest, and then to use the plotting functions that are available via the .pl accessor. Images can be shown by using the pl.show() method.

[4]:

# selecting the DAPI channel and showing the result with pl.show()
_ = ds.pp["DAPI"].pl.show()

If we want different colors, we can specify those using the pl.colorize method.

[5]:

_ = ds.pp["DAPI"].pl.colorize(["orange"]).pl.show()

The pl.colorize function allows us to make image overlays quickly by first selecting the channels of interest. We illustrate this by first selecting the CD4 and CD8 channels and then assigning the colors.

[6]:

_ = ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000].pl.colorize(colors=["orange", "deepskyblue"]).pl.show()

All plotting functions accept matplotlib.Axes via the ax argument. This enables to easily modify plots using the normal matplotlib API. For example, let’s say we want to plot multiple channels in different subplots.

[7]:

# creating two subplots
fig, ax = plt.subplots(1, 2, figsize=(16, 8))

# populating the first subplot
# note how we set ax=ax[0] to tell the plotting function to use the first subplot
_ = ds.pp["DAPI", 1000:1500, 500:1000].pl.colorize(colors=["red"]).pl.show(ax=ax[0])
# populating the second subplot
_ = ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000].pl.colorize(colors=["orange", "deepskyblue"]).pl.show(ax=ax[1])

By default, a legend showing the colors corresponding to the channels is shown. You can disable this with the legend_image argument. In addition, matplotlib functions can be used to further alter the plot, e. g. to remove ticklabels. The code snipped below also highlights the pl.autocrop() function, which can be useful to remove empty space around a tissue microarray.

[8]:

fig = plt.figure(figsize=(8, 8))
_ = ds.pp["DAPI"].pl.autocrop().pl.colorize(colors=["red"]).pl.show(legend_image=False)
plt.xticks([])
plt.yticks([])
plt.tight_layout()

Rendering Segmentation Masks¶

If your spatialproteomics object contains a segmentation mask, you can simple plot it by setting render_segmentation to true.

[9]:

fig = plt.figure(figsize=(8, 8))
_ = (
    ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000]
    .pl.colorize(colors=["orange", "deepskyblue"])
    .pl.show(render_segmentation=True)
)

Rendering Obs¶

Sometimes, you might want to visualize specific features, such as the cell size. With spatialproteomics, you can do this while retaining the cell shapes obtained by the segmentation.

[10]:

fig = plt.figure(figsize=(8, 8))
_ = ds.pp.add_observations("area").pl.autocrop().pl.render_obs(feature="area", cmap="plasma").pl.imshow(legend_obs=True)

[11]:

fig = plt.figure(figsize=(8, 8))
_ = ds.pl.autocrop().pl.render_obs(feature="centroid-0", cmap="plasma").pl.imshow(legend_obs=True)

Rendering Labels¶

Similar to render_segmentation, pl.show() also has an argument called render_labels, which can be used to render cell type labels. We can simultaneously deactivate the rendering of the intensities to get an image of only the labels.

[12]:

fig = plt.figure(figsize=(8, 8))
_ = ds.pp[1000:1500, 500:1000].pl.show(render_image=False, render_labels=True)

We can combine this with rendering the segmentation masks for a clearer picture of where exactly the cell boundaries lie.

[13]:

fig = plt.figure(figsize=(8, 8))
_ = ds.pp[1000:1500, 500:1000].pl.show(render_image=False, render_segmentation=True, render_labels=True)

Using the .la[] accessor we can also subset cell labels, rather than displaying all of them. We can also use the la.set_label_colors() to set custom colors for our labels.

[14]:

fig, ax = plt.subplots(1, 2, figsize=(16, 8))
_ = ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000].pl.colorize(["orange", "deepskyblue"]).pl.show(ax=ax[0])
_ = (
    ds.pp[1000:1500, 500:1000]
    .la["T (CD3)", "B (PAX5)"]
    .la.set_label_colors(["T (CD3)", "B (PAX5)"], ["orange", "deepskyblue"])
    .pl.show(render_image=False, render_segmentation=True, render_labels=True, ax=ax[1])
)

Rendering Neighborhoods¶

Lastly, we can also render neighborhoods. How to obtain such neighborhoods is shown in the Neighborhoods notebook.

[15]:

# obtaining an object which contains neighborhoods
ds_neighborhood = xr.open_zarr("../../data/sample_1_with_neighborhoods.zarr")

[16]:

# plotting the neighborhoods
fig = plt.figure(figsize=(8, 8))
_ = ds_neighborhood.pl.show(render_image=False, render_neighborhoods=True)

Annotation of Regions or Cells¶

We can also annotate regions or cells within the plots.

[17]:

# highlighting a region with pl.add_box()
fig = plt.figure(figsize=(8, 8))
_ = (
    ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000]
    .pl.colorize(colors=["orange", "deepskyblue"])
    .pl.add_box([1230, 1480], [640, 840])
    .pl.show()
)

[18]:

# selecting all B cells and annotating them with pl.annotate()
fig = plt.figure(figsize=(8, 8))
_ = (
    ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000]
    .la["B (PAX5)"]
    .pl.colorize(colors=["orange", "deepskyblue"])
    .la.set_label_colors(["B (PAX5)"], ["deepskyblue"])
    .pl.annotate()
    .pl.show(render_labels=True, label_kwargs={"alpha": 0.0})
)

Expert Plotting¶

To gain more control over your plots, you can use the functions colorize, render_segmentation, and render_labels in combination with imshow (instead of show, which provides a high-level wrapper around these). Those methods allow for more nuanced adjustments of the final plot. Here are a few examples, for more details please consult the documentation of the corresponding methods.

[19]:

_ = ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000].pl.colorize(colors=["orange", "deepskyblue"]).pl.imshow()

[20]:

_ = ds.pl.render_segmentation().pl.imshow()

[21]:

_ = ds.pl.render_labels().pl.imshow()

[22]:

_ = (
    ds.pp[["CD3", "PAX5"], 1000:1500, 500:1000]
    .pl.colorize(colors=["orange", "deepskyblue"])
    .pl.render_segmentation()
    .pl.imshow()
)

You can also use scatterplots instad of always plotting the images.

[23]:

# creating a scatter plot of the labels
_ = ds.pl.scatter_labels(legend=False)

[24]:

# quickly adding a quantification
ds = ds.pp.add_quantification(func="intensity_mean").pp.transform_expression_matrix(method="arcsinh")

[25]:

# creating a scatter plot of any categorical variable in obs
# as an example, we are adding a binarization column to the obs
ds_with_binarization_cd4 = ds.la.threshold_labels({"CD4": 2.5}, layer_key="_intensity")

fig, ax = plt.subplots(1, 2, figsize=(10, 20))
_ = ds_with_binarization_cd4.pp["CD4"].pl.colorize("blue").pl.show(ax=ax[0])
_ = ds_with_binarization_cd4.pl.scatter(feature="CD4_binarized", size=0.1, palette={0: "gray", 1: "blue"}, ax=ax[1])

[26]:

# creating a scatterplot of continuous variables
# adding area of each cell and plotting it
ds_tmp = ds.pp.add_observations("area")
_ = ds_tmp.pl.scatter(feature="area", scatter_kws={"cmap": "YlOrRd"})

If you have multiple segmentation masks, you can show all of them at the same time like this.

[27]:

_ = ds.pp[["DAPI", "CD11c"]].pp[2000:2500, 2000:2500].pl.colorize(["orange", "deepskyblue"]).pl.show()

[29]:

ds_multisegmentation = ds.pp[["DAPI", "CD11c"]].pp[2000:2500, 2000:2500].tl.cellpose(key_added="_cellpose_segmentation")

Neither TORCH CUDA nor MPS version not installed/working.
>>>> using CPU
>>>> using CPU
>> cyto3 << model set to be used
>>>> loading model /home/meyerben/.cellpose/models/cyto3
/home/meyerben/meyerben/.conda/envs/tmp_env_2/lib/python3.10/site-packages/cellpose/resnet_torch.py:280: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  state_dict = torch.load(filename, map_location=torch.device("cpu"))
>>>> model diam_mean =  30.000 (ROIs rescaled to this size during training)
Performing independent segmentation on all markers. If you want to perform joint segmentation, please set the channel_settings argument appropriately.
channels set to [0, 0]
~~~ ESTIMATING CELL DIAMETER(S) ~~~
estimated cell diameter(s) in 6.95 sec
>>> diameter(s) =
[ 12.36 ]
~~~ FINDING MASKS ~~~
>>>> TOTAL TIME 15.91 sec
channels set to [0, 0]
~~~ ESTIMATING CELL DIAMETER(S) ~~~
estimated cell diameter(s) in 7.66 sec
>>> diameter(s) =
[ 19.38 ]
~~~ FINDING MASKS ~~~
>>>> TOTAL TIME 18.80 sec

[30]:

plt.figure(figsize=(16, 16))
_ = ds_multisegmentation.pl.colorize(["gold", "deepskyblue"]).pl.show(
    render_segmentation=True, segmentation_kwargs={"layer_key": "_cellpose_segmentation", "colors": ["white", "red"]}
)

We can also plot histograms of any table, for example the observations (e. g. cell area) or the marker distribution.

[31]:

# quick check what our object looks like
ds_tmp

[31]:

<xarray.Dataset>
Dimensions:         (cells: 12560, channels: 56, features: 4, la_props: 2,
                     labels: 8, x: 3000, y: 3000)
Coordinates:
  * cells           (cells) int64 1 2 3 4 5 6 ... 12556 12557 12558 12559 12560
  * channels        (channels) <U11 'DAPI' 'Helios' 'CD10' ... 'CD79a' 'Ki-67'
  * features        (features) <U10 '_labels' 'area' 'centroid-0' 'centroid-1'
  * la_props        (la_props) <U6 '_color' '_name'
  * labels          (labels) int64 1 2 3 4 5 6 7 8
  * x               (x) int64 0 1 2 3 4 5 6 ... 2994 2995 2996 2997 2998 2999
  * y               (y) int64 0 1 2 3 4 5 6 ... 2994 2995 2996 2997 2998 2999
Data variables:
    _image          (channels, y, x) uint8 dask.array<chunksize=(7, 375, 750), meta=np.ndarray>
    _la_properties  (labels, la_props) <U20 dask.array<chunksize=(8, 2), meta=np.ndarray>
    _obs            (cells, features) float64 dask.array<chunksize=(6280, 2), meta=np.ndarray>
    _segmentation   (y, x) int64 dask.array<chunksize=(375, 375), meta=np.ndarray>
    _intensity      (cells, channels) float64 3.111 0.0 1.391 ... 1.324 0.4174

[32]:

# plotting a histogram for a feature in the obs table
df = ds_tmp.pp.get_layer_as_df("_obs")
_ = plt.hist(df["area"], bins=50)

[33]:

# plotting a histogram for a feature in the expression matrix
df = ds_tmp.pp.get_layer_as_df("_intensity")
_ = plt.hist(df["DAPI"], bins=50)

Sidenote: you can also create a Vornoi diagram of your data by simply growing segmentation masks by a specified distance.

[34]:

plt.figure(figsize=(15, 15))
_ = ds_tmp.pp["DAPI"].pp.grow_cells(iterations=20).pl.autocrop().pl.colorize("gold").pl.show(render_segmentation=True)

Mask growing requires recalculation of the observations. All features other than the centroids will be removed and should be recalculated with pp.add_observations().

For visualization, it is also possible to adjust the background color of the image. However, caution is advised when plotting fluorescence intensities on light backgrounds, as the background colors get mixed with the actual fluorescence values.

[35]:

fig, ax = plt.subplots(1, 2, figsize=(20, 10))
# you can pass the background either in the pl.colorize() method or in pl.show()
_ = ds_tmp.pp["DAPI"].pl.autocrop().pl.colorize("gold", background="#2e2e2e").pl.show(ax=ax[0])
_ = ds_tmp.pp["DAPI"].pl.autocrop().pl.show(render_image=False, render_labels=True, background="lightgray", ax=ax[1])

Interactive Removal of Artifacts Image Processing