Preprocessing Transforms Gallery

In PathML, preprocessing pipelines are created by composing modular Transforms.

This tutorial gives an overview of the PathML preprocessing Transforms, with examples.

We will divide Transforms into three primary categories, depending on their function:

  1. Transforms that modify an image

    • Gaussian Blur

    • Median Blur

    • Box Blur

    • Stain Normalization

    • Superpixel Interpolation

  2. Transforms that create a mask

    • Nucleus Detection

    • Binary Threshold

  3. Transforms that modify a mask

    • Morphological Closing

    • Morphological Opening

    • Foreground Detection

    • Tissue Detection

[2]:
import matplotlib.pyplot as plt
import copy

from pathml.core import HESlide, Tile, types
from pathml.utils import plot_mask, RGB_to_GREY
from pathml.preprocessing import (
    BoxBlur, GaussianBlur, MedianBlur,
    NucleusDetectionHE, StainNormalizationHE, SuperpixelInterpolation,
    ForegroundDetection, TissueDetectionHE, BinaryThreshold,
    MorphClose, MorphOpen
)

fontsize = 14

Note that a Transform operates on Tile objects. We must first load a whole-slide image, extract a smaller region, and create a Tile:

[3]:
wsi = HESlide("./../data/CMU-1-Small-Region.svs")
region = wsi.slide.extract_region(location = (900, 800), size = (500, 500))

def smalltile():
    # convenience function to create a new tile
    return Tile(region, coords = (0, 0), name = "testregion", slide_type = types.HE)

Transforms that modify an image

Blurring Transforms

We’ll start with the three blurring transforms: GaussianBlur, MedianBlur, and BoxBlur.

Blurriness can be controlled with the kernel_size parameter. For all three blurring transforms, a larger kernel yields a more blurred result:

[4]:
blurs = ["Original Image", GaussianBlur, MedianBlur, BoxBlur]
blur_name = ["Original Image", "GaussianBlur", "MedianBlur", "BoxBlur"]
k_size = [5, 11, 21]
fig, axarr = plt.subplots(nrows=4, ncols=3, figsize=(7.5, 10))
for i, blur in enumerate(blurs):
    for j, kernel_size in enumerate(k_size):
        tile = smalltile()
        if blur != "Original Image":
            b = blur(kernel_size = kernel_size)
            b.apply(tile)
        ax = axarr[i, j]
        ax.imshow(tile.image)
        if i == 0:
            ax.set_title(f"Kernel_size = {kernel_size}", fontsize=fontsize)
        if j == 0:
            ax.set_ylabel(blur_name[i], fontsize = fontsize)
for a in axarr.ravel():
    a.set_xticks([])
    a.set_yticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_5_0.png
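
To make the kernel-size effect concrete, here is a minimal, dependency-free sketch of a one-dimensional box blur. It is not PathML's implementation, which operates on 2-D images; the function name `box_blur_1d` and the edge handling (clamping indices to the nearest valid sample) are assumptions made for illustration.

```python
def box_blur_1d(signal, kernel_size):
    """Replace each sample with the mean of a window of `kernel_size` samples.

    Illustrative only: edges are handled by clamping indices to the valid range.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    last = len(signal) - 1
    blurred = []
    for i in range(len(signal)):
        # Gather the window centred on i, clamping out-of-range indices
        window = [signal[min(max(i + k, 0), last)] for k in range(-half, half + 1)]
        blurred.append(sum(window) / kernel_size)
    return blurred


# A single bright sample on a dark background
spike = [0, 0, 0, 0, 90, 0, 0, 0, 0]
narrow = box_blur_1d(spike, kernel_size=3)
wide = box_blur_1d(spike, kernel_size=9)
print(max(narrow), max(wide))  # prints 30.0 10.0
```

The wider kernel spreads the same bright sample over more neighbors, so the peak is lower and the result looks more blurred.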

Superpixel Interpolation

Superpixel interpolation is a method for grouping together nearby similar pixels to form larger “superpixels.” The SuperpixelInterpolation Transform divides the input image into superpixels using the SLIC algorithm, then fills each superpixel with its average color. The region_size parameter controls how big the superpixels are:

[5]:
region_sizes = ["original", 10, 20, 30]
fig, axarr = plt.subplots(nrows=1, ncols=4, figsize=(10, 10))
for i, region_size in enumerate(region_sizes):
    tile = smalltile()
    if region_size == "original":
        axarr[i].set_title("Original Image", fontsize = fontsize)
    else:
        t = SuperpixelInterpolation(region_size = region_size)
        t.apply(tile)
        axarr[i].set_title(f"Region Size = {region_size}", fontsize = fontsize)
    axarr[i].imshow(tile.image)
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_7_0.png
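
The interpolation step on its own can be sketched in a few lines. The example below assumes a label map is already available (in practice it would come from a segmentation algorithm such as SLIC, which is not implemented here) and works on a grayscale image stored as nested lists. The function name `fill_with_region_means` is hypothetical and not part of PathML.

```python
from collections import defaultdict


def fill_with_region_means(image, labels):
    """Replace every pixel with the mean value of the superpixel it belongs to.

    `image` and `labels` are equally sized 2-D lists; `labels` is assumed to
    come from a separate segmentation step such as SLIC.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for img_row, lab_row in zip(image, labels):
        for value, label in zip(img_row, lab_row):
            totals[label] += value
            counts[label] += 1
    means = {label: totals[label] / counts[label] for label in totals}
    return [[means[label] for label in lab_row] for lab_row in labels]


image = [[10, 20, 200, 220],
         [30, 40, 210, 230]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
smoothed = fill_with_region_means(image, labels)
print(smoothed[0])  # prints [25.0, 25.0, 215.0, 215.0]
```

Larger superpixels average over more pixels, which is why a larger region_size produces a coarser, more blocky image.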

Stain Normalization

H&E images are a combination of two stains: hematoxylin and eosin. Stain deconvolution methods attempt to estimate the relative contribution of each stain for each pixel. Each stain can then be pulled out into a separate image, and the deconvolved images can then be recombined to normalize the appearance of the image.

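The StainNormalizationHE Transform implements two algorithms for stain deconvolution, Macenko and Vahadane, selected with the stain_estimation_method parameter. The target parameter chooses between the normalized image and the separated hematoxylin or eosin image: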

[6]:
fig, axarr = plt.subplots(nrows=2, ncols=3, figsize=(10, 7.5))
fontsize = 18
for i, method in enumerate(["macenko", "vahadane"]):
    for j, target in enumerate(["normalize", "hematoxylin", "eosin"]):
        tile = smalltile()
        normalizer = StainNormalizationHE(target = target, stain_estimation_method = method)
        normalizer.apply(tile)
        ax = axarr[i, j]
        ax.imshow(tile.image)
        if j == 0:
            ax.set_ylabel(f"{method} method", fontsize=fontsize)
        if i == 0:
            ax.set_title(target, fontsize = fontsize)
for a in axarr.ravel():
    a.set_xticks([])
    a.set_yticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_9_0.png
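
The idea behind stain deconvolution can be sketched for a single pixel. Under the Beer-Lambert model, the optical density (OD) of a pixel is approximately a weighted sum of per-stain OD vectors, and the weights are the stain concentrations. The sketch below is not PathML's implementation: the stain vectors are illustrative values chosen for this example (the Macenko and Vahadane methods estimate them from the image), and all function names are hypothetical.

```python
import math

# Illustrative stain vectors (optical density per RGB channel). These are
# assumed values; real methods estimate them from the image itself.
HEMATOXYLIN = (0.65, 0.70, 0.29)
EOSIN = (0.07, 0.99, 0.11)


def rgb_to_od(rgb, background=255.0):
    """Beer-Lambert conversion from transmitted intensity to optical density."""
    return tuple(-math.log10(max(c, 1.0) / background) for c in rgb)


def od_to_rgb(od, background=255.0):
    """Inverse of rgb_to_od."""
    return tuple(background * 10 ** (-d) for d in od)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unmix(od):
    """Least-squares concentrations (c_h, c_e) such that od ~ c_h*H + c_e*E."""
    hh, he, ee = _dot(HEMATOXYLIN, HEMATOXYLIN), _dot(HEMATOXYLIN, EOSIN), _dot(EOSIN, EOSIN)
    ho, eo = _dot(HEMATOXYLIN, od), _dot(EOSIN, od)
    det = hh * ee - he * he  # determinant of the 2x2 normal equations
    return (ho * ee - eo * he) / det, (eo * hh - ho * he) / det


# Build a pixel from known concentrations, then recover them
true_h, true_e = 0.8, 0.3
od = tuple(true_h * h + true_e * e for h, e in zip(HEMATOXYLIN, EOSIN))
pixel = od_to_rgb(od)
c_h, c_e = unmix(rgb_to_od(pixel))

# The hematoxylin-only rendering of the same pixel
hematoxylin_only = od_to_rgb(tuple(c_h * h for h in HEMATOXYLIN))
```

Rendering only one stain's contribution gives the separated images, and recombining the concentrations with a set of reference stain vectors gives a normalized image.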

Transforms that create a mask

Binary Threshold

The BinaryThreshold transform creates a mask by classifying whether each pixel is above or below the given threshold. Note that you can supply a threshold parameter, or use Otsu’s method to automatically determine a threshold:

[7]:
thresholds = ["original", 50, 180, "otsu"]
fig, axarr = plt.subplots(nrows=1, ncols=len(thresholds), figsize=(12, 6))
for i, thresh in enumerate(thresholds):
    tile = smalltile()
    if thresh == "original":
        axarr[i].set_title("Original Image", fontsize = fontsize)
        axarr[i].imshow(tile.image)
    elif thresh == "otsu":
        t = BinaryThreshold(mask_name = "binary_threshold",
                            inverse = True, use_otsu = True)
        t.apply(tile)
        axarr[i].set_title("Otsu Threshold", fontsize = fontsize)
        axarr[i].imshow(tile.masks["binary_threshold"])
    else:
        t = BinaryThreshold(mask_name = "binary_threshold", threshold = thresh,
                            inverse = True, use_otsu = False)
        t.apply(tile)
        axarr[i].set_title(f"Threshold = {thresh}", fontsize = fontsize)
        axarr[i].imshow(tile.masks["binary_threshold"])
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_11_0.png
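
For readers unfamiliar with Otsu's method, the following is a minimal sketch of how it picks a threshold: it tries every candidate and keeps the one that maximizes the variance between the two resulting classes. This is a plain-Python illustration on a flat list of 8-bit intensities, not the routine PathML calls internally; `otsu_threshold` is a hypothetical name.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Values <= t form the lower class, values > t the upper class.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of values in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        # Proportional to the between-class variance
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Two well-separated intensity populations: dark and bright
pixels = [20, 22, 25, 30, 28] * 10 + [200, 210, 205, 220, 215] * 10
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]
```

The selected threshold falls between the dark and bright populations, so the mask separates them without a manually chosen value.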

Nucleus Detection

The NucleusDetectionHE transform employs a simple nucleus detection algorithm for H&E stained images. It works by first separating the hematoxylin channel, then interpolating with superpixels, and finally applying Otsu’s method for binary thresholding. This is an example of a compound Transform created by combining several other Transforms:

[8]:
tile = smalltile()
nucleus_detection = NucleusDetectionHE(mask_name = "detect_nuclei")
nucleus_detection.apply(tile)

fig, axarr = plt.subplots(nrows=1, ncols=2, figsize=(8, 8))
axarr[0].imshow(tile.image)
axarr[0].set_title("Original Image", fontsize=fontsize)
axarr[1].imshow(tile.masks["detect_nuclei"])
axarr[1].set_title("Nucleus Detection", fontsize=fontsize)
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_13_0.png

We can also overlay the results on the original image to see which regions were identified as being nuclei:

[9]:
fig, ax = plt.subplots(figsize=(7, 7))
plot_mask(im = tile.image, mask_in=tile.masks["detect_nuclei"], ax = ax)
plt.title("Overlay", fontsize = fontsize)
plt.axis('off')
plt.show()
../_images/examples_link_gallery_15_0.png

Transforms that modify a mask

For the following transforms, we’ll use a Tile containing a larger region extracted from the slide.

[10]:
bigregion = wsi.slide.extract_region(location = (800, 800), size = (1000, 1000))

def bigtile():
    # convenience function to create a new tile with a binary mask
    tile = Tile(bigregion, coords = (0, 0), name = "testregion", slide_type = types.HE)
    BinaryThreshold(mask_name = "binary_threshold", inverse=True,
                    threshold = 100, use_otsu = False).apply(tile)
    return tile

plt.imshow(bigregion)
plt.axis("off")
plt.show()
../_images/examples_link_gallery_17_0.png

Morphological Opening

Morphological opening reduces noise in a binary mask by first applying binary erosion n times, and then applying binary dilation n times. The effect is to remove small foreground objects from the mask. The strength of the effect is controlled by the n_iterations parameter:

[11]:
ns = ["Original Mask", 1, 3, 5]
fig, axarr = plt.subplots(nrows=1, ncols=4, figsize=(10, 10))
for i, n in enumerate(ns):
    tile = bigtile()
    if n == "Original Mask":
        axarr[i].set_title("Original Mask", fontsize = fontsize)
    else:
        t = MorphOpen(mask_name = "binary_threshold", n_iterations=n)
        t.apply(tile)
        axarr[i].set_title(f"n_iter = {n}", fontsize = fontsize)
    axarr[i].imshow(tile.masks["binary_threshold"])
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_19_0.png

Morphological Closing

Morphological closing is similar to opening, but in the opposite order: first, binary dilation is applied n times, then binary erosion is applied n times. The effect is to reduce noise in a binary mask by closing small holes in the foreground. The strength of the effect is controlled by the n_iterations parameter:

[12]:
ns = ["Original Mask", 1, 3, 5]
fig, axarr = plt.subplots(nrows=1, ncols=4, figsize=(10, 10))
for i, n in enumerate(ns):
    tile = bigtile()
    if n == "Original Mask":
        axarr[i].set_title("Original Mask", fontsize = fontsize)
    else:
        t = MorphClose(mask_name = "binary_threshold", n_iterations=n)
        t.apply(tile)
        axarr[i].set_title(f"n_iter = {n}", fontsize = fontsize)
    axarr[i].imshow(tile.masks["binary_threshold"])
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_21_0.png
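
Opening and closing can both be sketched from the two primitives they are built on. The example below uses a fixed 3x3 square structuring element and treats everything outside the mask as background; both choices are assumptions for illustration, as are the function names. It is not how PathML implements MorphOpen or MorphClose.

```python
def _pixel(mask, r, c):
    """Value at (r, c), treating out-of-bounds positions as background (0)."""
    inside = 0 <= r < len(mask) and 0 <= c < len(mask[0])
    return mask[r][c] if inside else 0


def _morph(mask, combine):
    """Apply `combine` (all -> erosion, any -> dilation) to each 3x3 neighborhood."""
    return [
        [
            int(combine(_pixel(mask, r + dr, c + dc)
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
            for c in range(len(mask[0]))
        ]
        for r in range(len(mask))
    ]


def erode(mask, n=1):
    for _ in range(n):
        mask = _morph(mask, all)
    return mask


def dilate(mask, n=1):
    for _ in range(n):
        mask = _morph(mask, any)
    return mask


def morph_open(mask, n=1):
    return dilate(erode(mask, n), n)   # erosion first, then dilation


def morph_close(mask, n=1):
    return erode(dilate(mask, n), n)   # dilation first, then erosion


# Opening removes the isolated speck at the top-left but keeps the 3x3 block
noisy = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
opened = morph_open(noisy)

# Closing fills the one-pixel hole in the middle of the block
holey = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
closed = morph_close(holey)
```

Increasing n removes larger specks (opening) or fills larger holes (closing), at the cost of also altering larger genuine structures.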

Foreground Detection

This transform operates on binary masks and identifies regions whose total area is greater than a specified threshold. It supports either including holes within foreground regions or excluding holes above a specified area threshold.

[13]:
tile = bigtile()
foreground_detector = ForegroundDetection(mask_name = "binary_threshold")
original_mask = tile.masks["binary_threshold"].copy()
foreground_detector.apply(tile)

fig, axarr = plt.subplots(nrows=1, ncols=2, figsize=(8, 8))
axarr[0].imshow(original_mask)
axarr[0].set_title("Original Mask", fontsize=fontsize)
axarr[1].imshow(tile.masks["binary_threshold"])
axarr[1].set_title("Detected Foreground", fontsize=fontsize)
for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_23_0.png
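
The area-filtering idea can be sketched with a connected-component search. This simplified example keeps only 4-connected foreground regions larger than an area threshold and ignores hole handling entirely. It is an illustration under those assumptions, not PathML's ForegroundDetection implementation, and the name `keep_large_regions` is hypothetical.

```python
from collections import deque


def keep_large_regions(mask, min_area):
    """Keep only 4-connected foreground regions with more than `min_area` pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected region
            region, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Copy the region to the output only if it is large enough
            if len(region) > min_area:
                for r, c in region:
                    out[r][c] = 1
    return out


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
foreground = keep_large_regions(mask, min_area=2)
```

Here the 2x2 block (area 4) survives, while the two single-pixel regions are discarded.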

Tissue Detection

TissueDetectionHE is a Transform for detecting regions of tissue in an H&E image. It applies a sequence of other Transforms: first a median blur, then binary thresholding, then morphological opening and closing, and finally foreground detection.

[14]:
tile = bigtile()

tissue_detector = TissueDetectionHE(mask_name = "tissue", outer_contours_only=True)
tissue_detector.apply(tile)

fig, axarr = plt.subplots(nrows=1, ncols=3, figsize=(8, 8))
axarr[0].imshow(tile.image)
axarr[0].set_title("Original Image", fontsize=fontsize)
axarr[1].imshow(tile.masks["tissue"])
axarr[1].set_title("Detected Tissue", fontsize=fontsize)
plot_mask(im = tile.image, mask_in=tile.masks["tissue"], ax = axarr[2])
axarr[2].set_title("Overlay", fontsize=fontsize)

for ax in axarr.ravel():
    ax.set_yticks([])
    ax.set_xticks([])
plt.tight_layout()
plt.show()
../_images/examples_link_gallery_25_0.png
© Copyright 2021, Dana-Farber Cancer Institute and Weill Cornell Medicine. Revision c71bc9bd.