Utilities API
Documentation for various utilities from all modules.
Logging Utils
- class pathml.PathMLLogger
Convenience methods for turning logging for PathML on or off and for configuring it. Note that the same can be achieved by interfacing with loguru directly.
Example:
from pathml import PathMLLogger as pml
# turn on logging for PathML
pml.enable()
# turn off logging for PathML
pml.disable()
# turn on logging and output logs to a file named 'logs.txt', with colorization enabled
pml.enable(sink="logs.txt", colorize=True)
- static disable()
Turn off logging for PathML
- static enable(sink=sys.stderr, level='DEBUG', fmt='PathML:{level}:{time:HH:mm:ss} | {module}:{function}:{line} | {message}', **kwargs)
Turn on and configure logging for PathML
- Parameters
sink (str or io._io.TextIOWrapper, optional) – Destination sink for log messages. Defaults to sys.stderr.
level (str) – Level of logs to capture. Defaults to 'DEBUG'.
fmt (str) – Formatting for the log message. Defaults to 'PathML:{level}:{time:HH:mm:ss} | {module}:{function}:{line} | {message}'.
**kwargs (dict, optional) – Additional options passed to configure the logger. See the loguru documentation.
Core Utils
- pathml.core.utils.readtupleh5(h5, key)
Read tuple from h5.
- Parameters
h5 (h5py.Dataset or h5py.Group) – h5 object that will be read from
key (str) – key where data to read is stored
- pathml.core.utils.writedataframeh5(h5, name, df)
Write dataframe as h5 dataset.
- Parameters
h5 (h5py.Dataset) – root of h5 object that df will be written into
name (str) – name of dataset to be created
df (pd.DataFrame) – dataframe to be written
- pathml.core.utils.writedicth5(h5, name, dic)
Write dict as attributes of h5py.Group.
- Parameters
h5 (h5py.Dataset) – root of h5 object that dic will be written into
name (str) – name of dataset to be created
dic (dict) – dict to be written
- pathml.core.utils.writestringh5(h5, name, st)
Write string as h5 attribute.
- Parameters
h5 (h5py.Dataset) – root of h5 object that st will be written into
name (str) – name of dataset to be created
st (str) – string to be written
- pathml.core.utils.writetupleh5(h5, name, tup)
Write tuple as h5 attribute.
- Parameters
h5 (h5py.Dataset) – root of h5 object that tup will be written into
name (str) – name of dataset to be created
tup (tuple) – tuple to be written
- pathml.core.utils.readcounts(h5)
Read counts using anndata h5py.
- Parameters
h5 (h5py.Dataset) – h5 object that will be read
- pathml.core.utils.writecounts(h5, counts)
Write counts using anndata h5py.
- Parameters
h5 (h5py.Dataset) – root of h5 object that counts will be written into
counts (anndata.AnnData) – anndata object to be written
Datasets Utils
- pathml.datasets.utils.pannuke_multiclass_mask_to_nucleus_mask(multiclass_mask)
Convert multiclass mask from PanNuke to a single channel nucleus mask. Assumes each pixel is assigned to one and only one class. Sums across channels, except the last mask channel which indicates background pixels in PanNuke. Operates on a single mask.
- Parameters
multiclass_mask (torch.Tensor) – Mask from PanNuke, in classification setting (i.e. nucleus_type_labels=True). Tensor of shape (6, 256, 256).
- Returns
Tensor of shape (256, 256).
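The channel-summing step described above can be illustrated with a minimal sketch. It uses NumPy instead of torch so that it runs without PyTorch, works on a toy 2x2 image rather than 256x256, and the function name multiclass_to_nucleus_mask_sketch is hypothetical; it is not the PathML implementation.

```python
import numpy as np


def multiclass_to_nucleus_mask_sketch(multiclass_mask):
    """Collapse a (n_channels, H, W) PanNuke-style mask into one (H, W) nucleus mask."""
    # Drop the last channel (background in PanNuke), then sum the class channels.
    return multiclass_mask[:-1].sum(axis=0)


# Toy 6-channel mask: every pixel belongs to exactly one channel.
mask = np.zeros((6, 2, 2), dtype=np.uint8)
mask[0, 0, 0] = 1  # pixel (0, 0) is a nucleus of class 0
mask[3, 0, 1] = 1  # pixel (0, 1) is a nucleus of class 3
mask[5, 1, :] = 1  # bottom row is background

nucleus_mask = multiclass_to_nucleus_mask_sketch(mask)
print(nucleus_mask)
```

Because each pixel is assumed to belong to exactly one class, the sum is 1 on nucleus pixels and 0 on background pixels.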
ML Utils
- pathml.ml.utils.center_crop_im_batch(batch, dims, batch_order='BCHW')
Center crop images in a batch.
- Parameters
batch – The batch of images to be cropped
dims – Amount to be cropped (tuple for H, W)
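As a rough illustration of what "amount to be cropped" means, the sketch below removes dims[0] rows and dims[1] columns in total, split as evenly as possible between the two sides. It uses NumPy arrays, handles only the 'BCHW' order, and the way an odd crop amount is split is an assumption; the name center_crop_batch_sketch is hypothetical.

```python
import numpy as np


def center_crop_batch_sketch(batch, dims):
    """Center crop a BCHW batch, removing dims = (crop_h, crop_w) pixels in total."""
    crop_h, crop_w = dims
    # Split the amount to remove between the two sides of each axis.
    top, left = crop_h // 2, crop_w // 2
    bottom, right = crop_h - top, crop_w - left
    h, w = batch.shape[2], batch.shape[3]
    return batch[:, :, top:h - bottom, left:w - right]


batch = np.zeros((2, 3, 10, 10))
cropped = center_crop_batch_sketch(batch, dims=(4, 2))
print(cropped.shape)
```

Note that dims here is the number of pixels removed, not the target output size.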
- pathml.ml.utils.dice_loss(true, logits, eps=0.001)
Computes the Sørensen–Dice loss. Note that PyTorch optimizers minimize a loss, whereas we want to maximize the Dice coefficient, so the function returns 1 - Dice coefficient. From: https://github.com/kevinzakka/pytorch-goodies/blob/c039691f349be9f21527bb38b907a940bfc5e8f3/losses.py#L54
- Parameters
true – a tensor of shape [B, 1, H, W].
logits – a tensor of shape [B, C, H, W]. Corresponds to the raw output or logits of the model.
eps – added to the denominator for numerical stability.
- Returns
the Sørensen–Dice loss.
- Return type
dice_loss
- pathml.ml.utils.dice_score(pred, truth, eps=0.001)
Calculate dice score for two tensors of the same shape. If tensors are not already binary, they are converted to bool by zero/non-zero.
- Parameters
pred (np.ndarray) – Predictions
truth (np.ndarray) – ground truth
eps (float, optional) – Constant used for numerical stability to avoid divide-by-zero errors. Defaults to 1e-3.
- Returns
Dice score
- Return type
float
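The computation described for dice_score can be sketched in a few lines of NumPy. This is an illustration only: the function name dice_score_sketch is hypothetical, and adding eps to both numerator and denominator is an assumption about where the stability constant enters.

```python
import numpy as np


def dice_score_sketch(pred, truth, eps=1e-3):
    """Dice score of two same-shape arrays, binarized by zero/non-zero."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    # eps keeps the ratio defined when both arrays are all zeros (assumed placement).
    return float((2 * intersection + eps) / (pred.sum() + truth.sum() + eps))


pred = np.array([[0, 2], [5, 0]])   # non-binary values are treated as True
truth = np.array([[0, 1], [0, 0]])
score = dice_score_sketch(pred, truth)
print(score)
```

With one overlapping pixel, two predicted pixels and one true pixel, the score is (2·1 + eps) / (2 + 1 + eps), roughly 0.667.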
- pathml.ml.utils.get_sobel_kernels(size, dt=torch.float32)
Create horizontal and vertical Sobel kernels for approximating gradients. Returned kernels will be of shape (size, size).
- pathml.ml.utils.wrap_transform_multichannel(transform)
Wrapper to make albumentations transform compatible with a multichannel mask. Channel should be in first dimension, i.e. (n_mask_channels, H, W)
- Parameters
transform – Albumentations transform. Must have the 'additional_targets' parameter specified with a total of n_channels key-value pairs. All values must be 'mask', but the keys don't matter. E.g., for a mask with 3 channels, you could use: additional_targets = {'mask1': 'mask', 'mask2': 'mask', 'pathml': 'mask'}
- Returns
function that can be called with a multichannel mask argument
Miscellaneous Utils
- pathml.utils.upsample_array(arr, factor)
Upsample array by a factor. Each element in input array will become a CxC block in the upsampled array, where C is the constant upsampling factor. From https://stackoverflow.com/a/32848377
- Parameters
arr (np.ndarray) – input array to be upsampled
factor (int) – Upsampling factor
- Returns
np.ndarray
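One way to obtain the block-wise upsampling described above for a 2-D array is a Kronecker product with a block of ones. The sketch below is an illustration under that assumption, with the hypothetical name upsample_sketch; the PathML function may be implemented differently.

```python
import numpy as np


def upsample_sketch(arr, factor):
    """Turn each element of a 2-D array into a factor x factor block."""
    block = np.ones((factor, factor), dtype=arr.dtype)
    return np.kron(arr, block)


arr = np.array([[1, 2],
                [3, 4]])
up = upsample_sketch(arr, 2)
print(up)
```

For the 2x2 input above and factor 2, the result is a 4x4 array in which each original value fills one 2x2 block.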
- pathml.utils.pil_to_rgb(image_array_pil)
Convert PIL RGBA Image to numpy RGB array
- pathml.utils.segmentation_lines(mask_in)
Generate coords of points bordering segmentations from a given mask. Useful for plotting results of tissue detection or other segmentation.
- pathml.utils.plot_mask(im, mask_in, ax=None, color='red', downsample_factor=None)
Plot results of segmentation, overlaid on the original image.
- Parameters
im (np.ndarray) – Original RGB image
mask_in (np.ndarray) – Boolean array of segmentation mask, with True values for masked pixels. Must be same shape as im.
ax – Matplotlib axes object to plot on. If None, creates a new plot. Defaults to None.
color – Color to plot outlines of mask. Defaults to “red”. Must be recognized by matplotlib.
downsample_factor – Downsample factor for image and mask to speed up plotting for big images
- pathml.utils.contour_centroid(contour)
Return the centroid of a contour, calculated using moments. Based on the OpenCV implementation.
- Parameters
contour (np.array) – Contour array as returned by cv2.findContours
- Returns
(x, y) coordinates of centroid.
- Return type
tuple
- pathml.utils.sort_points_clockwise(points)
Sort a list of points into clockwise order around centroid, ordering by angle with centroid and x-axis. After sorting, we can pass the points to cv2 as a contour. Centroid is defined as center of bounding box around points.
- Parameters
points (np.ndarray) – Array of points (N x 2)
- Returns
Array of points, sorted in order by angle with centroid (N x 2)
- Return type
np.ndarray
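The ordering described above can be sketched as follows: take the center of the bounding box as the centroid, compute each point's angle relative to the x-axis, and sort by that angle. The name sort_points_by_angle_sketch is hypothetical. Sorting by ascending angle is an assumption; it looks clockwise in image coordinates (y pointing down) and counterclockwise in standard mathematical axes.

```python
import numpy as np


def sort_points_by_angle_sketch(points):
    """Sort (N, 2) points by angle around the center of their bounding box."""
    points = np.asarray(points, dtype=float)
    # Centroid defined as the center of the bounding box around the points.
    center = (points.min(axis=0) + points.max(axis=0)) / 2
    angles = np.arctan2(points[:, 1] - center[1], points[:, 0] - center[0])
    return points[np.argsort(angles)]


corners = np.array([[1, 1], [0, 0], [1, 0], [0, 1]])  # unit square, shuffled
sorted_corners = sort_points_by_angle_sketch(corners)
print(sorted_corners)
```

After sorting, consecutive points are neighbors on the square's boundary, which is what makes the array usable as a contour.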
- pathml.utils.pad_or_crop(array, target_shape)
Make dimensions of input array match target shape by either zero-padding or cropping each axis.
- Parameters
array (np.ndarray) – Input array
target_shape (tuple) – Target shape of output
- Returns
Input array cropped/padded to match target_shape
- Return type
np.ndarray
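A minimal sketch of per-axis cropping and zero-padding is shown below. The documentation does not say where padding is added or where cropping happens, so this version assumes both occur at the trailing end of each axis; the name pad_or_crop_sketch is hypothetical.

```python
import numpy as np


def pad_or_crop_sketch(array, target_shape):
    """Crop or zero-pad each axis (at its trailing end) to reach target_shape."""
    if array.ndim != len(target_shape):
        raise ValueError("target_shape must have one entry per array axis")
    # Crop axes that are too long.
    cropped = array[tuple(slice(0, t) for t in target_shape)]
    # Zero-pad axes that are too short.
    pad_width = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad_width, mode="constant", constant_values=0)


arr = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
out = pad_or_crop_sketch(arr, (3, 2))  # pad rows, crop columns
print(out)
```

Here the column axis is cropped from 3 to 2 and the row axis is padded from 2 to 3 with a row of zeros.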
- pathml.utils.RGB_to_HSI(imarr)
Convert imarr from RGB to HSI colorspace.
- Parameters
imarr (np.ndarray) – numpy array of RGB image (m, n, 3)
- Returns
numpy array of HSI image (m, n, 3)
- Return type
np.ndarray
- pathml.utils.RGB_to_OD(imarr)
Convert input image from RGB space to optical density (OD) space. OD = -log(I), where I is the input image in RGB space.
- Parameters
imarr (numpy.ndarray) – Image array, RGB format
- Returns
Image array, OD format
- Return type
numpy.ndarray
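The formula OD = -log(I) can be illustrated with a short NumPy sketch. Two details below are assumptions not stated above: intensities are taken to be 8-bit values scaled by 255 before the logarithm, and zero intensities are clamped to 1 to avoid log(0). The name rgb_to_od_sketch is hypothetical.

```python
import numpy as np


def rgb_to_od_sketch(imarr):
    """Convert an 8-bit RGB array to optical density, OD = -log(I / 255)."""
    im = np.asarray(imarr, dtype=np.float64)
    im = np.maximum(im, 1.0)  # assumed clamp so that log(0) never occurs
    return -np.log(im / 255.0)


white = np.full((1, 1, 3), 255, dtype=np.uint8)
grey = np.full((1, 1, 3), 128, dtype=np.uint8)
black = np.zeros((1, 1, 3), dtype=np.uint8)

od_white = rgb_to_od_sketch(white)
od_grey = rgb_to_od_sketch(grey)
od_black = rgb_to_od_sketch(black)
print(od_white.max(), od_grey.max(), od_black.max())
```

White pixels map to an optical density of zero, and darker pixels map to larger values, which is why OD space is convenient for stain analysis.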
- pathml.utils.RGB_to_HSV(imarr)
Convert image from RGB to HSV color space.
- pathml.utils.RGB_to_LAB(imarr)
Convert image from RGB to LAB color space.
- pathml.utils.RGB_to_GREY(imarr)
Convert image from RGB to greyscale.
- pathml.utils.normalize_matrix_rows(A)
Normalize the rows of an array.
- Parameters
A (np.ndarray) – Input array.
- Returns
Array with rows normalized.
- Return type
np.ndarray
- pathml.utils.normalize_matrix_cols(A)
Normalize the columns of an array.
- Parameters
A (np.ndarray) – An array
- Returns
Array with columns normalized
- Return type
np.ndarray
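The two normalization helpers above do not state which norm they use. The sketch below assumes the Euclidean (L2) norm, so that each row or column ends up with unit length; the function names are hypothetical, and rows or columns that are entirely zero would produce NaN in this version.

```python
import numpy as np


def normalize_rows_sketch(A):
    """Scale each row of A to unit L2 norm (assumed norm)."""
    A = np.asarray(A, dtype=float)
    return A / np.linalg.norm(A, axis=1, keepdims=True)


def normalize_cols_sketch(A):
    """Scale each column of A to unit L2 norm (assumed norm)."""
    A = np.asarray(A, dtype=float)
    return A / np.linalg.norm(A, axis=0, keepdims=True)


A = np.array([[3.0, 4.0],
              [0.0, 2.0]])
rows = normalize_rows_sketch(A)
cols = normalize_cols_sketch(A)
print(rows)
print(cols)
```

Using keepdims=True keeps the norms broadcastable against A, so the same pattern works for both axes.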
- pathml.utils.plot_segmentation(ax, masks, palette=None, markersize=5)
Plot segmentation contours. Supports multi-class masks.
- Parameters
ax – matplotlib axis
masks (np.ndarray) – Mask array of shape (n_masks, H, W). Zeroes are background pixels.
palette – Color palette to use. If None, defaults to matplotlib.colors.TABLEAU_COLORS.
markersize (int) – Size of markers used on plot. Defaults to 5.