cvpl_tools/im/ndblock.py
View source at ndblock.py.
APIs
- class cvpl_tools.im.ndblock.NDBlock(arr: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Array | NDBlock | None)
This class represents an N-dimensional grid in which each block is an ndarray of arbitrary shape
When the grid is of size 1 in all axes, it represents a Numpy array; when the grid has blocks of matching size on all axes, it represents a Dask array. In the general case the blocks can be of varying sizes, e.g. a block of size (2, 2) may neighbor a block of size (5, 10)
Currently, we assume block_index is a list always ordered as (0, …, 0), (0, …, 1), …, (N, …, M-1), increasing from the tail side first
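As a minimal sketch of the constructor and format semantics above (the array shapes and chunk sizes are illustrative assumptions, not from the source):
```python
import numpy as np
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

# A grid of size 1 in all axes: represents a Numpy array
np_block = NDBlock(np.zeros((4, 4), dtype=np.float32))
assert np_block.is_numpy()

# A uniform 2x2 grid of (4, 4) blocks: represents a Dask array
da_block = NDBlock(da.ones((8, 8), chunks=(4, 4)))
assert not da_block.is_numpy()
assert da_block.get_chunksize() == (4, 4)
assert da_block.ndim == 2
```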
- as_dask_array(storage_options: dict | None = None) Array
Get a copy of the array value as a dask array
- Parameters:
storage_options – Optionally, specify a compression format to use when persisting
- Returns:
The converted/retrieved dask array
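For example (a short sketch; the chunking is an illustrative assumption):
```python
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

block = NDBlock(da.ones((8, 8), chunks=(4, 4)))
arr = block.as_dask_array()  # a dask.array.Array copy of the value
print(arr.chunks)  # ((4, 4), (4, 4))
```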
- get_chunksize() tuple[int, ...]
Get the chunk size as a single tuple, one entry per axis
- Returns:
A tuple of int of length ndim
- is_numpy() bool
Returns True if this is a Numpy array
Note that besides format ReprFormat.NUMPY_ARRAY, a ReprFormat.DICT_BLOCK_INDEX_SLICES NDBlock may hold either Numpy arrays or Dask delayed objects (each returning a Numpy array) as its blocks; in the former case is_numpy() returns True, and in the latter it returns False.
- Returns:
Returns True if this is a Numpy array
- static load(file: str | RDirFileSystem, storage_options: dict | None = None) NDBlock
Load the NDBlock from the given path.
- Parameters:
file – The path to load from, same as used in the save() function
storage_options – Specifies options for the saving method and the saved file format; this includes ‘compressor’ (numcodecs.abc.Codec, optional): the compressor used to compress the array or its chunks
- Returns:
The loaded NDBlock. Guaranteed to have the same properties as the saved one, and the content of each block will be the same as when it was saved.
- async static map_ndblocks(inputs: Sequence[NDBlock], fn: Callable, out_dtype: dtype, use_input_index_as_arrloc: int = 0, new_slices: list = None, fn_args: dict = None) NDBlock
Similar to da.map_blocks, but works with NDBlock.
Unlike dask array’s map_blocks, for each input i the block_info[i] provided to the mapping function contains only two keys, ‘chunk-location’ and ‘array-location’.
- Parameters:
inputs – A list of inputs to be mapped, either all dask or all Numpy; all inputs must have the same number of blocks and block indices, and must be over the same slices as well if the inputs are dask images
fn – fn(*block, block_info) maps input blocks to an output block
out_dtype – Output block type (provide a Numpy dtype to this)
use_input_index_as_arrloc – output slices_list will be the same as this input (ignore this for variable sized blocks)
new_slices – will be used to replace the slices attribute if specified
fn_args – extra arguments to be passed to the mapping function
- Returns:
Result mapped, of format ReprFormat.DICT_BLOCK_INDEX_SLICES
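A hedged sketch of a map_ndblocks call, assuming the fn(*blocks, block_info) convention described above and an asyncio event loop; double_block and the input shapes are illustrative, not from the source:
```python
import asyncio
import numpy as np
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

def double_block(block, block_info):
    # block_info[0] carries 'chunk-location' and 'array-location' for input 0
    return block * 2

async def main():
    src = NDBlock(da.ones((8, 8), chunks=(4, 4)))
    mapped = await NDBlock.map_ndblocks([src], double_block,
                                        out_dtype=np.dtype(np.float64))
    arr = await mapped.reduce(force_numpy=True)  # concatenate blocks to inspect
    print(arr.sum())  # 64 ones doubled -> 128.0

asyncio.run(main())
```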
- property ndim: int
Returns: Integer indicating the dimensionality of the image contained in the NDBlock object
- persist(compressor=None) NDBlock
Uses dask client persist() to save and reload the NDBlock object
- Parameters:
compressor – The compression codec to use
- Returns:
Reloaded NDBlock object; if the block is Numpy, no saving is done and the object is returned directly
- async reduce(force_numpy: bool = False) ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Array
Concatenate all blocks on the first axis
- Parameters:
force_numpy – If True, the result will be forced from dask to a Numpy array if it is not one already; useful for outputting to analyses that require Numpy input
- Returns:
The concatenated result; it is Numpy if the previous array is Numpy, or if force_numpy is True
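For instance (a hypothetical sketch; per-block 2d tables such as lists of detected points are a typical input):
```python
import asyncio
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

async def main():
    # Two (4, 3) blocks stacked along the first axis
    blocks = NDBlock(da.ones((8, 3), chunks=(4, 3)))
    table = await blocks.reduce(force_numpy=True)
    print(table.shape)  # blocks concatenated on the first axis -> (8, 3)

asyncio.run(main())
```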
- async static save(file: str | RDirFileSystem, ndblock: NDBlock, storage_options: dict | None = None)
Save the NDBlock to the given path
Will compute immediately if the ndblock contains delayed dask computations
- Storage Options
- multiscale (int) = 0
This only applies if ndblock is dask; multiple levels will be written if non-zero
- compressor = None
The compressor to use to compress array or chunks
- Parameters:
file – The file to save to
ndblock – The block which will be saved
storage_options – Specifies options in saving method and saved file format
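A save/load roundtrip sketch based on the two signatures above; the local path and the Blosc compressor are illustrative assumptions:
```python
import asyncio
import numcodecs
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

async def main():
    src = NDBlock(da.zeros((8, 8), chunks=(4, 4)))
    await NDBlock.save('/tmp/ndblock_example', src,
                       storage_options={'compressor': numcodecs.Blosc(cname='lz4')})
    loaded = NDBlock.load('/tmp/ndblock_example')
    print(loaded.get_chunksize())  # same properties as the saved block: (4, 4)

asyncio.run(main())
```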
- select_columns(cols: slice | Sequence[int] | int) NDBlock
Performs column selection on a 2d array
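For example (a short sketch; the table contents are illustrative):
```python
import numpy as np
from cvpl_tools.im.ndblock import NDBlock

table = NDBlock(np.arange(12).reshape((4, 3)))
subset = table.select_columns([0, 2])  # keep the first and third columns
assert subset.is_numpy()
```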
- sum(axis: Sequence | None = None, keepdims: bool = False)
Sum over the given axes for each block
- to_dask_array(storage_options: dict | None = None)
Convert representation format to dask array
- Parameters:
storage_options – Optionally, specify a compression format to use when persisting
- to_dict_block_index_slices()
Convert representation format to ReprFormat.DICT_BLOCK_INDEX_SLICES
- async to_numpy_array()
Convert representation format to numpy array
- class cvpl_tools.im.ndblock.ReprFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
ReprFormat specifies all possible NDBlock formats to use.
NUMPY_ARRAY = 0
DASK_ARRAY = 1
DICT_BLOCK_INDEX_SLICES = 2  # blocks can be either numpy or dask