cvpl_tools/im/ndblock.py
View source at ndblock.py.
APIs
- class cvpl_tools.im.ndblock.NDBlock(arr: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Array | NDBlock | None)
This class represents an N-dimensional grid in which each block is an ndarray of arbitrary shape
When the grid is of size 1 in all axes, it represents a Numpy array; when the grid has blocks of matching size on all axes, it represents a Dask array. In the general case the blocks can be of varying sizes, e.g. a block of size (2, 2) may neighbor a block of size (5, 10)
Currently, we assume block_index is a list always ordered as (0, …, 0), (0, …, 1), …, (N, …, M-1), increasing from the tail side first
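As a minimal sketch of the constructor and format semantics above (the array shapes and chunk sizes are illustrative assumptions, not from the source):
```python
import numpy as np
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

# A grid of size 1 in all axes: represents a Numpy array
np_block = NDBlock(np.zeros((4, 4), dtype=np.float32))
assert np_block.is_numpy()

# A uniform 2x2 grid of (4, 4) blocks: represents a Dask array
da_block = NDBlock(da.ones((8, 8), chunks=(4, 4)))
assert not da_block.is_numpy()
assert da_block.get_chunksize() == (4, 4)
assert da_block.ndim == 2
```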
- as_dask_array(storage_options: dict | None = None) Array
Get a copy of the array value as a dask array
- Parameters:
storage_options – Optionally, specify a compression format to use when persisting
- Returns:
The converted/retrieved dask array
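For example (a short sketch; the chunking is an illustrative assumption):
```python
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

block = NDBlock(da.ones((8, 8), chunks=(4, 4)))
arr = block.as_dask_array()  # a dask.array.Array copy of the value
print(arr.chunks)  # ((4, 4), (4, 4))
```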
- get_chunksize() tuple[int, ...]
Get the chunk size as a single tuple, one entry per axis
- Returns:
A tuple of int of length ndim
- is_numpy() bool
Returns True if this is a Numpy array
Note that besides format ReprFormat.NUMPY_ARRAY, a ReprFormat.DICT_BLOCK_INDEX_SLICES NDBlock may hold either Numpy arrays or Dask delayed objects (each returning a Numpy array) as its blocks; in the former case is_numpy() returns True, and in the latter it returns False.
- Returns:
Returns True if this is a Numpy array
- static load(file: str | RDirFileSystem, storage_options: dict | None = None) NDBlock
Load the NDBlock from the given path.
- Parameters:
file – The path to load from, same as used in the save() function
storage_options – Specifies options for the saving method and the saved file format; this includes ‘compressor’ (numcodecs.abc.Codec, optional): the compressor used to compress the array or its chunks
- Returns:
The loaded NDBlock. Guaranteed to have the same properties as the saved one, and the content of each block will be the same as when it was saved.
- async static map_ndblocks(inputs: Sequence[NDBlock], fn: Callable, out_dtype: dtype, use_input_index_as_arrloc: int = 0, new_slices: list = None, fn_args: dict = None) NDBlock
Similar to da.map_blocks, but works with NDBlock.
Unlike dask array’s map_blocks, for each input i the block_info[i] provided to the mapping function contains only two keys, ‘chunk-location’ and ‘array-location’.
- Parameters:
inputs – A list of inputs to be mapped, either all dask or all Numpy; all inputs must have the same number of blocks and block indices, and must be over the same slices as well if the inputs are dask images
fn – fn(*block, block_info) maps input blocks to an output block
out_dtype – Output block type (provide a Numpy dtype to this)
use_input_index_as_arrloc – output slices_list will be the same as this input (ignore this for variable sized blocks)
new_slices – will be used to replace the slices attribute if specified
fn_args – extra arguments to be passed to the mapping function
- Returns:
Result mapped, of format ReprFormat.DICT_BLOCK_INDEX_SLICES
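A hedged sketch of a map_ndblocks call, assuming the fn(*blocks, block_info) convention described above and an asyncio event loop; double_block and the input shapes are illustrative, not from the source:
```python
import asyncio
import numpy as np
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

def double_block(block, block_info):
    # block_info[0] carries 'chunk-location' and 'array-location' for input 0
    return block * 2

async def main():
    src = NDBlock(da.ones((8, 8), chunks=(4, 4)))
    mapped = await NDBlock.map_ndblocks([src], double_block,
                                        out_dtype=np.dtype(np.float64))
    arr = await mapped.reduce(force_numpy=True)  # concatenate blocks to inspect
    print(arr.sum())  # 64 ones doubled -> 128.0

asyncio.run(main())
```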
- property ndim: int
Returns: Integer indicating the dimensionality of the image contained in the NDBlock object
- persist(compressor=None) NDBlock
Uses dask client persist() to save and reload the NDBlock object
- Parameters:
compressor – The compression codec to use
- Returns:
Reloaded NDBlock object; if the block is Numpy, no saving is done and the object is returned directly
- async reduce(force_numpy: bool = False) ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Array
Concatenate all blocks on the first axis
- Parameters:
force_numpy – If True, the result will be forced from dask to a Numpy array if it is not one already; useful for outputting to analyses that require Numpy input
- Returns:
The concatenated result; it is Numpy if the previous array is Numpy, or if force_numpy is True
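For instance (a hypothetical sketch; per-block 2d tables such as lists of detected points are a typical input):
```python
import asyncio
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

async def main():
    # Two (4, 3) blocks stacked along the first axis
    blocks = NDBlock(da.ones((8, 3), chunks=(4, 3)))
    table = await blocks.reduce(force_numpy=True)
    print(table.shape)  # blocks concatenated on the first axis -> (8, 3)

asyncio.run(main())
```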
- async static save(file: str | RDirFileSystem, ndblock: NDBlock, storage_options: dict | None = None)
Save the NDBlock to the given path
Will compute immediately if the ndblock contains delayed dask computations
- Storage Options
- multiscale (int) = 0
This only applies if ndblock is dask; multiple levels will be written if non-zero
- compressor = None
The compressor to use to compress array or chunks
- Parameters:
file – The file to save to
ndblock – The block which will be saved
storage_options – Specifies options in saving method and saved file format
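A save/load roundtrip sketch based on the two signatures above; the local path and the Blosc compressor are illustrative assumptions:
```python
import asyncio
import numcodecs
import dask.array as da
from cvpl_tools.im.ndblock import NDBlock

async def main():
    src = NDBlock(da.zeros((8, 8), chunks=(4, 4)))
    await NDBlock.save('/tmp/ndblock_example', src,
                       storage_options={'compressor': numcodecs.Blosc(cname='lz4')})
    loaded = NDBlock.load('/tmp/ndblock_example')
    print(loaded.get_chunksize())  # same properties as the saved block: (4, 4)

asyncio.run(main())
```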
- select_columns(cols: slice | Sequence[int] | int) NDBlock
Performs column selection on a 2d array
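For example (a short sketch; the table contents are illustrative):
```python
import numpy as np
from cvpl_tools.im.ndblock import NDBlock

table = NDBlock(np.arange(12).reshape((4, 3)))
subset = table.select_columns([0, 2])  # keep the first and third columns
assert subset.is_numpy()
```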
- sum(axis: Sequence | None = None, keepdims: bool = False)
Sum over the given axes for each block
- to_dask_array(storage_options: dict | None = None)
Convert representation format to dask array
- Parameters:
storage_options – Optionally, specify a compression format to use when persisting
- to_dict_block_index_slices()
Convert representation format to ReprFormat.DICT_BLOCK_INDEX_SLICES
- async to_numpy_array()
Convert representation format to numpy array
- class cvpl_tools.im.ndblock.ReprFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
ReprFormat specifies all possible NDBlock formats to use.
NUMPY_ARRAY = 0
DASK_ARRAY = 1
DICT_BLOCK_INDEX_SLICES = 2  # blocks can be either numpy or dask