cvpl_tools/ome_zarr/io.py

View source at io.py.

Read and Write: To read an ome zarr image, use load_dask_array_from_path to load the OME ZARR file directly as a dask array. Alternatively, use load_zarr_group_from_path to open a zarr group in read mode, then use dask.array.from_zarr to create a dask array from that group.

To write an ome zarr image, we assume you have a dask array that you want to save as either a .zip file or a directory. In that case, write_ome_zarr_image writes the dask array directly to disk.

APIs

cvpl_tools.ome_zarr.io.load_zarr_group_from_path(path: str, mode: str | None = None, use_zip: bool | None = None, level: int | None = None) → Group

Loads either a zarr folder or zarr zip file into a zarr group.

Parameters:
  • path – path to the zarr folder or zip to be opened

  • mode – file open mode e.g. ‘r’, only pass this if the file is a zip file

  • use_zip – if True, treat path as a zip; if False, treat path as a folder; if None, use path to determine file type

  • level – If None (default), load the entire ome zarr; if an int is provided, load the corresponding level in the ome zarr array

Returns:

the opened zarr group

cvpl_tools.ome_zarr.io.load_dask_array_from_path(path: str, mode: str | None = None, use_zip: bool | None = None, level: int | None = None) → Array

Loads either a zarr folder or zarr zip file into a dask array.

Unlike load_zarr_group_from_path, this function also accepts a query string appended to the path, which selects the slices and channels to read (idea thanks to Davis Bennett in the thread https://forum.image.sc/t/loading-only-one-channel-from-an-ome-zarr/97798).

Example

Loading an ome zarr array of shape (2, 200, 1000, 1000) using different slices:

from cvpl_tools.ome_zarr.io import load_dask_array_from_path

arr_original = load_dask_array_from_path('file.ome.zarr', level=0)  # shape=(2, 200, 1000, 1000)
arr1 = load_dask_array_from_path('file.ome.zarr?slices=[0]', level=0)  # shape=(200, 1000, 1000)
arr2 = load_dask_array_from_path('file.ome.zarr?slices=[:, :100]', level=0)  # shape=(2, 100, 1000, 1000)
arr3 = load_dask_array_from_path('file.ome.zarr?slices=[0:1, 0, -1:, ::2]', level=0)  # shape=(1, 1, 500)

In other words, the slices value accepts Python multi-axis indexing syntax, and the result is similar to indexing a torch tensor or numpy array with the same slices.
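To make the query string convention concrete, here is a minimal sketch of how such a path could be split and interpreted. This is not the parser used by cvpl_tools: parse_slices and sliced_shape are hypothetical helpers written for illustration, and they only handle integer indices and start:stop:step slices. The sketch uses the standard library only and reproduces the shapes listed in the example above.

```python
def parse_slices(path_with_query):
    """Split 'file.ome.zarr?slices=[...]' into (path, index tuple).

    Hypothetical helper: an illustration of the convention, not the
    library's own implementation.
    """
    path, sep, query = path_with_query.partition('?slices=')
    if not sep:
        return path, ()  # no query string: read everything
    query = query.strip()
    if not (query.startswith('[') and query.endswith(']')):
        raise ValueError(f'expected slices=[...], got {query!r}')
    inner = query[1:-1].strip()
    if not inner:
        return path, ()
    index = []
    for token in inner.split(','):
        token = token.strip()
        if ':' in token:
            # 'a:b:c' with any part omitted, e.g. ':100', '-1:' or '::2'
            parts = [int(p) if p.strip() else None for p in token.split(':')]
            index.append(slice(*parts))
        else:
            index.append(int(token))  # a bare integer drops that axis
    return path, tuple(index)


def sliced_shape(shape, index):
    """Shape that results from applying index to an array of the given shape."""
    out = []
    for axis, dim in enumerate(shape):
        if axis >= len(index):
            out.append(dim)  # axes without an index entry are kept whole
        elif isinstance(index[axis], slice):
            out.append(len(range(*index[axis].indices(dim))))
        # integer index: the axis is removed from the result
    return tuple(out)


path, index = parse_slices('file.ome.zarr?slices=[0:1, 0, -1:, ::2]')
print(path, sliced_shape((2, 200, 1000, 1000), index))
```

The same index tuple could be applied to a numpy or dask array as arr[index], which is why the resulting shapes match ordinary slicing.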

Parameters:
  • path – path to the zarr folder or zip to be opened

  • mode – file open mode e.g. ‘r’, only pass this if the file is a zip file

  • use_zip – if True, treat path as a zip; if False, treat path as a folder; if None, use path to determine file type

  • level – If None (default), load the entire ome zarr; if an int is provided, load the corresponding level in the ome zarr array

Returns:

the loaded dask array

async cvpl_tools.ome_zarr.io.write_ome_zarr_image(out_ome_zarr_path: str, tmp_path: str = None, da_arr: Array | None = None, lbl_arr: Array | None = None, lbl_name: str | None = None, make_zip: bool | None = None, MAX_LAYER: int = 0, logging=False, storage_options: dict = None, lbl_storage_options: dict = None, asynchronous: bool = False)

Write dask array as an ome zarr

Writing to a zip file: because dask does not directly support writing to a zip file, we first create a temporary ome zarr output and copy it into a zip once writing is done. This is why tmp_path is required if make_zip=True.
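The write-to-temp-then-zip pattern described above can be sketched with the standard library alone. This is an illustration of the general technique, not the code in write_ome_zarr_image: write_dir_then_zip and its write_fn callback are hypothetical names, and storing entries without compression is an assumption made for this sketch.

```python
import os
import shutil
import tempfile
import zipfile


def write_dir_then_zip(write_fn, out_zip_path, tmp_path):
    """Call write_fn(tmp_dir) to produce a directory tree, then pack it into a zip.

    Hypothetical helper illustrating the pattern; write_fn stands in for
    whatever writes the directory-based output.
    """
    os.makedirs(tmp_path, exist_ok=True)
    tmp_dir = tempfile.mkdtemp(dir=tmp_path)  # unique scratch folder under tmp_path
    try:
        write_fn(tmp_dir)
        # Assumption: entries are stored uncompressed (ZIP_STORED).
        with zipfile.ZipFile(out_zip_path, 'w', compression=zipfile.ZIP_STORED) as zf:
            for root, _dirs, files in os.walk(tmp_dir):
                for name in files:
                    full = os.path.join(root, name)
                    # Keep paths relative so the zip root mirrors the folder root
                    zf.write(full, arcname=os.path.relpath(full, tmp_dir))
    finally:
        # Remove the scratch folder whether or not the write succeeded
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

The scratch folder is created under tmp_path and removed afterwards, so only the finished zip remains at the output path.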

Parameters:
  • out_ome_zarr_path – The path to the target ome zarr folder, or to the ome zarr zip file if make_zip is True

  • tmp_path – temporary files will be stored under this path; required if make_zip=True

  • da_arr – If provided, this is the array to write at {out_ome_zarr_path}

  • lbl_arr – If provided, this is the array to write at {out_ome_zarr_path}/labels/{lbl_name}

  • lbl_name – name of the folder of the label array

  • make_zip – bool, if True the output is a zip; if False a folder; if None, then determine based on file suffix

  • MAX_LAYER – The maximum layer of down sampling; starting at layer=0

  • logging – If True, print a message when the job starts and ends

  • storage_options – options for storing the image

  • lbl_storage_options – options for storing the labels
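As a rough illustration of how MAX_LAYER relates to the level argument of the loading functions, the sketch below lists the shape of each layer from 0 up to MAX_LAYER. The downsampling factor of 2, the choice of the last two axes as the downsampled ones, and rounding up are all assumptions made for this sketch; the source does not state how cvpl_tools downsamples. pyramid_shapes is a hypothetical helper.

```python
import math


def pyramid_shapes(shape, max_layer, factor=2, n_spatial=2):
    """Shapes of layers 0..max_layer under the stated assumptions.

    Assumes each layer shrinks the last n_spatial axes by `factor`,
    rounding up; this is illustrative, not the library's rule.
    """
    shapes = [tuple(shape)]  # layer 0 is the array as given
    for _ in range(max_layer):
        prev = shapes[-1]
        split = len(prev) - n_spatial
        lead, spatial = prev[:split], prev[split:]
        shapes.append(lead + tuple(math.ceil(s / factor) for s in spatial))
    return shapes


for layer, shp in enumerate(pyramid_shapes((2, 200, 1000, 1000), max_layer=2)):
    print(layer, shp)
```

With the default MAX_LAYER=0 the list has a single entry, the array at layer 0.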