cvpl_tools/ome_zarr/io.py
View source at io.py.
Read and Write: To read an OME ZARR image, use load_dask_array_from_path to read the OME ZARR file directly as a dask array. Alternatively, use load_zarr_group_from_path to open a zarr group in read mode, then use dask.array.from_zarr to create a dask array from that group.
To write an OME ZARR image, we assume you have a dask array and would like to write it as a .zip file or a directory. In that case, write_ome_zarr_image writes the dask array to disk directly.
APIs
- cvpl_tools.ome_zarr.io.load_zarr_group_from_path(path: str, mode: str | None = None, use_zip: bool | None = None, level: int | None = None) → Group
Loads either a zarr folder or zarr zip file into a zarr group.
- Parameters:
path – path to the zarr folder or zip to be opened
mode – file open mode e.g. ‘r’, only pass this if the file is a zip file
use_zip – if True, treat path as a zip; if False, treat path as a folder; if None, use path to determine file type
level – If None (default), load the entire ome zarr; if an int is provided, load the corresponding level in the ome zarr array
- Returns:
the opened zarr group
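The use_zip=None behaviour described above can be pictured with a small sketch. This is not the library's implementation: the helper name resolve_use_zip is hypothetical, and the rule that a '.zip' suffix marks a zip store is an assumption made for illustration.

```python
from typing import Optional


def resolve_use_zip(path: str, use_zip: Optional[bool] = None) -> bool:
    """Return the explicit use_zip flag, or guess it from the path when it is None.

    Hypothetical helper: the '.zip' suffix rule is an assumption, not the
    documented behaviour of cvpl_tools.
    """
    if use_zip is not None:
        # The caller was explicit, so the path is not inspected at all.
        return use_zip
    return path.lower().endswith('.zip')
```

Passing use_zip explicitly is the safer choice when a path does not follow the usual naming, since an explicit flag skips any guessing.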
- cvpl_tools.ome_zarr.io.load_dask_array_from_path(path: str, mode: str | None = None, use_zip: bool | None = None, level: int | None = None) → Array
Loads either a zarr folder or zarr zip file into a dask array.
Compared to load_zarr_group_from_path, this function also lets you specify which slices and channels to read using a query string syntax (idea thanks to Davis Bennett in the thread https://forum.image.sc/t/loading-only-one-channel-from-an-ome-zarr/97798).
Example
Loading an ome zarr array of shape (2, 200, 1000, 1000) using different slices:
arr_original = load_dask_array_from_path('file.ome.zarr', level=0)  # shape=(2, 200, 1000, 1000)
arr1 = load_dask_array_from_path('file.ome.zarr?slices=[0]', level=0)  # shape=(200, 1000, 1000)
arr2 = load_dask_array_from_path('file.ome.zarr?slices=[:, :100]', level=0)  # shape=(2, 100, 1000, 1000)
arr3 = load_dask_array_from_path('file.ome.zarr?slices=[0:1, 0, -1:, ::2]', level=0)  # shape=(1, 1, 500)
In effect, the query string accepts Python multi-axis slicing, and the result matches what torch or numpy indexing would give for the same slices.
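To make the query string syntax concrete, here is a minimal sketch of how such a string could be parsed and how the resulting shape follows from it. It uses only the standard library; the helper names split_query, parse_index, parse_slices and sliced_shape are hypothetical and this is not how cvpl_tools implements the feature.

```python
from typing import Sequence, Tuple, Union

Index = Union[int, slice]


def split_query(path: str) -> Tuple[str, str]:
    """Split 'file.ome.zarr?slices=[...]' into the plain path and the bracket contents."""
    base, sep, query = path.partition('?slices=')
    if not sep:
        return path, ''  # no query string present
    query = query.strip()
    if not (query.startswith('[') and query.endswith(']')):
        raise ValueError(f'expected slices=[...], got {query!r}')
    return base, query[1:-1]


def parse_index(text: str) -> Index:
    """Turn '0', ':', ':100' or '::2' into an int or a slice object."""
    text = text.strip()
    if ':' not in text:
        return int(text)
    parts = [int(p) if p.strip() else None for p in text.split(':')]
    if len(parts) > 3:
        raise ValueError(f'too many colons in {text!r}')
    return slice(*parts)


def parse_slices(query: str) -> Tuple[Index, ...]:
    """Parse the comma-separated contents of the brackets, one entry per axis."""
    if not query.strip():
        return ()
    return tuple(parse_index(p) for p in query.split(','))


def sliced_shape(shape: Sequence[int], index: Sequence[Index]) -> Tuple[int, ...]:
    """Compute the shape that numpy-style indexing with `index` would produce."""
    out = []
    for axis, size in enumerate(shape):
        if axis >= len(index):
            out.append(size)  # axes without an entry are kept whole
        elif isinstance(index[axis], slice):
            out.append(len(range(*index[axis].indices(size))))
        # an int entry selects one element and drops the axis
    return tuple(out)


base, query = split_query('file.ome.zarr?slices=[0:1, 0, -1:, ::2]')
print(base, sliced_shape((2, 200, 1000, 1000), parse_slices(query)))
```

An integer entry removes its axis while a slice entry keeps it, which is why `[0]` and `[0:1]` give different numbers of dimensions.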
- Parameters:
path – path to the zarr folder or zip to be opened
mode – file open mode e.g. ‘r’, only pass this if the file is a zip file
use_zip – if True, treat path as a zip; if False, treat path as a folder; if None, use path to determine file type
level – If None (default), load the entire ome zarr; if an int is provided, load the corresponding level in the ome zarr array
- Returns:
the loaded dask array
- async cvpl_tools.ome_zarr.io.write_ome_zarr_image(out_ome_zarr_path: str, tmp_path: str = None, da_arr: Array | None = None, lbl_arr: Array | None = None, lbl_name: str | None = None, make_zip: bool | None = None, MAX_LAYER: int = 0, logging=False, storage_options: dict = None, lbl_storage_options: dict = None, asynchronous: bool = False)
Writes a dask array as an ome zarr image.
Writing to a zip file: because dask does not directly support writing to a zip file, we first create a temporary ome zarr output and copy it into a zip once writing is done. This is why tmp_path is required if make_zip=True.
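The write-to-a-temporary-folder-then-zip pattern described above can be sketched with the standard library alone. The names zip_directory and write_via_tmp are hypothetical, write_fn stands in for whatever writes the ome zarr folder, and the choice of ZIP_STORED is an assumption (zarr chunks are usually already compressed); none of this is taken from the cvpl_tools source.

```python
import os
import tempfile
import zipfile
from typing import Callable


def zip_directory(src_dir: str, zip_path: str) -> None:
    """Copy every file under src_dir into zip_path, keeping relative paths."""
    # Assumption: store entries uncompressed, since chunk data is typically
    # compressed already by the array's own compressor.
    with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                zf.write(full, arcname=os.path.relpath(full, src_dir))


def write_via_tmp(write_fn: Callable[[str], None], out_zip_path: str, tmp_path: str) -> None:
    """Let write_fn fill a temporary folder under tmp_path, then zip that folder."""
    os.makedirs(tmp_path, exist_ok=True)
    # The temporary folder is deleted when the with-block exits.
    with tempfile.TemporaryDirectory(dir=tmp_path) as tmp_dir:
        write_fn(tmp_dir)                     # step 1: write as a plain folder
        zip_directory(tmp_dir, out_zip_path)  # step 2: copy the folder into the zip
```

Because the folder is written in full before it is zipped, tmp_path needs enough free space to hold the whole uncompressed output.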
- Parameters:
out_ome_zarr_path – The path to the target ome zarr folder, or to the ome zarr zip file if make_zip is True
tmp_path – temporary files will be stored under this path; required if make_zip is True
da_arr – If provided, this is the array to write at {out_ome_zarr_path}
lbl_arr – If provided, this is the array to write at {out_ome_zarr_path}/labels/{lbl_name}
lbl_name – name of the folder of the label array
make_zip – bool, if True the output is a zip; if False a folder; if None, then determine based on file suffix
MAX_LAYER – The maximum layer of down sampling; starting at layer=0
logging – If True, print a message when the job starts and when it ends
storage_options – options for storing the image
lbl_storage_options – options for storing the labels