ZarrArray is an infrastructure package that leverages the Rarr package to bring Zarr datasets into R as DelayedArray objects.
Like any other Bioconductor package, ZarrArray
should always be installed with BiocManager::install():
if (!require("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("ZarrArray")
Load the package:
library(ZarrArray)
The main class in the package is the ZarrArray class. A ZarrArray object is an array-like object that represents a Zarr dataset in R.
To create a ZarrArray object, simply call the
ZarrArray() constructor function on the path to a Zarr
dataset:
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)
A
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 1 0 0 0 . 0 0 0 0
## [30,] 1 0 0 0 . 0 0 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 0 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 0 0 0 0 . 0 0 0 0
## [30,] 0 0 0 0 . 0 0 0 0
Note that ZarrArray objects are DelayedArray derivatives and therefore support all operations (delayed or block-processed) supported by DelayedArray objects:
## [1] "ZarrArray"
## attr(,"package")
## [1] "ZarrArray"
## [1] TRUE
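The commands behind the two results above are not shown. A minimal sketch, assuming they were a class lookup followed by an inheritance test (class() and is() are standard R; their use here is a reconstruction, not taken from the original):

```r
library(ZarrArray)

# Recreate the example object from the Zarr dataset shipped with Rarr
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

class(A)               # class name, with the defining package as an attribute
is(A, "DelayedArray")  # does A inherit from DelayedArray?
```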
This allows ZarrArray objects to “look and feel” like ordinary arrays
or matrices in R by mimicking their behavior. In particular, ZarrArray
objects support most of the “standard array API” defined in base R, such as
dim(), length(), dimnames(), [, aperm(), max(), sum(),
arithmetic and comparison operations, math functions, etc.
## [1] 30 20 10
## [1] 6000
## <5 x 20> DelayedMatrix object of type "integer":
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## [3,] 1 0 0 0 . 0 0 0 0
## [4,] 1 0 0 0 . 0 0 0 0
## [5,] 1 0 0 0 . 0 0 0 0
## <10 x 20 x 30> DelayedArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [9,] 0 0 0 0 . 0 0 0 0
## [10,] 0 0 0 0 . 0 0 0 0
##
## ...
##
## ,,30
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [9,] 0 0 0 0 . 0 0 0 0
## [10,] 0 0 0 0 . 0 0 0 0
## [1] 20
## [1] 239
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0.5 1.5 2.5 . 18.5 19.5
## [2,] 0.5 -0.5 -0.5 . -0.5 -0.5
## ... . . . . . .
## [29,] 0.5 -0.5 -0.5 . -0.5 -0.5
## [30,] 0.5 -0.5 -0.5 . -0.5 -0.5
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] -0.5 -0.5 -0.5 . -0.5 -0.5
## [2,] -0.5 -0.5 -0.5 . -0.5 -0.5
## ... . . . . . .
## [29,] -0.5 -0.5 -0.5 . -0.5 -0.5
## [30,] -0.5 -0.5 -0.5 . -0.5 -0.5
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 1 8 27 . 6859 8000
## [2,] 1 0 0 . 0 0
## ... . . . . . .
## [29,] 1 0 0 . 0 0
## [30,] 1 0 0 . 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0 0 0 . 0 0
## [2,] 0 0 0 . 0 0
## ... . . . . . .
## [29,] 0 0 0 . 0 0
## [30,] 0 0 0 . 0 0
## <30 x 20 x 10> DelayedArray object of type "logical":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] FALSE FALSE FALSE . FALSE FALSE
## [2,] FALSE TRUE TRUE . TRUE TRUE
## ... . . . . . .
## [29,] FALSE TRUE TRUE . TRUE TRUE
## [30,] FALSE TRUE TRUE . TRUE TRUE
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] TRUE TRUE TRUE . TRUE TRUE
## [2,] TRUE TRUE TRUE . TRUE TRUE
## ... . . . . . .
## [29,] TRUE TRUE TRUE . TRUE TRUE
## [30,] TRUE TRUE TRUE . TRUE TRUE
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 1.000000 1.414214 1.732051 . 4.358899 4.472136
## [2,] 1.000000 0.000000 0.000000 . 0.000000 0.000000
## ... . . . . . .
## [29,] 1 0 0 . 0 0
## [30,] 1 0 0 . 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0 0 0 . 0 0
## [2,] 0 0 0 . 0 0
## ... . . . . . .
## [29,] 0 0 0 . 0 0
## [30,] 0 0 0 . 0 0
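The results printed above are consistent with the calls sketched below. This is a reconstruction under assumptions: the original commands are not shown, so the exact subscripts and operands (1:5, 0.5, the exponent 3, the comparison with 0) are inferred from the printed dimensions and values.

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

dim(A)       # dimensions of the array
length(A)    # total number of elements

A[1:5, , 1]  # delayed subsetting; the dropped dimension yields a DelayedMatrix
aperm(A)     # delayed transposition (reverses the order of the dimensions)

max(A)       # block-processed summarization
sum(A)

A - 0.5      # delayed arithmetic
A^3
A == 0       # delayed comparison, returns a logical DelayedArray
sqrt(A)      # delayed math function
```

The subsetting, transposition, arithmetic, comparison, and math calls return DelayedArray objects without reading the whole dataset, whereas max() and sum() have to walk the data block by block to produce a value.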
In the 2D case, they also support the “standard matrix API” defined
in base R like nrow(), ncol(),
rownames(), colnames(), t(),
rbind(), cbind(), rowSums(),
colSums(), %*%, etc…, as well as some
row/column summarization operations from the matrixStats
package like rowMaxs(), colVars(), etc…
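As a small illustration of the matrix API, the sketch below takes the first 2D slice of A and applies a few of the functions listed above. The variable name M is introduced here for illustration only.

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

M <- A[ , , 1]  # 2D slice: a DelayedMatrix (M is a hypothetical name)

nrow(M)
ncol(M)
t(M)            # delayed transposition
rowSums(M)      # block-processed row sums, returned as an ordinary vector
colSums(M)      # block-processed column sums
```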
Other operations are supported that are specific to DelayedArray objects and their derivatives:
## [1] "/github/workspace/pkglib/Rarr/extdata/zarr_examples/column-first/int32.zarr/"
## [1] "integer"
## [1] 10 10 5
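The three results above (a path, an element type, and chunk dimensions) match what the path(), type(), and chunkdim() accessors return. A sketch, assuming these were the calls used; the original commands are not shown:

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

path(A)      # location of the Zarr dataset on disk
type(A)      # type of the array elements
chunkdim(A)  # dimensions of the physical chunks
```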
See ?ZarrArray for more information.
The writeZarrArray() function can be used to write an
array-like object to disk in Zarr format.
For example, we can write A back to disk, but with a different physical chunk geometry:
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 1 0 0 0 . 0 0 0 0
## [30,] 1 0 0 0 . 0 0 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 0 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 0 0 0 0 . 0 0 0 0
## [30,] 0 0 0 0 . 0 0 0 0
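The command that produced the object above is not shown. The sketch below is one way it might look. It assumes that writeZarrArray() accepts a chunkdim argument, by analogy with similar writer functions in the DelayedArray ecosystem; check ?writeZarrArray before relying on it. The names path1 and B and the chunk dimensions c(5, 5, 2) are hypothetical.

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

path1 <- tempfile(fileext=".zarr")  # hypothetical destination

# ASSUMPTION: 'chunkdim' is a valid argument of writeZarrArray();
# the chunk dimensions below are illustrative values only.
B <- writeZarrArray(A, path1, chunkdim=c(5, 5, 2))

chunkdim(A)  # chunk geometry of the original dataset
chunkdim(B)  # chunk geometry of the new dataset
```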
Or, we can transform A and then write it back to disk:
path2 <- tempfile(fileext=".zarr")
A2 <- sqrt(t(A[ , , 1]) + 1) # all these operations are delayed
writeZarrArray(A2, path2)  # realizes the delayed operations block by block
## <20 x 30> ZarrMatrix object of type "double":
## [,1] [,2] [,3] ... [,29] [,30]
## [1,] 1.414214 1.414214 1.414214 . 1.414214 1.414214
## [2,] 1.732051 1.000000 1.000000 . 1.000000 1.000000
## [3,] 2.000000 1.000000 1.000000 . 1.000000 1.000000
## [4,] 2.236068 1.000000 1.000000 . 1.000000 1.000000
## [5,] 2.449490 1.000000 1.000000 . 1.000000 1.000000
## ... . . . . . .
## [16,] 4.123106 1.000000 1.000000 . 1 1
## [17,] 4.242641 1.000000 1.000000 . 1 1
## [18,] 4.358899 1.000000 1.000000 . 1 1
## [19,] 4.472136 1.000000 1.000000 . 1 1
## [20,] 4.582576 1.000000 1.000000 . 1 1
Note that writeZarrArray() leverages lower-level
functionality implemented in the Rarr package
like create_empty_zarr_array() and
update_zarr_array(). See ?writeZarrArray for
more information.
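To check that the delayed operations were realized correctly, the new dataset can be reopened with ZarrArray() and compared with the in-memory result. A minimal sketch; the name A3 and the comparison step are additions for illustration:

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

path2 <- tempfile(fileext=".zarr")
A2 <- sqrt(t(A[ , , 1]) + 1)  # delayed operations, nothing computed yet
writeZarrArray(A2, path2)     # realizes A2 block by block and writes it to disk

A3 <- ZarrArray(path2)        # reopen the dataset that was just written
dim(A3)

# Compare the on-disk values with A2 realized in memory
all.equal(as.matrix(A3), as.matrix(A2))
```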
sessionInfo()
## R version 4.6.0 (2026-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ZarrArray_1.1.0 DelayedArray_0.39.3 SparseArray_1.13.2
## [4] S4Arrays_1.13.0 IRanges_2.47.2 abind_1.4-8
## [7] S4Vectors_0.51.3 MatrixGenerics_1.25.0 matrixStats_1.5.0
## [10] BiocGenerics_0.59.6 generics_0.1.4 Matrix_1.7-5
## [13] BiocStyle_2.41.0
##
## loaded via a namespace (and not attached):
## [1] jsonlite_2.0.0 crayon_1.5.3 compiler_4.6.0
## [4] BiocManager_1.30.27 Rcpp_1.1.1-1.1 jquerylib_0.1.4
## [7] yaml_2.3.12 fastmap_1.2.0 lattice_0.22-9
## [10] R6_2.6.1 XVector_0.53.0 curl_7.1.0
## [13] httr2_1.2.2 knitr_1.51 paws.storage_0.9.0
## [16] maketools_1.3.2 paws.common_0.8.9 bslib_0.11.0
## [19] R.utils_2.13.0 rlang_1.2.0 grumpy_0.1.1.9000
## [22] cachem_1.1.0 xfun_0.57 sass_0.4.10
## [25] sys_3.4.3 cli_3.6.6 magrittr_2.0.5
## [28] digest_0.6.39 grid_4.6.0 rappdirs_0.3.4
## [31] lifecycle_1.0.5 R.oo_1.27.1 R.methodsS3_1.8.2
## [34] glue_1.8.1 evaluate_1.0.5 Rarr_2.1.17
## [37] buildtools_1.0.0 rmarkdown_2.31 tools_4.6.0
## [40] htmltools_0.5.9