Zarr (data format)

Zarr is an open-source format for storing n-dimensional arrays in chunked, compressed form, designed for cloud-native and parallel I/O workloads. In robotics, Zarr is used to store large robot demonstration datasets (images, joint states, actions) so that a reader can fetch only the chunks it needs from object storage (S3, GCS) instead of downloading entire files. Unlike HDF5, which typically keeps a whole dataset in a single file, Zarr stores each chunk as a separate file or object, so multiple processes can write concurrently as long as each one writes to different chunks. This makes it suitable for distributed data collection pipelines. The Zarr v3 specification standardized the format across implementations and added support for sharding, which packs many small chunks into a single storage object and so reduces the number of objects and requests on cloud storage. Projects such as Diffusion Policy and several autonomous vehicle datasets (for example, the Lyft Level 5 prediction dataset) have adopted Zarr for storing large-scale datasets.
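The chunking and sharding ideas above can be illustrated with a little arithmetic. The following is a minimal sketch in plain Python, not the Zarr library's API: the helper names `chunks_for_slice` and `object_count`, the dataset shape, and the chunk and shard sizes are all hypothetical. It assumes the Zarr v3 default chunk key layout (`c/` followed by slash-separated grid indices) and ignores metadata objects.

```python
from itertools import product
from math import ceil


def chunks_for_slice(start, stop, chunk_shape):
    """Return grid indices of every chunk overlapping the half-open box [start, stop)."""
    ranges = [
        range(lo // c, (hi - 1) // c + 1)
        for lo, hi, c in zip(start, stop, chunk_shape)
    ]
    return list(product(*ranges))


def object_count(array_shape, chunk_shape, chunks_per_shard=None):
    """Count data objects: one per chunk, or one per shard when sharding is used."""
    n = 1
    for dim, (size, c) in enumerate(zip(array_shape, chunk_shape)):
        per_dim = ceil(size / c)  # chunks along this dimension
        if chunks_per_shard is not None:
            per_dim = ceil(per_dim / chunks_per_shard[dim])  # shards along this dimension
        n *= per_dim
    return n


# Hypothetical dataset: 10,000 RGB frames of 96x96 pixels, 100 frames per chunk.
shape = (10_000, 96, 96, 3)
chunk_shape = (100, 96, 96, 3)

# Reading frames 250..349 only needs the chunks that overlap that range.
touched = chunks_for_slice((250, 0, 0, 0), (350, 96, 96, 3), chunk_shape)
keys = ["c/" + "/".join(map(str, idx)) for idx in touched]
print(keys)  # ['c/2/0/0/0', 'c/3/0/0/0']

# Without sharding, every chunk is its own object; with 10 chunks per shard
# along the frame axis, the same data fits in a tenth as many objects.
print(object_count(shape, chunk_shape))                 # 100
print(object_count(shape, chunk_shape, (10, 1, 1, 1)))  # 10
```

In this sketch a reader fetches 2 of 100 chunks to load 100 frames, and two writers that own disjoint frame ranges aligned to chunk boundaries never touch the same object. Real stores may hold fewer objects than counted here, since implementations can skip chunks that contain only the fill value.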
DataStorageEngineering