{ "cells": [ { "cell_type": "markdown", "id": "df3aa226-496a-4564-8c00-7194f296707b", "metadata": {}, "source": [ "# Xarray/Zarr/Icechunk on S3\n", "\n", "You will need to run this notebook in a `conda` environment created from `environment.yml`." ] }, { "cell_type": "code", "execution_count": 1, "id": "f6f18aec-e2e2-40f1-b4ab-6c73735d2b38", "metadata": {}, "outputs": [], "source": [ "import zarr\n", "from icechunk import IcechunkStore, StorageConfig" ] }, { "cell_type": "markdown", "id": "2631f462-8f8e-4ab1-8ca5-aaaa6674622a", "metadata": {}, "source": [ "## Create a new Zarr store backed by Icechunk\n", "\n", "This example uses an S3 store." ] }, { "cell_type": "code", "execution_count": 2, "id": "90890feb-ee7d-4edc-af96-f64399b20262", "metadata": {}, "outputs": [], "source": [ "s3_storage = StorageConfig.s3_from_env(\n", "    bucket=\"icechunk-test\", prefix=\"oscar-demo-repository\"\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "id": "39e76b2a-e294-41a4-a1e4-2a1845eb4f2b", "metadata": {}, "outputs": [], "source": [ "store = await IcechunkStore.create(\n", "    storage=s3_storage,\n", "    mode=\"w\",\n", ")" ] }, { "cell_type": "markdown", "id": "2c2fec5d-123e-41d1-9dae-3d8993d8ed78", "metadata": {}, "source": [ "## Real data" ] }, { "cell_type": "code", "execution_count": 4, "id": "4169783c-3c3d-47a5-ae65-90efb3c70cd1", "metadata": {}, "outputs": [], "source": [ "import xarray as xr" ] }, { "cell_type": "code", "execution_count": 5, "id": "3ecc6e53-98a2-4698-99af-954ad27e0cff", "metadata": {}, "outputs": [], "source": [ "import fsspec\n", "\n", "fs = fsspec.filesystem(\"s3\")" ] }, { "cell_type": "code", "execution_count": 7, "id": "30dc9933-3700-4236-b958-1aeb43450a4a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 1GB\n", "Dimensions: (depth: 1, latitude: 481, longitude: 1201, time: 72, year: 72)\n", "Coordinates:\n", " * depth (depth) float32 4B 15.0\n", " * latitude (latitude) float64 4kB 80.0 79.67 79.33 ... -79.33 -79.67 -80.0\n", " * longitude (longitude) float64 10kB 20.0 20.33 20.67 ... 419.3 419.7 420.0\n", " * time (time) datetime64[ns] 576B 2018-01-01 2018-01-06 ... 2018-12-26\n", " * year (year) float32 288B 2.018e+03 2.018e+03 ... 2.019e+03 2.019e+03\n", "Data variables:\n", " u (time, depth, latitude, longitude) float64 333MB dask.array<chunksize=(72, 1, 481, 1201), meta=np.ndarray>\n", " um (time, depth, latitude, longitude) float64 333MB dask.array<chunksize=(72, 1, 481, 1201), meta=np.ndarray>\n", " v (time, depth, latitude, longitude) float64 333MB dask.array<chunksize=(72, 1, 481, 1201), meta=np.ndarray>\n", " vm (time, depth, latitude, longitude) float64 333MB dask.array<chunksize=(72, 1, 481, 1201), meta=np.ndarray>\n", "Attributes: (12/17)\n", " VARIABLE: Ocean Surface Currents\n", " DATATYPE: 1/72 YEAR Interval\n", " DATASUBTYPE: unfiltered\n", " GEORANGE: 20 to 420 -80 to 80\n", " PERIOD: Jan.01,2018 to Dec.26,2018\n", " year: 2018\n", " ... ...\n", " company: Earth & Space Research, Seattle, WA\n", " reference: Bonjean F. and G.S.E. Lagerloef, 2002 ,Diagnostic model a...\n", " note1: Maximum Mask velocity is the geostrophic component at all...\n", " note2: Longitude extends from 20 E to 420 E to avoid a break in ...\n", " history: Wed Sep 18 14:18:38 2024: ncks -4 -o oscar_vel2018.nc4 os...\n", " NCO: netCDF Operators version 5.2.8 (Homepage = http://nco.sf....