Support datasets with differently chunked variables in DatasetToChunks
There are two major internal changes:

1. Key objects from DatasetToChunks can now include different dimensions for different variables when using split_vars=True. This makes it easier to handle large datasets with many variables and different chunking per variable.

2. Inputs inside the DatasetToChunks pipeline can now be sharded across many tasks. This is important for scalability to large datasets, especially with this change, because the above refactor multiplies the number of inputs by the number of variables when split_vars=True. Otherwise, we can run into performance issues on the machine launching the pipeline when the number of inputs goes into the millions (e.g., slow speed, out of memory).

See the new integration test for a concrete use-case, resembling real model output.

Also revise the warning message in the README to be a bit friendlier.

Fixes #43

PiperOrigin-RevId: 471948735
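The two changes can be illustrated with a small standard-library sketch. It is not the xarray-beam implementation: the helper names (`chunk_offsets`, `iter_keys`, `keys_for_shard`), the plain `(name, offsets)` tuples standing in for Key objects, and the round-robin sharding scheme are all assumptions made for illustration. It shows how per-variable keys end up with different dimensions, and how each task could regenerate only its own slice of the inputs so that the launching machine never holds the full list.

```python
import itertools


def chunk_offsets(sizes, chunks):
    """Yield one {dim: offset} dict per chunk of an array with these sizes."""
    dims = list(sizes)
    ranges = [range(0, sizes[d], chunks[d]) for d in dims]
    for combo in itertools.product(*ranges):
        yield dict(zip(dims, combo))


def iter_keys(var_dims, sizes, chunks):
    """Yield (variable name, offsets) pairs, as with split_vars=True.

    Each variable's offsets only mention the dimensions that variable has,
    so keys for different variables can carry different dimensions.
    """
    for name, dims in var_dims.items():
        var_sizes = {d: sizes[d] for d in dims}
        for offsets in chunk_offsets(var_sizes, chunks):
            yield name, offsets


def keys_for_shard(shard_index, shard_count, var_dims, sizes, chunks):
    """Lazily yield every shard_count-th key, starting at shard_index.

    Hypothetical scheme: the launcher only sends out shard indices, and each
    task regenerates its own keys instead of receiving a materialized list.
    """
    keys = iter_keys(var_dims, sizes, chunks)
    return itertools.islice(keys, shard_index, None, shard_count)


# Hypothetical dataset: one 3D variable and one 2D variable.
sizes = {"time": 4, "level": 2, "x": 6}
chunks = {"time": 2, "level": 1, "x": 3}
var_dims = {
    "temperature": ("time", "level", "x"),
    "surface_pressure": ("time", "x"),
}

all_keys = list(iter_keys(var_dims, sizes, chunks))
shards = [list(keys_for_shard(i, 3, var_dims, sizes, chunks)) for i in range(3)]
```

Here `temperature` contributes 2 x 2 x 2 = 8 keys and `surface_pressure` contributes 2 x 2 = 4, so splitting by variable grows the input count with the number of variables. That growth is why distributing key generation across tasks matters once the total reaches the millions.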
Showing 9 changed files with 292 additions and 53 deletions.
@@ -15,7 +15,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.7", "3.8", "3.9"]
+        python-version: ["3.7", "3.8", "3.9", "3.10"]
     steps:
       - name: Cancel previous
         uses: styfle/[email protected]
@@ -42,7 +42,7 @@

 setuptools.setup(
     name='xarray-beam',
-    version='0.3.1',
+    version='0.4.0',
     license='Apache 2.0',
     author='Google LLC',
     author_email='[email protected]',
@@ -52,6 +52,6 @@
         'docs': docs_requires,
     },
     url='https://github.com/google/xarray-beam',
-    packages=setuptools.find_packages(exclude=["examples"]),
+    packages=setuptools.find_packages(exclude=['examples']),
     python_requires='>=3',
 )