-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial setup for dask-upstream-testing #1
Conversation
/ok to test edit: that appears to have done nothing :) I'm guessing some bot isn't hooked up here. |
Maybe try adding the copy-pr-bot as well ?
|
Failure at https://github.com/rapidsai/dask-upstream-testing/actions/runs/12681051375 https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_dispatch does mention that
I'm not sure which version of the workflow file is getting run, the one from this branch or the one from |
scripts/run.sh
Outdated
|
||
uv pip install --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \ | ||
--prerelease=allow \ | ||
"cudf-cu12" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figure out what all we want / need to install here. Probably depends on what pytest.importorskip
s we have in dask/dask
and dask/distributed
. We don't want any gpu-marked tests skipped for missing deps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually write a dependencies.yaml
file rather than using hardcoded dependency lists. You could copy-paste from https://github.com/rapidsai/cudf/blob/branch-25.02/dependencies.yaml.
Or you might be able to install the wheels with the [test]
extra?
Look at these:
- https://github.com/rapidsai/cudf/blob/branch-25.02/ci/test_wheel_dask_cudf.sh
- https://github.com/rapidsai/cudf/blob/cb77046d8baad31f4856c097f7052b3a3858c363/python/dask_cudf/pyproject.toml#L21-L29
- https://github.com/rapidsai/cudf/blob/cb77046d8baad31f4856c097f7052b3a3858c363/python/dask_cudf/pyproject.toml#L47-L54
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will take a look. Right now I'm thinking that the best place for this to live would be in dask & distributed's pyproject.toml
so that we can just install dask[test,gpu] distributed[test,gpu]
and have everything we need.
/ok to test |
/ok to test |
This adds a Cron Job to the github actions that runs the `gpu` marked tests from dask and distributed. The setup here is modeled after https://github.com/rapidsai/cudf/blob/branch-25.02/.github/workflows/pandas-tests.yaml, which is running pandas tests against the nightly versions of cudf.
891277b
to
4cc487c
Compare
|
||
# RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})" | ||
RAPIDS_PY_CUDA_SUFFIX=12 | ||
# TODO: set this to main once dask-cudf is compatible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think rapidsai/dask-cuda#1424 is the PR to track here. For now, using 2024.12.1 is slightly preferable, otherwise all the tests fail on import.
I've added a few more optional dependencies (ucx-py, dask-cuda) that were causing some tests to be skipped locally. Here are the remaining skips for
And for
That Longer-term, I'd prefer to have the list of packages needed defined upstream in dask and distributed, maybe as a |
@@ -1,78 +1,29 @@ | |||
# Based off https://github.com/rapidsai/cudf/blob/branch-25.02/.github/workflows/pandas-tests.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need another job to actually trigger this, e.g. https://github.com/rapidsai/cudf/blob/cb77046d8baad31f4856c097f7052b3a3858c363/.github/workflows/build.yaml#L180. I wonder if the workflow-dispatch stuff is needed, or if we can just use a cron job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see an example of cron triggers in our nightly builds here:
https://github.com/rapidsai/workflows/blob/main/.github/workflows/nightly-pipeline-trigger.yaml
https://github.com/rapidsai/workflows/blob/main/.github/workflows/nightly-pipeline.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks. I gave up searching after I didn't see it in either rapidsai/cudf or rapidsai/shared-workflows :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, rapidsai/workflows#69 should be necessary and sufficient to get this workflow running on a schedule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on a suggestion from @wence-, I've moved the cron definition here to avoid the need for this cross-repo dispatch.
I'm not sure if we can test this before merging, but I've hopefully gotten the GitHub Action set up so that we have all the inputs necessary for the rapidsai/shared-workflows/.github/workflows/[email protected]
workflow to run. Does anyone know offhand if things like ${{ inputs.date }}
will be available? My guess is that those came from the workflow dispatch, so I've added a job to collect that and the branch name.
I expect some failures once this is merged:
Looking into that (edit: fixing upstream at dask/distributed#8981).
We have the option of defining skips here when we invoke |
scripts/run.sh
Outdated
|
||
uv pip install --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \ | ||
--prerelease=allow \ | ||
"cudf-cu12" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually write a dependencies.yaml
file rather than using hardcoded dependency lists. You could copy-paste from https://github.com/rapidsai/cudf/blob/branch-25.02/dependencies.yaml.
Or you might be able to install the wheels with the [test]
extra?
Look at these:
- https://github.com/rapidsai/cudf/blob/branch-25.02/ci/test_wheel_dask_cudf.sh
- https://github.com/rapidsai/cudf/blob/cb77046d8baad31f4856c097f7052b3a3858c363/python/dask_cudf/pyproject.toml#L21-L29
- https://github.com/rapidsai/cudf/blob/cb77046d8baad31f4856c097f7052b3a3858c363/python/dask_cudf/pyproject.toml#L47-L54
@charlesbluca I see you added the cudf spill stats test at dask/distributed#8148. Just for simplicity, I'd like to avoid adding a second Hopefully setting edit: dask/distributed#8984 is the proposed restructuring that should let us avoid another |
Thanks @TomAugspurger for this work 😄
It's been a while, my main recollection of why we chose to split this off into two separate |
Thanks for the link to dask/distributed#8148 (comment). Yeah, the changes in dask/distributed#8984 should make this toggling behavior slightly easier. We should get the spilling behavior just in this one test, while avoiding the spilling in other test. |
I plan to merge this sometime tomorrow if there aren't any objections. |
Thanks everyone for the reviews! |
https://github.com/rapidsai/dask-upstream-testing/actions/runs/12769657199/job/35592743953 has the first "successful" run (only took about a few more commits to fix typos, etc.) The failures are all expected other than Some of the failures won't be resolved until we're compatible with dask & distributed main, since that's where the fixes are landing. I'll check in on where we're at with supporting that. |
This has a quick pass at the setup for the dask-upstream testing. The basic idea is to add a cron job that
and then runs the
gpu
-marked dask and distributed tests. It's mostly based off https://github.com/rapidsai/cudf/blob/branch-25.02/.github/workflows/pandas-tests.yaml, which does something similar for pandas.I need to figure out the best place to report failures. Open an issue? A slack message?