Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc exp pull/list/push: Authentication failed for origin when running with CML #9779

Closed
kpetersen-hf opened this issue Jul 28, 2023 · 6 comments

Comments

@kpetersen-hf
Copy link

kpetersen-hf commented Jul 28, 2023

Bug Report

Description

When using dvc exp pull origin <my_experiment> as part of a github actions workflow (using DVC 3.9.1), I obtain a git authorization error (see attached). I can pull the experiment on my local machine without problems, and within github actions commands like dvc pull run perfectly fine.

Reproduce

Setup a repo with a dvc.yaml file and add the following github workflow (adapt AWS permission).

name: CML Toy Workflow

on: [push]
 
permissions:
  contents: write
  id-token: write
  pull-requests: write

jobs:
  run:
    runs-on: ubuntu-latest 
    steps:
      - uses: actions/setup-node@v3
        with:
          node-version: '16'
      - uses: actions/checkout@v3
      - name: Obtain AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: // MY AWS ACCESS TOKEN
          aws-region: us-east-1 
      - uses: iterative/setup-cml@v1
      - name: Install environment
        run: |
          pip install -r requirements.txt
      - name: Git fetch
        run: |
          git fetch origin -v
      - name: Run experiment
        run: |
          dvc exp run -n lr
      - name: Create CML report
        env:
          REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |                    
          git fetch --prune
          dvc exp list origin -n 20
          cml comment create report.md

Expected

No authentication error when running dvc exp list as part of GitHub Actions workflow with CML.

Environment information

@kpetersen-hf
Copy link
Author

kpetersen-hf commented Jul 28, 2023

This issue was discussed on Discord. See https://discord.com/channels/485586884165107732/1134372605705592832.

As suggested by Ivan Shcheklein, the solution is to add

cml ci --fetch-depth 0

above dvc exp list origin -n 20 in the workflow above.

@shcheklein
Copy link
Member

Thanks @kpetersen-hf . Closing this, since it was primarily for visibility. We'll think about a better approach though to support it in DVC itself.

@dacbd
Copy link

dacbd commented Jul 28, 2023

For additional context I believe the culprit issue lies with dulwich where it does not properly use the additional HTTP git config options added by GitHub Actions 'http.https://github.com/.extraheader' which contains the runners token providing access to the repo.

the cml ci command strips that config option and instead encodes the token into the git URL like: 'https://dacbd:[email protected]/iterative/dvc.git' which the library is able to correctly authenticate to GitHub with.

this is needed by dvc commands which interact with the remote like dvc exp list or dvc exp push

You can probably shorten the fix to just cml ci

@efiop
Copy link
Contributor

efiop commented Jul 28, 2023

@pmrowla Could you take a look, please?

@pmrowla
Copy link
Contributor

pmrowla commented Jul 29, 2023

This is a known issue, http.extraheader is not supported out of the box in either dulwich or libgit2/pygit2. The original ticket was in the CML repo and it was closed since cml ci fixes it (setting fetch depth is not required). iterative/cml#656

I opened an issue in scmrepo (iterative/scmrepo#262) but I don't think it needs to be prioritized given that there is a valid fix already.

IMO cml ci should just be done by default in the iterative/setup-cml github action, see also: #9612

@dberenbaum
Copy link
Collaborator

dberenbaum commented Jul 31, 2023

@pmrowla What is the level of effort to support http.extraheader? Thinking about this some more, it would be nice to have that support because I think we increasingly have valid CI use cases that don't require CML at all, and avoiding both cml ci and iterative/setup-cml would be preferable.

Edit: moved to iterative/scmrepo#262

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants