DVC post-checkout hook: complains about unsaved files (which have not changed) #10584

Open
JulianoLagana opened this issue Oct 8, 2024 · 1 comment
Labels
triage Needs to be triaged

Comments

JulianoLagana commented Oct 8, 2024

Bug Report

Description

DVC post-checkout hook complains that it can't remove unsaved files without confirmation, but these files have not changed.

We recently upgraded from DVC 2.58.1 to 3.55.2. After a while with no problems, I noticed that our post-checkout hook sometimes fails, complaining that it can't remove unsaved files without confirmation. At first I assumed I really did have unsaved files, so I ran dvc checkout --force a few times. However, the problem kept coming back every now and then when switching between branches.

I then started to do some digging. First, I noticed that even though the post-checkout hook was failing due to unsaved files, dvc status showed no changes. Furthermore, the md5 hash for the "unsaved" file in question (which I computed with md5 filename) exactly matched the one in the .dir file in the cache (this file is inside a folder which is an output of one of our stages). Lastly, I also noticed that the md5 of the file does not change after dvc checkout --force, even though I get an Applying changes M ./ printout.
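For anyone who wants to repeat the hash comparison above across a whole output directory, here is a minimal Python sketch. It assumes the .dir file in the cache is a JSON list of entries with "md5" and "relpath" keys, and that the tracked hash is a plain MD5 of the file's bytes (which is what md5 filename computes). The function names file_md5 and compare_dir_to_manifest are made up for illustration and are not part of DVC.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file's raw bytes (same as `md5 filename`)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dir_to_manifest(workspace_dir, manifest_path):
    """Compare files under workspace_dir with a .dir manifest.

    Assumption: the manifest is a JSON list of
    {"md5": ..., "relpath": ...} entries.
    Returns a dict mapping relpath to one of
    "match", "mismatch", "missing", or "untracked".
    """
    workspace_dir = Path(workspace_dir)
    entries = json.loads(Path(manifest_path).read_text())
    expected = {entry["relpath"]: entry["md5"] for entry in entries}

    report = {}
    for relpath, md5 in expected.items():
        target = workspace_dir / relpath
        if not target.is_file():
            report[relpath] = "missing"
        elif file_md5(target) == md5:
            report[relpath] = "match"
        else:
            report[relpath] = "mismatch"

    # Files present on disk but absent from the manifest.
    for path in workspace_dir.rglob("*"):
        if path.is_file():
            relpath = path.relative_to(workspace_dir).as_posix()
            report.setdefault(relpath, "untracked")
    return report
```

Calling compare_dir_to_manifest with the output folder and the path to its .dir file in the cache returns one status per file. Anything other than "match" would be a candidate for what the hook considers unsaved. The sketch also lists files that exist in the folder but not in the manifest, which the single-file md5 check would not reveal.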

At the moment I don't really know what the problem is, and would appreciate assistance.

Reproduce

I am not able to reproduce this at will. I haven't yet figured out exactly what triggers it.

Expected

The DVC post-checkout hook should complete without errors if I don't have any unsaved files. Alternatively, if I do have unsaved files, I would expect dvc status to point them out to me, or at least that their MD5 hash would not match the one tracked by DVC (and would then match it after something like dvc checkout --force).

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.55.2 (pip)
-------------------------
Platform: Python 3.10.15 on macOS-15.0.1-arm64-arm-64bit
Subprojects:
	dvc_data = 3.16.5
	dvc_objects = 5.1.0
	dvc_render = 1.0.2
	dvc_task = 0.3.0
	scmrepo = 3.3.7
Supports:
	http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
	s3 (s3fs = 2024.2.0, boto3 = 1.34.34)
Config:
	Global: /Users/juliano/Library/Application Support/dvc
	System: /Library/Application Support/dvc
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk3s3s1
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk3s3s1
Repo: dvc, git
Repo.site_cache_dir: /Library/Caches/dvc/repo/8ac7a2e9eb78ffa8d315cce7b95313f0

Pre-commit configuration:

---
fail_fast: true

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
        exclude: '.*dvc\.lock'
      - id: end-of-file-fixer
        exclude: '^(recipes|lib|datasets|zones|ipython_notebooks|statistics_worksheets|explore)/|params\.json$'
      - id: check-yaml
      - id: check-toml
      - id: check-added-large-files
        args: ["--maxkb=3000"]
      - id: debug-statements
        language_version: python3
  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black
        exclude: '^(recipes|lib|datasets|zones|ipython_notebooks|statistics_worksheets|explore)/|params\.json$'
        language_version: python3
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        exclude: '^(recipes|lib|datasets|zones|ipython_notebooks|statistics_worksheets|explore)/|params\.json$'
        name: isort (python)
  - repo: https://github.com/pycqa/flake8
    rev: 6.0.0
    hooks:
      - id: flake8
        args: ["--max-line-length=225"]
        exclude: '^(recipes|lib|datasets|zones|ipython_notebooks|statistics_worksheets|explore)/|params\.json$|^src/catella/btr/dash/dataiku\.py$|^src/catella/btr/utils/data_utils\.py$'
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.3.0
    hooks:
      - id: mypy
        additional_dependencies: [types-requests, types-PyYAML]
        exclude: '^(recipes|lib|datasets|zones|ipython_notebooks|statistics_worksheets|explore)/|params\.json$|^src/catella/btr/dash/dataiku\.py$|^src/catella/property_research_agent/main\.py$|^src/catella/property_research_agent/app\.py$'
  - repo: local
    hooks:
      - id: pytest-check
        name: pytest
        entry: pytest tests/
        language: system
        pass_filenames: false
        always_run: true
        stages:
          - pre-commit
  - repo: https://github.com/iterative/dvc
    rev: 3.55.2
    hooks:
      - id: dvc-pre-push
        additional_dependencies: [".[s3]"]
        language_version: python3
        stages:
          - push
      - always_run: true
        id: dvc-post-checkout
        additional_dependencies: [".[s3]"]
        language_version: python3
        stages:
          - post-checkout

skshetry added the triage label Oct 9, 2024

KansaiUser commented

This also happens in the DagsHub tutorial.
