
Add option to btrfs replace to remove unrecoverable files instead of aborting #932

Open
benpicco opened this issue Dec 19, 2024 · 0 comments


When a btrfs replace is unable to recover a file on a raid56 array, it will just abort:

[167573.709048] BTRFS error (device sdf): unrepaired sectors detected, full stripe 49505622556672 data stripe 2 errors 8-15
[167573.847875] BTRFS error (device sdf): btrfs_scrub_dev(/dev/sdg, 2, /dev/sdd) failed -5

There is a formula to translate those magic numbers back to a file. For convenience, I moved it into a shell script:

MNT="/mnt/data"

# unrepaired sectors detected, full stripe 49505622556672 data stripe 2 errors 8-15
#                                                 |                   |        |  |
#                                                $1                  $2       $3 $4

STRIPE=$1
INDEX=$2
E_START=$3
E_END=$4

# Resolve the first and the last unrepaired sector to the file(s) that own them
sudo btrfs inspect-internal logical-resolve -o "$((STRIPE + INDEX * 65536 + E_START * 4096))" "$MNT"
sudo btrfs inspect-internal logical-resolve -o "$((STRIPE + INDEX * 65536 + E_END * 4096))" "$MNT"

(assuming 4k sectors and 64k stripes)
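
To make the arithmetic concrete, here is a minimal sketch that pulls the four numbers out of the log line above and prints the two logical addresses that would be passed to logical-resolve. The function names `logical_addr` and `parse_unrepaired` are hypothetical helpers, the sizes are the same 4k/64k assumptions, and the parser assumes the error list is a single `first-last` range as in the example line; other shapes of that field are not handled.

```shell
#!/bin/sh
# Assumed sizes, matching the note above (4k sectors, 64k stripes).
SECTOR_SIZE=4096
STRIPE_LEN=65536

# logical_addr FULL_STRIPE DATA_STRIPE SECTOR
# Prints the logical byte address of the given sector (hypothetical helper).
logical_addr() {
    echo $(( $1 + $2 * STRIPE_LEN + $3 * SECTOR_SIZE ))
}

# parse_unrepaired LINE
# Prints "full_stripe data_stripe first_sector last_sector" for a line of the
# form shown in the log above; prints nothing if the line does not match.
parse_unrepaired() {
    printf '%s\n' "$1" | sed -n \
        's/.*full stripe \([0-9][0-9]*\) data stripe \([0-9][0-9]*\) errors \([0-9][0-9]*\)-\([0-9][0-9]*\).*/\1 \2 \3 \4/p'
}

line='BTRFS error (device sdf): unrepaired sectors detected, full stripe 49505622556672 data stripe 2 errors 8-15'

# Split the parsed fields into $1..$4, then compute both addresses.
set -- $(parse_unrepaired "$line")
logical_addr "$1" "$2" "$3"   # first bad sector: prints 49505622720512
logical_addr "$1" "$2" "$4"   # last bad sector:  prints 49505622749184
```

For the example line this works out to 49505622556672 + 2 * 65536 + 8 * 4096 = 49505622720512 for the first sector and 49505622556672 + 2 * 65536 + 15 * 4096 = 49505622749184 for the last one.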

Now, instead of barfing fs internals at the user and having them figure out what to do with that information and restart the replace job from the beginning (until the next unrecoverable error is found), it would be much better if the filesystem could automatically remove those unrecoverable files and continue with the device replace.
