Command to check an already synced bucket? #38

Open
PC-Admin opened this issue Jul 31, 2024 · 5 comments

Comments

@PC-Admin

I'm not sure how to get this working. I'm trying to check whether one of my buckets has been synced (in the dashboard it does seem to be).

But I am running into the following errors:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower -b test-bucket5
Checking files in bucket test-bucket5 ...
🪣 BUCKET       | Match  | MissSrc       | MissDst       | Differ        | Error
FATA[0000] unable to check bucket                        error="rpc error: code = InvalidArgument desc = InvalidArg"
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower --check-bucket test-bucket5
Checking files in bucket test-bucket5 ...
🪣 BUCKET       | Match  | MissSrc       | MissDst       | Differ        | Error
FATA[0000] unable to check bucket                        error="rpc error: code = InvalidArgument desc = InvalidArg"

What's the correct syntax I'm looking for here?

@PC-Admin
Author

Specifying the user account as well seems to get me a little further, although it times out:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower --check-bucket test-user:test-bucket5
Checking files in bucket test-user:test-bucket5 ...
🪣 BUCKET                 | Match        | MissSrc       | MissDst       | Differ        | Error
FATA[0020] unable to check bucket                        error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 146.118.58.214:9670: i/o timeout\""

@arttor
Collaborator

arttor commented Jul 31, 2024

The bucket is probably too big, and Chorus was not able to list all objects in time.

chorctl check is just a wrapper around the rclone check command.
I think it was a bad idea to add this command to chorctl. I will have to rewrite or remove it, because rclone check consumes a lot of RAM, takes a long time on big buckets, and can cause the Chorus worker instance to fail.

Please try to use rclone check directly instead of chorctl check to avoid timeout error.
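
For readers following along, running rclone check directly might look roughly like the sketch below. It assumes rclone remotes named main and follower have already been configured to point at the same two storages Chorus replicates between; those remote names, the RCLONE variable, and the check_bucket helper are illustrative and do not come from Chorus itself.

```shell
#!/usr/bin/env bash
# Assumption: rclone remotes "main" and "follower" exist in the local rclone
# config. RCLONE can be overridden to point at a different rclone binary.
RCLONE="${RCLONE:-rclone}"

# Hypothetical helper: compare one bucket on the source remote with the
# same-named bucket on the destination remote.
# "--combined -" writes one line per object to stdout, each prefixed with a
# status symbol (see the rclone check documentation for their meaning).
check_bucket() {
  local bucket="$1"
  "$RCLONE" check "main:${bucket}" "follower:${bucket}" --combined -
}

# Example: check_bucket test-bucket5
```

Because this runs on the machine where rclone is invoked, it does not go through the Chorus worker or the chorctl gRPC connection.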

@PC-Admin
Author

PC-Admin commented Aug 1, 2024

Thanks for getting back to me so quickly. It's strange as these are quite small buckets with only 50-100 objects in them...

I've created a list of "different" buckets using rclone check, but now I'm wondering: what is the most efficient way to "continue" replication on these buckets?

When I try to re-add a user-level rule, it seems to replicate only buckets that are new, not existing buckets that have new objects in them:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl delete-user -u test-user -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl add-user -u test-user -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl
NAME                                      PROGRESS                 SIZE                  OBJECTS     EVENTS     PAUSED     LAG             AGE
...
test-user:test-bucket3:main->follower     [##########] 100.0 %     4.9 GiB/4.9 GiB       50/50       0/0        false      11.607007ms     23h9m

Here we see that test-bucket3, which I've added 50 objects to (making 100 in total), doesn't actually get updated this way. test-bucket7, which was new, did get updated, however.

When I try to re-add a bucket-level rule, it seems to start from scratch and re-transmit every single object in that bucket. :S

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl delete -u test-user -b "test-bucket4" -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl add -u test-user -b "test-bucket4" -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl
NAME                                      PROGRESS                 SIZE                  OBJECTS     EVENTS     PAUSED     LAG             AGE
test-user:test-bucket4:main->follower     [#         ]  19.1 %     1.9 GiB/9.7 GiB       19/100      0/0        false      45.017207ms     7s
...

Here we see it starting again with test-bucket4, which already had 50 objects in it that I would have liked to skip...

@arttor
Collaborator

arttor commented Aug 1, 2024

Sorry, I didn't get your question. When you start replication, Chorus lists all objects from the source, then tries to sync each object to the destination. If an object already exists in the destination with the same size and ETag, it will not be copied.
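
The skip rule described above can be sketched as a small shell function. This is only an illustration of the decision as stated in this comment, not the actual Chorus implementation; the needs_copy name and its argument layout are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Chorus.
# Usage: needs_copy SRC_SIZE SRC_ETAG DST_SIZE DST_ETAG
# Returns 0 (true) when the object should be copied, 1 when it can be skipped.
# An empty DST_SIZE stands for "object does not exist in the destination".
needs_copy() {
  local src_size="$1" src_etag="$2" dst_size="$3" dst_etag="$4"
  if [ -z "$dst_size" ]; then
    return 0   # missing in destination: copy
  fi
  if [ "$src_size" = "$dst_size" ] && [ "$src_etag" = "$dst_etag" ]; then
    return 1   # same size and same ETag: skip
  fi
  return 0     # present but size or ETag differs: copy
}

# Example:
#   if needs_copy 1024 "abc123" 1024 "abc123"; then echo copy; else echo skip; fi
```

Under this rule, a restarted replication still lists every source object, but only objects that are missing or different in the destination would actually be transferred.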

@PC-Admin
Author

PC-Admin commented Aug 2, 2024

That's good to know, thank you. I'll leave this one open for when you manage to remove chorctl check.
