sctool backup on Azure fails with: "Error: create backup target: location is not accessible" "Failed to access location from node" #9606
Comments
@kreuzerkrieg, @karol-kokoszka, can you please advise? Is it a testing issue or a Manager issue?
Reproduced:
Packages
Scylla version:
Kernel Version:
Issue description
Describe your issue in detail and steps it took to produce it.
Impact
Describe the impact this issue causes to the user.
How frequently does it reproduce?
Describe the frequency with how this issue can be reproduced.
Installation details
Cluster size: 6 nodes (Standard_L8s_v3)
Scylla Nodes used in this run:
OS / Image:
Test:
Logs and commands
Logs:
@fruch, please let me know if there is any update or anything I can do here.
You can investigate it further: check the underlying error in Manager, and whether this bucket exists in Azure storage. Or wait for @mikliapko to attend to it.
What is the context here? Are these daily triggered jobs?
Yes, these are Azure daily sanity tests.
I'll try to take a look in the coming days.
Analyzing the issue reproduction from 15-01-2025 (Argus). From the scylla-manager logs:
The location check fails for node 10.0.0.14.
At first, I thought the failure might be related to the node having been added during the test, in which case scylla-agent might have been configured incorrectly. However, in one of the latest runs (Argus) I see the same location check failure for a node that has been in the cluster from the very beginning.
Also, the issue is flaky and doesn't reproduce every time. For example, here (Argus), both
Frankly speaking, I see no issues on the test side. Probably it's something that should be addressed on the Manager side(?)
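To compare runs, it can help to count which nodes hit the location check failure. The following is a minimal sketch only: the two error strings are taken from the issue title, but the log line layout, the assumption that the failing node appears as an IPv4 address on the same line, and the function name `count_location_failures` are all hypothetical and would need adjusting to the real scylla-manager log format.

```python
import re
from collections import Counter

# Error fragments quoted in the issue title.
ERROR_MARKERS = (
    "location is not accessible",
    "Failed to access location from node",
)

# Assumption: the failing node is written as an IPv4 address on the same line.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_location_failures(lines):
    """Count location-check failures per node IP found in the given log lines."""
    failures = Counter()
    for line in lines:
        if any(marker in line for marker in ERROR_MARKERS):
            for ip in IPV4.findall(line):
                failures[ip] += 1
    return failures


# Hypothetical sample lines, not real scylla-manager output.
sample = [
    "hypothetical: Failed to access location from node host=10.0.0.14",
    "hypothetical: backup task started",
]
print(count_location_failures(sample))
```

Running this over the manager logs of several runs would show whether the failure sticks to one node or moves around, which matters here because the issue is flaky.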
Here, on the Logs tab, I see no logs for the longevity-10gb-3h-master-db-node-428c95d2-eastus-11 node, the one that failed to access the backup location.
Node-11, like any other node that was alive when the log collection happened, became part of the group archive "db_cluster...". So everything is there and works as designed.
Ah, I missed it, thanks for the explanation.
@mikliapko I don't see anything wrong in the agent logs. Could we re-run this test with debug level logging?
logger:
  level: debug
  sampling: null
Yes, we can.
Packages
Scylla version:
6.3.0~dev-20241222.200f0bb21926
with build-id e73b8c9942be689611d19d0acce447b553fba2dc
Kernel Version:
6.8.0-1018-azure
Issue description
Describe your issue in detail and steps it took to produce it.
A nemesis of disrupt_mgmt_backup_specific_keyspaces completed OK (on Azure).
Then, another nemesis of disrupt_mgmt_backup failed with:
manager log had:
Impact
Describe the impact this issue causes to the user.
How frequently does it reproduce?
Describe the frequency with how this issue can be reproduced.
Installation details
Cluster size: 6 nodes (Standard_L8s_v3)
Scylla Nodes used in this run:
OS / Image:
/subscriptions/6c268694-47ab-43ab-b306-3c5514bc4112/resourceGroups/scylla-images/providers/Microsoft.Compute/images/scylla-6.3.0-dev-x86_64-2024-12-23T02-09-38
(azure: undefined_region)
Test:
longevity-10gb-3h-azure-test
Test id:
3aaf1b83-ec80-44d0-8296-43c7e7ed1072
Test name:
scylla-master/longevity/longevity-10gb-3h-azure-test
Test method:
longevity_test.LongevityTest.test_custom_time
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 3aaf1b83-ec80-44d0-8296-43c7e7ed1072
$ hydra investigate show-logs 3aaf1b83-ec80-44d0-8296-43c7e7ed1072
Logs:
Jenkins job URL
Argus