I got a massive stack trace when a single line would be enough, such as:
[email protected] does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
DC should carefully handle these types of errors.
The stack trace:
Using cached virtualenv
Listing gs://mpii-human-pose: 0 objects [00:00, ? objects/s]
Processed: 1 rows [00:00, 11.23 rows/s] [00:00, ? objects/s]
Traceback (most recent call last):
File "<string>", line 3, in <module>
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/dc.py", line 474, in from_storage
.save(list_ds_name, listing=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/dc.py", line 764, in save
query=self._query.save(
^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 1579, in save
query = self.apply_steps()
^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 1128, in apply_steps
result = step.apply(
^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 572, in apply
self.populate_udf_table(udf_table, query)
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 492, in populate_udf_table
process_udf_outputs(
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 335, in process_udf_outputs
for row in udf_output:
^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/udf.py", line 369, in <genexpr>
output = (dict(zip(self.signal_names, row)) for row in udf_outputs)
^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/udf.py", line 368, in <genexpr>
udf_outputs = (self._flatten_row(row) for row in result_objs)
^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/listing.py", line 35, in list_func
for entries in iter_over_async(client.scandir(path.rstrip("/")), get_loop()):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/asyn.py", line 238, in iter_over_async
done, obj = asyncio.run_coroutine_threadsafe(get_next(), loop).result()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/asyn.py", line 231, in get_next
obj = await ait.__anext__()
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/fsspec.py", line 225, in scandir
await main_task
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/gcs.py", line 57, in _fetch_flat
await self._get_pages(prefix, page_queue)
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/gcs.py", line 91, in _get_pages
page = await self.fs._call(
^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/core.py", line 447, in _call
status, headers, info, contents = await self._request(
^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/decorator.py", line 221, in fun
return await caller(func, *(extras + args), **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/retry.py", line 130, in retry_request
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/core.py", line 440, in _request
validate_response(status, contents, path, args)
File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/retry.py", line 111, in validate_response
raise OSError(f"Forbidden: {path}\n{msg}")
OSError: Forbidden: b/mpii-human-pose/o
[email protected] does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
Query script exited with error code 1
Version Info
First step: never show stack traces unless -v is passed or something terrible happened.
Second step (and in general going forward): pay attention to "regular" exceptions and handle them properly. Sometimes a wrapper is enough; sometimes we need to change the logic (e.g. test the runtime in advance so that no exception is raised later).
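A minimal sketch of the first step, assuming a single top-level entry point that runs the user's script. The names `run_quietly` and `verbose` are hypothetical and not part of DC; the idea is only that the full traceback is printed when verbosity is requested, and one line otherwise.

```python
import sys
import traceback


def run_quietly(func, verbose=False, stream=None):
    """Run func(); on failure print one line, or the full traceback if verbose.

    Hypothetical helper for illustration only, not an existing DC function.
    Returns a process-style exit code: 0 on success, 1 on failure.
    """
    stream = stream if stream is not None else sys.stderr
    try:
        func()
    except Exception as exc:
        if verbose:
            # Full traceback only when the user explicitly asked for it.
            traceback.print_exc(file=stream)
        else:
            # Collapse a multi-line backend message into a single line.
            parts = [ln.strip() for ln in str(exc).splitlines() if ln.strip()]
            detail = "; ".join(parts)
            name = type(exc).__name__
            summary = f"{name}: {detail}" if detail else name
            print(f"Error: {summary}", file=stream)
        return 1
    return 0
```

A caller could pass the result to `sys.exit(...)` and wire `verbose` to a `-v` flag. How DC actually parses that flag is not shown in this thread, so that part is left out.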
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/aiobotocore/signers.py", line 90, in sign
auth.add_auth(request)
~~~~~~~~~~~~~^^^^^^^^^
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/botocore/auth.py", line 423, in add_auth
raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
Nonexistent bucket:
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/datachain/client/s3.py", line 101, in _fetch_flat
await get_pages(it, page_queue)
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/datachain/client/s3.py", line 56, in get_pages
async for page in it:
await page_queue.put(page.get(contents_key, []))
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/aiobotocore/paginate.py", line 30, in __anext__
response = await self._make_request(current_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/tester/venv/venv-py309/lib/python3.13/site-packages/aiobotocore/client.py", line 412, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchBucket: An error occurred (NoSuchBucket) when calling the ListObjectVersions operation: The specified bucket does not exist
This should be a DC-native error message (not one from boto or other libs). Also, it should mention bucket names; otherwise, it's hard to pin down the issue in a large script.
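One way this could look, as a sketch only: a wrapper that catches backend exceptions and re-raises a single DC-level error that always includes the storage URI. `StorageAccessError`, `list_storage`, and the `backend_list` callable are hypothetical names for illustration and do not exist in DC. The default `backend_errors` tuple covers only `OSError` (what gcsfs raises in the trace above); botocore exceptions such as `NoCredentialsError` or `ClientError` would have to be added by the caller.

```python
class StorageAccessError(Exception):
    """Hypothetical DC-level error that always names the storage URI."""

    def __init__(self, uri, reason):
        self.uri = uri
        self.reason = reason
        super().__init__(f"Cannot list {uri}: {reason}")


def list_storage(uri, backend_list, backend_errors=(OSError,)):
    """Call backend_list(uri) and translate backend failures.

    backend_list stands in for whatever client call does the listing.
    backend_errors is the tuple of exception types to translate; the
    caller can extend it with library-specific exception classes.
    """
    try:
        return backend_list(uri)
    except backend_errors as exc:
        lines = [ln.strip() for ln in str(exc).splitlines() if ln.strip()]
        # The last line usually carries the human-readable explanation.
        reason = lines[-1] if lines else type(exc).__name__
        # "from exc" keeps the original error available for verbose output.
        raise StorageAccessError(uri, reason) from exc
```

Chaining with `from exc` keeps the library traceback reachable through `__cause__`, so a verbose mode can still show it while the default output stays at one line that names the bucket.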