Hi, thank you for opening the issue. I assume that the "Bucket GET requests" you refer to are unexpected ListObjectsV2 requests. You've mentioned getting rate limited; are you seeing 503 errors on the ListObjectsV2 requests?
Before reading a file, Mountpoint will make both a ListObjectsV2 and a HeadObject request for the specified path. This mechanism ensures the shadowing semantics (e.g. a directory dir/ "shadows" a file named dir).
You can avoid repeated ListObjectsV2 operations for a given file by using --metadata-ttl <SECONDS>. Also, since you're already using the --prefix argument, choosing a longer prefix may reduce the number of ListObjectsV2 requests in the case of nested directories.
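A mount invocation combining these flags might look like the sketch below. The bucket name, mount point, prefix, and TTL value are placeholders, not values from this issue:

```shell
# Mount only the deepest prefix you actually need, and cache metadata so
# repeated reads of the same path don't trigger a fresh ListObjectsV2 +
# HeadObject round trip for the duration of the TTL.
mount-s3 my-bucket /mnt/my-bucket \
  --prefix data/small-files/ \
  --metadata-ttl 300
```

Note that a longer metadata TTL trades request volume for freshness: changes made to the bucket by other clients may not be visible until the cached entries expire.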
Mountpoint for Amazon S3 version
mount-s3 1.8.0
AWS Region
us-east-1
Describe the running environment
Running Mountpoint for Amazon S3 locally against an AWS bucket, using credentials stored in environment variables. Ubuntu 20.04.
Mountpoint options
What happened?
I'm trying to fetch thousands of very small files, and I realized I was being rate limited by a limit on the number of bucket GET requests, which appear to be operations from Mountpoint for S3 resolving the location for each file read.
I am not listing the contents of the bucket. I know the file names beforehand and am reading each location directly without listing the folder. However, I am seeing as many bucket GET requests as Object GET and Object HEAD requests.
Is this expected? Is there a way to avoid listing the folder on every GET request?
Relevant log output
No response