Timer for AWS credentials refresh gets detached #110
Comments
To test file buffer fill-up and flushing, I blocked all outgoing traffic, and the error log appears again when the plugin tries to refresh the credentials.
After outgoing traffic is re-enabled, the OpenSearch output plugin still cannot communicate with OpenSearch, because the token has expired and is not being renewed.
Is there any update or workaround for this? We are still experiencing the same issue on EKS. We have multiple regions that have been working perfectly for over a year, but in two of our regions the STS call occasionally (once every few days) times out with a networking error, the timer stops, and the token expires.
Problem
About one week after a new deployment to AWS Fargate, during which everything ran smoothly, we saw for the first time an error indicating that the refresh timer gets detached during AWS credentials renewal. Other deployments have not shown this so far, even after running for a month or more.
...
Steps to replicate
Error log
Endpoint configuration
The credentials refresh interval is set to 20m. The component is acting as a logging aggregator, forwarding logs from other Fluentd and Application components to OS.
Expected Behavior or What you need to ask
We assume the unexpected error actually comes from the aws-sdk (or an STS hiccup?); since this is the first time it has been observed, the cause is hard to determine. Because the timer is detached, there is no recovery from this error: no further credentials refresh happens, and once the token expires all requests against OS fail.
If asked for an expected behavior, this could be either a later retry of the token refresh (better) or a Fluentd exit so that the AWS Fargate task can be restarted (worse).
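To illustrate the "later retry" option, here is a minimal Ruby sketch of a refresh loop that survives a failed attempt instead of stopping. It is not the plugin's actual code: `CredentialRefresher`, `tick`, `retry_delay`, and the `fetch` callable (standing in for the STS call) are all hypothetical names, and the backoff policy is only an assumption.

```ruby
# Hypothetical sketch, NOT the plugin's implementation.
# The point: a failed refresh is rescued and rescheduled, so the loop never detaches.
class CredentialRefresher
  attr_reader :credentials, :failures

  # fetch:       callable returning fresh credentials (stand-in for the STS call)
  # interval:    normal refresh interval in seconds (e.g. 1200 for 20m)
  # retry_delay: initial delay in seconds after a failed attempt (assumed value)
  def initialize(fetch, interval:, retry_delay:, logger: nil)
    @fetch = fetch
    @interval = interval
    @retry_delay = retry_delay
    @logger = logger
    @credentials = nil
    @failures = 0
  end

  # Runs one refresh attempt and returns the number of seconds to wait
  # before the next attempt. On failure the old credentials are kept.
  def tick
    @credentials = @fetch.call
    @failures = 0
    @interval
  rescue StandardError => e
    @failures += 1
    @logger&.warn("credentials refresh failed: #{e.class}: #{e.message}")
    # Exponential backoff, capped at the normal refresh interval.
    [@retry_delay * (2**(@failures - 1)), @interval].min
  end

  # Endless loop; an error in one attempt only shortens the next wait.
  def run(sleeper: Kernel.method(:sleep))
    loop { sleeper.call(tick) }
  end
end
```

With `interval: 1200` and `retry_delay: 30`, a failed attempt would be retried after 30 seconds, then 60, and so on, returning to the 20-minute cadence once a refresh succeeds. The "Fluentd exit" alternative would instead re-raise or terminate after some number of consecutive failures.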
...
Using Fluentd and OpenSearch plugin versions