Opening this issue after not getting a response to this comment: #832 (comment)
Currently, we are implementing job-level max_run_time for our application. As long as the worker process doesn't die, it works as expected. However, when workers restart after a worker process crash, the job-level max_run_time is ignored, and only Worker.max_run_time is taken into consideration.
As we run in a containerised environment, containers that have started processing a job are sometimes killed during scale-down, but the job remains locked until the default max_run_time (Worker.max_run_time) elapses.
Is there any solution for this? If not, is it in the plans to implement it?
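To make the mismatch concrete, here is a minimal sketch in plain Ruby with no delayed_job dependency. `LockedJob`, `WORKER_MAX_RUN_TIME`, `stale_by_worker_limit?`, and `stale_by_job_limit?` are hypothetical names used only for illustration, not delayed_job APIs, and the 4-hour global limit is an assumed value. The first check models the behaviour described above, the second models what we would expect.

```ruby
# Hypothetical stand-in for a locked job row; not a delayed_job class.
LockedJob = Struct.new(:id, :locked_at, :max_run_time)

# Assumed global limit in seconds (4 hours), standing in for Worker.max_run_time.
WORKER_MAX_RUN_TIME = 4 * 60 * 60

# Behaviour described in this issue: after a crash, only the global
# limit decides whether a locked job may be picked up again.
def stale_by_worker_limit?(job, now)
  job.locked_at < now - WORKER_MAX_RUN_TIME
end

# Expected behaviour: the job-level limit applies when it is set and
# lower than the global one.
def stale_by_job_limit?(job, now)
  limit = [job.max_run_time, WORKER_MAX_RUN_TIME].compact.min
  job.locked_at < now - limit
end

now = Time.at(1_000_000)
# Locked 10 minutes ago by a worker that has since died; job-level limit is 5 minutes.
job = LockedJob.new(1, now - 600, 300)

puts stale_by_worker_limit?(job, now)
puts stale_by_job_limit?(job, now)
```

In this sketch the job has exceeded its own limit but not the global one, so only the second check treats the lock as expired.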
tomyyn changed the title from "Jobs being picked up by workers after system death even When job-level max_run_time is exceeded." to "Job-level max_run_time not being taken into account after worker crash." on Mar 20, 2024
AFAIK there is no known solution within delayed_job itself. However, there is a package that tracks locked jobs and can unlock them based on the parameters you provide: