Job-level max_run_time not being taken into account after worker crash. #1208

tomyyn · 2024-03-20T17:55:24Z

Opening this issue after not getting a response to this comment: #832 (comment)

We are currently implementing job-level max_run_time for our application. As long as the worker process doesn't die, it works as expected. However, when restarting after a worker process crash, the job-level max_run_time is ignored and only Worker.max_run_time is taken into consideration.

Because we run in a containerised environment, containers that have started processing a job are sometimes killed when scaling down, and the job then remains locked until the default max_run_time (Worker.max_run_time) elapses.

Is there any solution for this? If not, are there plans to implement one?
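
To make the report concrete, here is a minimal plain-Ruby model of the behaviour described above. It is a sketch under stated assumptions, not delayed_job's actual code: `ModelJob`, `runtime_limit`, and `reservable?` are hypothetical names, and the constant stands in for `Delayed::Worker.max_run_time` at an assumed value of 4 hours. It shows a job whose own limit is 5 minutes staying locked 30 minutes after its worker died, because only the worker-level value is consulted when deciding whether the lock has expired.

```ruby
# Simplified, self-contained model; this is not delayed_job's actual code.
# Stands in for Delayed::Worker.max_run_time (assumed to be 4 hours here).
WORKER_MAX_RUN_TIME = 4 * 60 * 60 # seconds

# Hypothetical stand-in for a row in the jobs table.
ModelJob = Struct.new(:locked_at, :locked_by, :job_max_run_time)

# Limit applied while a live worker is executing the job:
# the job-level value wins when one is defined.
def runtime_limit(job)
  job.job_max_run_time || WORKER_MAX_RUN_TIME
end

# Whether another worker may take over the job. Only the worker-level
# value is consulted, which matches the behaviour reported in this issue.
def reservable?(job, now)
  job.locked_at.nil? || job.locked_at < now - WORKER_MAX_RUN_TIME
end

now = Time.now
# Locked 30 minutes ago by a worker that has since been killed;
# the job-level limit is 5 minutes.
orphaned = ModelJob.new(now - 30 * 60, "host:crashed pid:1", 5 * 60)

puts runtime_limit(orphaned)    # 300
puts reservable?(orphaned, now) # false
```

In this model the job only becomes reservable again once `locked_at` is more than 4 hours in the past, regardless of the 5-minute job-level limit.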


lreddickGNA · 2024-07-24T14:47:48Z

AFAIK there is no known solution within delayed_job itself. However, there is a plugin that tracks locked jobs and can unlock them according to the parameters you provide:

https://github.com/salsify/delayed_job_heartbeat_plugin
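
The general idea of heartbeat-based lock recovery can be sketched in plain Ruby as follows. This is an illustration only and does not use the plugin's API: `LockedJob`, `unlock_orphaned_jobs`, and the 180-second timeout are all hypothetical. It assumes each worker periodically records a heartbeat timestamp and that a separate sweep clears the lock on jobs whose worker has gone silent.

```ruby
# Hypothetical model of heartbeat-based lock recovery. The names are
# illustrative and are not the plugin's API.
HEARTBEAT_TIMEOUT = 180 # seconds, assumed value

LockedJob = Struct.new(:id, :locked_by, :locked_at)

# heartbeats maps a worker name to the Time of its last heartbeat.
# Clears the lock on every job whose worker is unknown or has been silent
# for longer than HEARTBEAT_TIMEOUT, and returns the ids of those jobs.
def unlock_orphaned_jobs(jobs, heartbeats, now)
  orphaned = jobs.select do |job|
    next false if job.locked_by.nil?

    last_seen = heartbeats[job.locked_by]
    last_seen.nil? || now - last_seen > HEARTBEAT_TIMEOUT
  end

  orphaned.each do |job|
    job.locked_by = nil
    job.locked_at = nil
  end

  orphaned.map(&:id)
end

now = Time.now
jobs = [
  LockedJob.new(1, "worker-a", now - 600),
  LockedJob.new(2, "worker-b", now - 600)
]
heartbeats = { "worker-a" => now - 10, "worker-b" => now - 900 }

p unlock_orphaned_jobs(jobs, heartbeats, now) # [2]
```

With this approach the time a job stays locked after a crash depends on the heartbeat timeout rather than on Worker.max_run_time.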

tomyyn changed the title ~~Jobs being picked up by workers after system death even When job-level max_run_time is exceeded.~~ Job-level max_run_time not being taken into account after worker crash. Mar 20, 2024
