-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ContinuousTimetable false triggering when last run ends in future #45175
base: main
Are you sure you want to change the base?
Conversation
336824a
to
0e8de6d
Compare
airflow/timetables/simple.py
Outdated
start = last_automated_data_interval.end | ||
end = timezone.coerce_datetime(timezone.utcnow()) | ||
end = current_time | ||
|
||
if start > end: # Skip scheduling if the last run ended in the future | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if last_automated_data_interval.end > current_time:
return None
start = ...
end = ...
Exactly the same logic, but more readable IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is returning None the right thing to do here? It would make this DAG not run at all in the future, even after the previous run is no longer in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed that return None
prevents the DAG from running anymore. So, how about applying start and end
as shown below when the start_date is in the future(instead of None)?
if last_automated_data_interval is not None: # has already run once
if last_automated_data_interval.end > current_time: # start date is future
start = restriction.earliest
elapsed = last_automated_data_interval.end - last_automated_data_interval.start # elapsed already run
end = start + elapsed.as_timedelta() # set end to "start + timedelta", not current_time
else:
start = last_automated_data_interval.end
end = current_time
start
is set to restriction.earliest, and end
is calculated by adding the previous execution's time difference to start. This way, even if it's set in the future, expect the DAG to run. Could this approach cause any problems? (Also, since date calculations are involved, would it be better to put it in a separate function?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I fixed above code. Could you take a look?
bba8304
to
5170203
Compare
closes: #45081
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.