Describe the bug
Currently `estimateIncidence` accepts cohort start and end dates in the denominator and outcome tables in both `date` and `timestamp` data types. The function performs `as.Date` calls on these variables when it calls `IncidencePrevalence:::getIncidence`. This can lead to incorrect incidence estimates when the input tables contain start and end dates in different data types and the system time zone is not UTC+0.
To Reproduce
I can't construct a simple reproducible example here because DuckDB only supports UTC+0 timestamps.
Additional context
In a recent DARWIN project our node ran into an issue where the number of events and incidences we got from this function were far lower than they should have been in a single cohort. The cause turned out to be that the `denominatorTable` for this particular cohort had `cohort_start_date` as a date and `cohort_end_date` as a timestamp, due to unintended behaviour in cohort generation (both were meant to be dates by the writers of the script, but this ended up not being the case specifically in PostgreSQL). The timestamps were of the form `YYYY-MM-DD 00:00:00` in our system time zone, UTC+2. The default behaviour of `as.Date` is to convert POSIX* timestamps to UTC+0. This means that when `as.Date` calls are made in `IncidencePrevalence:::getIncidence`, all the `cohort_end_date`s end up moving back a day (e.g. `2023-01-01 00:00:00 UTC+2` -> `2022-12-31 22:00:00 UTC+0` -> `2022-12-31`). This led to the vast majority of cases being excluded from the incidence calculations.
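The day-shift described above can be reproduced in plain R, independent of any database (a minimal sketch; the time zone name is illustrative, using the POSIX "Etc/GMT-2" spelling for UTC+2):

```r
# A midnight timestamp in UTC+2 ("Etc/GMT-2" is POSIX notation for UTC+2).
ts <- as.POSIXct("2023-01-01 00:00:00", tz = "Etc/GMT-2")

# as.Date.POSIXct defaults to tz = "UTC", so the value is first converted
# to 2022-12-31 22:00:00 UTC and the date moves back a day.
as.Date(ts)                    # "2022-12-31"

# Passing the timestamp's own zone keeps the intended date.
as.Date(ts, tz = "Etc/GMT-2")  # "2023-01-01"
```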
To prevent this, the function should probably at least check that all date/time variables given to it as input have the same data type. It might also be worth offering an option to pass a relevant `tz` argument to the `as.Date` calls, and/or defaulting to the system time zone by making the calls `as.Date(..., tz = "")` when timestamp data is used.
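The type-consistency check suggested above could look something like this (a hypothetical sketch, not actual IncidencePrevalence code; the function name is an assumption, only the column names come from the report):

```r
# Hypothetical helper: error if cohort date columns mix Date and POSIXct types.
assertConsistentDateTypes <- function(tbl,
                                      cols = c("cohort_start_date",
                                               "cohort_end_date")) {
  # First class of each column, e.g. "Date" or "POSIXct".
  types <- vapply(tbl[cols], function(x) class(x)[1], character(1))
  if (length(unique(types)) > 1) {
    stop("Cohort date columns have mixed types: ",
         paste(cols, types, sep = " = ", collapse = ", "),
         ". Convert them all to the same type before estimating incidence.")
  }
  invisible(tbl)
}

# Example: a denominator table where the end date came back as a timestamp,
# as happened in the PostgreSQL case described above.
denom <- data.frame(
  cohort_start_date = as.Date("2022-01-01"),
  cohort_end_date   = as.POSIXct("2023-01-01 00:00:00", tz = "Etc/GMT-2")
)
# assertConsistentDateTypes(denom)  # would stop with a mixed-types error
```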