Skip to content

Export descriptive statistics from a subset of Achilles results for the DARWIN EU® Database Dashboard.

License

Notifications You must be signed in to change notification settings

darwin-eu/DashboardExport

Repository files navigation

DashboardExport

R-CMD-check Lifecycle:stable

Export descriptive statistics from a subset of Achilles results for the DARWIN-EU Database Dashboard. All counts are rounded up to nearest hundred and counts below the smallCellCount are not exported. For an overview of the exported Achilles analyses, see required_analysis_ids.csv.

Publicly available repo available on (DARWIN-EU/DashboardExport)[https://github.com/darwin-eu/DashboardExport]

How to execute

  1. Install Achilles and DashboardExport.
  2. Run Achilles. See this guide for detailed instructions
  3. Run DashboardExport, providing the schema where the Achilles results are located.

One csv file will be written to the given output folder. This file can be uploaded to the DARWIN-EU Database Dashboard.

Sample code

if (!require(remotes)) {install.packages('remotes')}
if (!require(Achilles)) {
    remotes::install_github('OHDSI/Achilles')
}
if (!require(DashboardExport)) {
    remotes::install_github('DARWIN-EU/DashboardExport')
}

library(Achilles)
library(DashboardExport)

dbms <- Sys.getenv("DBMS")
user <- Sys.getenv("DB_USER")
password <- Sys.getenv("DB_PASSWORD")
server <- Sys.getenv("DB_SERVER")
port <- Sys.getenv("DB_PORT")
pathToDriver <- Sys.getenv("PATH_TO_DRIVER")

connectionDetails <- DatabaseConnector::createConnectionDetails(
    dbms = dbms,
    server = server,
    port = port,
    user = user,
    password = password,
    pathToDriver = pathToDriver
)

cdmDatabaseSchema <- Sys.getenv("CDM_SCHEMA")
resultsDatabaseSchema <- Sys.getenv("RESULTS_SCHEMA")
databaseId <- Sys.getenv("DATABASE_ID")

Achilles::achilles(
    connectionDetails = connectionDetails, 
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema, 
    outputFolder = "achilles_output"
    # For running only the minimally required Achilles analyses:
    # analysisIds = DashboardExport::getRequiredAnalysisIds()
)

DashboardExport::dashboardExport(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    resultsDatabaseSchema = resultsDatabaseSchema,
    outputFolder = "output",
    databaseId = databaseId,
    smallCellCount = 5
)

Technical description

This package runs one sql script that unions the records from the achilles_results and achilles_results_dist tables. In addition, it adds one new analysis 5000 which contains a subset of columns from the cdm_source table (cdm_source_name, source_release_date, cdm_release_date, cdm_version, vocabulary_version). The result from this query is written to one csv file. This combines the columns from both achilles tables.

Output structure

The output of DashboardExport has the following columns:

Column Data type Description
analysis_id* int The Achilles analysis id
stratum_1 varchar Analysis stratified by given column
stratum_2 varchar Analysis stratified by given column
stratum_3 varchar Analysis stratified by given column
stratum_4 varchar Analysis stratified by given column
stratum_5 varchar Analysis stratified by given column
count_value* int The main count output
min_value float Minimum, for distributed analyses
max_value float Maximum, for distributed analyses
avg_value float Average, for distributed analyses
stdev_value float Standard deviation, for distributed analyses
median_value float Median, for distributed analyses
p10_value float 10% percentile, for distributed analyses
p25_value float 25% percentile, for distributed analyses
p75_value float 75% percentile, for distributed analyses
p90_value float 90% percentile, for distributed analyses

*required fields

The meaning of the stratum columns differs per analysis_id. Some analyses are not stratified. The stratums can be found in required_analysis_ids.csv.

The count is always given (if not below the provided smallCellCount), and is rounded up to the nearest hundred. The other numeric statistics are only given for analyses from the achilles_results_dist table.

About

Export descriptive statistics from a subset of Achilles results for the DARWIN EU® Database Dashboard.

Resources

License

Stars

Watchers

Forks

Contributors 3

  •  
  •  
  •  

Languages