library(Strategus)
analysisSpecifications <- ParallelLogger::loadSettingsFromJson(
  fileName = system.file(
    "testdata/cdmModulesAnalysisSpecifications.json",
    package = "Strategus"
  )
)
Module 05: Introduction to Strategus
Designing and Executing Reproducible OHDSI Analysis Pipelines
Agenda
- Understand the Strategus mental model
- Review a real Strategus study repository
- Run a guided practice script on Eunomia
- Connect this foundation to upcoming methods sessions
1. Strategus Mental Model
Strategus separates study logic into two objects:
- Analysis specifications: what to run (methods + settings)
- Execution settings: where/how to run (database schema, folders, etc.)
This separation is a key design choice for OHDSI network studies: one protocol, many sites.
Core idea
Think of Strategus as a study orchestration layer over HADES modules.
Analysis specification JSON
+
Execution settings JSON
+
Connection details
=>
execute()
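The "one protocol, many sites" idea can be sketched in base R without running Strategus at all. Everything below is a simplified stand-in: the list structures, the schema names, and the `describeRun()` helper are hypothetical and only mimic the roles of the analysis specification, the execution settings, and `execute()`.

```r
# One shared analysis specification (hypothetical, simplified stand-in).
analysisSpec <- list(
  sharedResources = list(),
  moduleSpecifications = list(
    list(module = "CohortGeneratorModule"),
    list(module = "CohortIncidenceModule")
  )
)

# Site-local execution settings: same keys, different values per site.
# Schema names are made up for illustration.
siteSettings <- list(
  siteA = list(cdmDatabaseSchema = "cdm_a", workDatabaseSchema = "scratch_a"),
  siteB = list(cdmDatabaseSchema = "cdm_b", workDatabaseSchema = "scratch_b")
)

# Hypothetical stand-in for execute(): reports which module would run where.
describeRun <- function(spec, settings) {
  modules <- vapply(spec$moduleSpecifications, function(x) x$module, character(1))
  sprintf("%s on schema %s", modules, settings$cdmDatabaseSchema)
}

# The same specification is reused unchanged for every site.
plan <- lapply(siteSettings, describeRun, spec = analysisSpec)
print(plan)
```

The point of the sketch is that `analysisSpec` never changes between sites; only the entries of `siteSettings` do.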
2. Load and Inspect an Analysis Specification
For this module, we will use the built-in Strategus test specification.
Explore what is inside
names(analysisSpecifications)
Questions:
- Do you see sharedResources?
- Do you see moduleSpecifications?
- How many module specifications are present?
length(analysisSpecifications$moduleSpecifications)
Peek at module names
vapply(
  X = analysisSpecifications$moduleSpecifications,
  FUN = function(x) x$module,
  FUN.VALUE = character(1)
)
2b. Real Study Repository Orientation
Use this repository as a concrete example of a modern Strategus study:
- Semaglutide NAION: https://github.com/ohdsi-studies/SemaglutideNaion
As you browse, focus on:
- Where analysis specifications live
- Where execution settings are expected
- How cohort definitions are organized and named
Why this matters: your study’s quality depends heavily on clean cohort assets (CSV, JSON, SQL), stable IDs, and sensible labels before you start composing modules.
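A quick check for stable IDs and sensible labels might look like the sketch below. It assumes the cohort assets have been loaded into a data frame with `cohortId` and `cohortName` columns; the example rows and the `checkCohortAssets()` helper are hypothetical and not part of Strategus.

```r
# Hypothetical cohort table; a real study would load this from its cohort assets.
cohortDefinitionSet <- data.frame(
  cohortId = c(1, 2, 2, 4),
  cohortName = c("Target: celecoxib", "Comparator: diclofenac", "", "Outcome: GI bleed"),
  stringsAsFactors = FALSE
)

# Returns a character vector describing problems (empty if none are found).
checkCohortAssets <- function(cohorts) {
  problems <- character(0)

  # IDs must be unique so that modules can reference cohorts unambiguously.
  dupIds <- unique(cohorts$cohortId[duplicated(cohorts$cohortId)])
  if (length(dupIds) > 0) {
    problems <- c(problems, paste("Duplicate cohortId:", paste(dupIds, collapse = ", ")))
  }

  # Every cohort should carry a non-blank, human-readable name.
  blank <- is.na(cohorts$cohortName) | trimws(cohorts$cohortName) == ""
  if (any(blank)) {
    problems <- c(
      problems,
      paste("Missing cohortName for cohortId:", paste(cohorts$cohortId[blank], collapse = ", "))
    )
  }
  problems
}

problems <- checkCohortAssets(cohortDefinitionSet)
print(problems)
```

Running a check like this before composing modules catches ID collisions early, when they are still cheap to fix.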
3. Practice Script: Build a Minimal Strategus Study
For class, we will use one script end-to-end:
modules/05_strategus-intro/strategus-practice.R
This script demonstrates a practical pattern:
- Pull cohort definitions from Strategus example assets
- Relabel cohort names clearly for readability
- Build shared resources + module specifications
- Create execution settings for Eunomia
- Execute and inspect output folders
Run it in chunks so you can inspect each object as it is created.
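The relabeling step can be illustrated in base R as follows. This is a minimal sketch, not the contents of the practice script: the cohort rows and the new labels are made up, and it assumes only that the cohort table has `cohortId` and `cohortName` columns.

```r
# Hypothetical cohort table with terse original names.
cohortDefinitionSet <- data.frame(
  cohortId = c(1, 2, 3),
  cohortName = c("celecoxib", "diclofenac", "GIBleed"),
  stringsAsFactors = FALSE
)

# New labels keyed by cohort ID (as character); IDs themselves stay untouched.
newLabels <- c(
  "1" = "Target: celecoxib new users",
  "3" = "Outcome: GI bleed"
)

# Match by ID rather than by row position, and fail loudly on unknown IDs.
idx <- match(names(newLabels), as.character(cohortDefinitionSet$cohortId))
stopifnot(!anyNA(idx))
cohortDefinitionSet$cohortName[idx] <- unname(newLabels)

print(cohortDefinitionSet)
```

Keying the lookup on the ID keeps the relabeling correct even if the row order of the cohort table changes.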
4. Create Execution Settings
Now we define site-local execution settings using Eunomia.
library(Eunomia)
connectionDetails <- getEunomiaConnectionDetails()
outputFolder <- file.path(tempdir(), "strategus-intro")
dir.create(outputFolder, recursive = TRUE, showWarnings = FALSE)
executionSettings <- createCdmExecutionSettings(
  workDatabaseSchema = "main",
  cdmDatabaseSchema = "main",
  cohortTableNames = CohortGenerator::getCohortTableNames(),
  workFolder = file.path(outputFolder, "work_folder"),
  resultsFolder = file.path(outputFolder, "results_folder"),
  minCellCount = 5
)
Save execution settings
ParallelLogger::saveSettingsToJson(
  object = executionSettings,
  fileName = file.path(outputFolder, "eunomiaExecutionSettings.json")
)
Why save this file?
- Makes your run configuration explicit
- Supports reproducibility
- Lets collaborators review site-level execution settings
4b. Execute Strategus
Run the study pipeline using the three required inputs.
execute(
  connectionDetails = connectionDetails,
  analysisSpecifications = analysisSpecifications,
  executionSettings = executionSettings
)
Strategus will instantiate the needed modules and execute each one in sequence.
Inspect output structure
list.files(outputFolder, recursive = TRUE)
Look for:
- Work artifacts
- Module-specific results folders
- CSV outputs that can be loaded for review
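To find the CSV outputs per module, one option is to group file names by their parent folder. The sketch below builds a small mock results folder so it runs without a Strategus execution; the folder layout and the file names are assumptions for illustration, and the actual structure under your results folder may differ.

```r
# Build a mock results folder (hypothetical layout and file names).
resultsFolder <- file.path(tempdir(), "mock_results_folder")
moduleDirs <- file.path(resultsFolder, c("CohortGeneratorModule", "CohortIncidenceModule"))
for (d in moduleDirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}
write.csv(data.frame(cohort_id = 1, n = 10),
          file.path(moduleDirs[1], "cohort_counts.csv"), row.names = FALSE)
write.csv(data.frame(outcome_id = 3, n = 7),
          file.path(moduleDirs[2], "incidence_summary.csv"), row.names = FALSE)

# List every CSV below the results folder, as paths relative to it.
csvFiles <- list.files(resultsFolder, pattern = "\\.csv$", recursive = TRUE)

# Group the file names by the folder that contains them.
csvByModule <- split(basename(csvFiles), dirname(csvFiles))
print(csvByModule)

# Read every CSV into a list named by its relative path.
tables <- lapply(file.path(resultsFolder, csvFiles), read.csv)
names(tables) <- csvFiles
```

For a real run, point `resultsFolder` at `file.path(outputFolder, "results_folder")` and skip the mock-up lines.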
5. Read Results in Vanilla R
For a pure Strategus workflow, read module output CSV files directly.
library(readr)
# Example pattern (replace with an actual Strategus output CSV path)
result_tbl <- read_csv(file.path(outputFolder, "results_folder", "some_module", "some_result.csv"))
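dplyr::glimpse(result_tbl)
If you create derived summaries, write them as normal files in your project.
# analysisId is a placeholder: replace it with a column that exists in your
# result file (check the glimpse() output above for the exact column names)
summary_tbl <- result_tbl |>
  dplyr::count(.data$analysisId, name = "n_rows")
readr::write_csv(summary_tbl, file.path(outputFolder, "summary_by_analysis.csv"))
6. Reflection Questions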
- What parts of a study belong in analysis specifications vs execution settings?
- Why is JSON-based configuration useful for multisite studies?
- What QA checks would you run before sharing module outputs?
- How would renv complement Strategus for reproducibility?
7. Session Positioning
This session lays the foundation:
- Strategus as orchestrator
- Study-as-configuration model
- Cohort organization and naming discipline
Next sessions will go deeper into:
- Characterization and incidence design details
- CohortMethod design and estimation choices
Resources
- Strategus docs: https://ohdsi.github.io/Strategus/
- Introduction article: https://ohdsi.github.io/Strategus/articles/IntroductionToStrategus.html
- Creating specifications: https://ohdsi.github.io/Strategus/articles/CreatingAnalysisSpecification.html
- Execute Strategus: https://ohdsi.github.io/Strategus/articles/ExecuteStrategus.html
- Strategus study template: https://github.com/ohdsi-studies/StrategusStudyRepoTemplate