dbt Computation¶
STRIDE uses dbt (data build tool) to transform validated input data into energy projections. dbt provides a SQL-based workflow for defining and executing data transformations.
Why dbt?¶
dbt offers several advantages for STRIDE’s computation pipeline:
SQL-based - Transformations are expressed in familiar SQL
Modular - Models can be composed and reused
Documented - Built-in support for model documentation
Testable - Define tests for data quality
Incremental - Only recompute what’s changed
Project Structure¶
Each STRIDE project includes a dbt project in the dbt/ directory:
<project>/dbt/
├── dbt_project.yml # dbt project configuration
├── profiles.yml # Database connection settings
├── models/ # SQL transformation models
│ ├── sources.yml # Source table definitions
│ ├── energy_intensity_*.sql
│ ├── load_shapes_*.sql
│ ├── energy_projection.sql
│ └── ev_*.sql # Electric vehicle models
├── macros/ # Reusable SQL macros
│ ├── table_ref.sql # Override reference macro
│ └── get_custom_schema.sql
└── target/ # Compiled output (generated)
How dbt is Invoked¶
When you create a project or call compute_energy_projection(), STRIDE runs dbt for each scenario:
dbt run --vars '{"scenario": "baseline", "country": "USA", ...}'
Variables Passed to dbt¶
Each dbt run receives these variables:
| Variable | Description | Example |
|---|---|---|
| scenario | Scenario name | "baseline" |
| country | Country identifier | "USA" |
| model_years | Years to compute | "(2025,2030)" |
| weather_year | Reference weather year | 2019 |
| heating_threshold | Temperature for heating loads | 18 |
| cooling_threshold | Temperature for cooling loads | 18 |
| use_ev_projection | Enable EV calculations | false |
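Inside the SQL models these values are read with dbt's var() function. A minimal sketch of how a model might use them (the selected columns are illustrative, not STRIDE's actual schema):

-- Illustrative sketch: expose the run-level variables as columns
SELECT
    '{{ var("scenario") }}' AS scenario,
    '{{ var("country") }}' AS country,
    {{ var("weather_year") }} AS weather_year,
    {{ var("heating_threshold") }} AS heating_threshold,
    {{ var("cooling_threshold") }} AS cooling_threshold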
Data Flow¶
The computation follows this flow:
Input Tables (dsgrid_data schema)
↓
dbt Models (SQL transformations)
↓
Scenario Tables ({scenario} schema)
↓
Combined energy_projection table
Input Tables¶
dbt reads from tables in the dsgrid_data schema:
energy_intensity - Regression parameters for energy intensity
gdp - Gross domestic product projections
hdi - Human development index
population - Population projections
load_shapes - Hourly load profiles
weather_bait - Building-adjusted temperatures
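These tables are declared in models/sources.yml and referenced through dbt's source() function. A minimal sketch, assuming the source is named dsgrid_data and using illustrative column names:

-- Sketch: read an input table via a source reference
SELECT country, year, value AS gdp
FROM {{ source('dsgrid_data', 'gdp') }}
WHERE country = '{{ var("country") }}'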
Transformation Models¶
Key transformations include:
Energy Intensity Parsing - Extract regression parameters from source data
Driver Combination - Join intensity coefficients with GDP, HDI, population
Regression Application - Apply exponential/linear regression formulas
Load Shape Scaling - Scale hourly profiles to match annual projections
Final Aggregation - Combine all sectors into the final projection
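The middle steps can be pictured as a single query that joins the drivers onto the parsed intensity coefficients and evaluates the regression. The sketch below is illustrative only; the table and column names are assumptions, and the real pipeline splits this work across several models:

-- Illustrative sketch of driver combination + regression application
WITH drivers AS (
    SELECT g.country, g.year, g.gdp, h.hdi, p.population
    FROM {{ source('dsgrid_data', 'gdp') }} AS g
    JOIN {{ source('dsgrid_data', 'hdi') }} AS h USING (country, year)
    JOIN {{ source('dsgrid_data', 'population') }} AS p USING (country, year)
)
SELECT
    i.sector,
    d.year,
    -- hypothetical exponential form: intensity = a * exp(b * gdp_per_capita)
    i.coefficient_a * EXP(i.coefficient_b * d.gdp / d.population) AS energy_intensity
FROM {{ table_ref('energy_intensity_parsed') }} AS i
CROSS JOIN drivers AS d
WHERE d.country = '{{ var("country") }}'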
Output Tables¶
Each scenario produces tables in its own schema:
baseline.energy_projection
baseline.energy_intensity_parsed
baseline.load_shapes_scaled
...
All scenarios are then combined into the main energy_projection table.
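Conceptually, that combination is a union over the per-scenario schemas. A minimal sketch, assuming a second hypothetical scenario named high_growth:

-- Conceptual sketch of the cross-scenario combination
SELECT 'baseline' AS scenario, * FROM baseline.energy_projection
UNION ALL
SELECT 'high_growth' AS scenario, * FROM high_growth.energy_projection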
The Override Mechanism¶
STRIDE supports overriding calculated tables at any point in the pipeline. This is implemented through the table_ref macro:
-- In a dbt model
SELECT * FROM {{ table_ref('energy_intensity_parsed') }}
The macro checks if an override variable exists:
If override exists: use the override table
If no override: use the default table
This allows you to inject custom data at any transformation step without modifying the SQL models.
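A simplified version of such a macro is sketched below; the override variable naming (override_<table>) is an assumption, not necessarily the exact convention used in macros/table_ref.sql:

{% macro table_ref(table_name) %}
    {# Sketch: use an override table when a matching variable is set #}
    {% set override = var('override_' ~ table_name, none) %}
    {% if override is not none %}
        {{ return(override) }}
    {% else %}
        {{ return(ref(table_name)) }}
    {% endif %}
{% endmacro %}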
Debugging dbt¶
View Compiled SQL¶
After running, compiled SQL is available in:
<project>/dbt/target/compiled/stride/models/
Check Logs¶
dbt logs are written to:
<project>/stride.log
Run dbt Manually¶
You can run dbt directly for debugging:
cd <project>/dbt
dbt run --vars '{"scenario": "baseline", "country": "USA", "model_years": "(2025,2030)", "weather_year": 2019, "heating_threshold": 18, "cooling_threshold": 18, "use_ev_projection": false}'
Customizing Calculations¶
Override a Calculated Table¶
To replace a table with custom data:
from stride.models import CalculatedTableOverride
project.override_calculated_tables([
CalculatedTableOverride(
scenario="baseline",
table_name="energy_intensity_parsed",
filename="my_custom_intensity.parquet",
)
])
Modify dbt Models¶
For advanced customization, you can edit the SQL models directly:
Navigate to <project>/dbt/models/
Edit the relevant .sql file
Run project.compute_energy_projection() to regenerate