(dbt-computation)= # dbt Computation STRIDE uses [dbt (data build tool)](https://www.getdbt.com/) to transform validated input data into energy projections. dbt provides a SQL-based workflow for defining and executing data transformations. ## Why dbt? dbt offers several advantages for STRIDE's computation pipeline: - **SQL-based** - Transformations are expressed in familiar SQL - **Modular** - Models can be composed and reused - **Documented** - Built-in support for model documentation - **Testable** - Define tests for data quality - **Incremental** - Only recompute what's changed ## Project Structure Each STRIDE project includes a dbt project in the `dbt/` directory: ``` /dbt/ ├── dbt_project.yml # dbt project configuration ├── profiles.yml # Database connection settings ├── models/ # SQL transformation models │ ├── sources.yml # Source table definitions │ ├── energy_intensity_*.sql │ ├── load_shapes_*.sql │ ├── energy_projection.sql │ └── ev_*.sql # Electric vehicle models ├── macros/ # Reusable SQL macros │ ├── table_ref.sql # Override reference macro │ └── get_custom_schema.sql └── target/ # Compiled output (generated) ``` ## How dbt is Invoked When you create a project or call `compute_energy_projection()`, STRIDE runs dbt for each scenario: ```python dbt run --vars '{"scenario": "baseline", "country": "USA", ...}' ``` ### Variables Passed to dbt Each dbt run receives these variables: | Variable | Description | Example | |----------|-------------|---------| | `scenario` | Scenario name | `"baseline"` | | `country` | Country identifier | `"USA"` | | `model_years` | Years to compute | `"(2025,2030,2035)"` | | `weather_year` | Reference weather year | `2019` | | `heating_threshold` | Temperature for heating loads | `18` | | `cooling_threshold` | Temperature for cooling loads | `18` | | `use_ev_projection` | Enable EV calculations | `true` | ## Data Flow The computation follows this flow: ``` Input Tables (dsgrid_data schema) ↓ dbt Models (SQL transformations) ↓ Scenario Tables ({scenario} schema) ↓ Combined energy_projection table ``` ### Input Tables dbt reads from tables in the `dsgrid_data` schema: - `energy_intensity` - Regression parameters for energy intensity - `gdp` - Gross domestic product projections - `hdi` - Human development index - `population` - Population projections - `load_shapes` - Hourly load profiles - `weather_bait` - Building-adjusted temperatures ### Transformation Models Key transformations include: 1. **Energy Intensity Parsing** - Extract regression parameters from source data 2. **Driver Combination** - Join intensity coefficients with GDP, HDI, population 3. **Regression Application** - Apply exponential/linear regression formulas 4. **Load Shape Scaling** - Scale hourly profiles to match annual projections 5. **Final Aggregation** - Combine all sectors into the final projection ### Output Tables Each scenario produces tables in its own schema: ```sql baseline.energy_projection baseline.energy_intensity_parsed baseline.load_shapes_scaled ... ``` All scenarios are then combined into the main `energy_projection` table. ## The Override Mechanism STRIDE supports overriding calculated tables at any point in the pipeline. This is implemented through the `table_ref` macro: ```sql+jinja -- In a dbt model SELECT * FROM {{ table_ref('energy_intensity_parsed') }} ``` The macro checks if an override variable exists: - If override exists: use the override table - If no override: use the default table This allows you to inject custom data at any transformation step without modifying the SQL models. ## Debugging dbt ### View Compiled SQL After running, compiled SQL is available in: ``` /dbt/target/compiled/stride/models/ ``` ### Check Logs dbt logs are written to: ``` /stride.log ``` ### Run dbt Manually You can run dbt directly for debugging: ```bash cd /dbt dbt run --vars '{"scenario": "baseline", "country": "USA", "model_years": "(2025,2030)", "weather_year": 2019, "heating_threshold": 18, "cooling_threshold": 18, "use_ev_projection": false}' ``` ## Customizing Calculations ### Override a Calculated Table To replace a table with custom data: ```python from stride.models import CalculatedTableOverride project.override_calculated_tables([ CalculatedTableOverride( scenario="baseline", table_name="energy_intensity_parsed", filename="my_custom_intensity.parquet", ) ]) ``` ### Modify dbt Models For advanced customization, you can edit the SQL models directly: 1. Navigate to `/dbt/models/` 2. Edit the relevant `.sql` file 3. Run `project.compute_energy_projection()` to regenerate ## Related Topics - {ref}`data-validation` - How input data is validated before dbt runs - {ref}`customizing-checks` - Configuring validation behavior