CLI Reference

dsgrid

dsgrid commands

dsgrid [OPTIONS] COMMAND [ARGS]...

Options

-c, --console-level <console_level>

Console log level.

Default:

info

-f, --file-level <file_level>

File log level.

Default:

info

-l, --log-file <log_file>

Log to this file.

-n, --no-prompts

Do not prompt.

Default:

False

--offline, --online

Run registry commands in offline mode. WARNING: commands run in offline mode risk being out of sync with the latest dsgrid registry, and write commands will not be officially synced with the remote registry.

Default:

True

--timings, --no-timings

Enable tracking of function timings.

Default:

False

-N, --database-name <database_name>

Database name

-u, --url <url>

Database URL. Ex: http://localhost:8529

-U, --username <username>

Database username

-P, --password <password>

dsgrid registry password. Will prompt unless it is passed or the username matches the runtime config file.

-r, --reraise-exceptions

Re-raise any dsgrid exception. Default is to log the exception and exit.

Default:

False

-s, --scratch-dir <scratch_dir>

Base directory for dsgrid temporary directories. Must be accessible on all compute nodes. Defaults to the current directory.

Environment variables

DSGRID_REGISTRY_DATABASE_NAME

Provide a default for -N

DSGRID_REGISTRY_DATABASE_URL

Provide a default for -u
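
As a minimal sketch of how these variables might be used from a script, the Python snippet below builds an environment for a child process in which the two variables supply the defaults for -N and -u. The database name my-registry is a hypothetical placeholder, and the URL is the example from the option description above. The call that would launch dsgrid is left commented out so the snippet runs without dsgrid installed.

```python
import os

# Hypothetical values: substitute your own database name and URL.
registry_env = {
    "DSGRID_REGISTRY_DATABASE_NAME": "my-registry",
    "DSGRID_REGISTRY_DATABASE_URL": "http://localhost:8529",
}

# Copy the current environment and overlay the dsgrid defaults.
child_env = {**os.environ, **registry_env}

# With the variables set, -N and -u can be omitted from the command line.
command = ["dsgrid", "registry", "list"]
print(command, {name: child_env[name] for name in registry_env})

# To actually run the command (requires dsgrid to be installed):
# import subprocess
# subprocess.run(command, env=child_env, check=True)
```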

config

Config commands

dsgrid config [OPTIONS] COMMAND [ARGS]...

create

Create a local dsgrid runtime configuration file.

dsgrid config create [OPTIONS]

Options

--timings, --no-timings

Enable tracking of function timings.

Default:

False

-N, --database-name <database_name>

Database name

-u, --url <url>

Database URL. Ex: http://localhost:8529

-U, --username <username>

Database username

-P, --password <password>

Database password

-o, --offline

Run registry commands in offline mode. WARNING: commands run in offline mode risk being out of sync with the latest dsgrid registry, and write commands will not be officially synced with the remote registry.

Default:

False

--console-level <console_level>

Console log level.

Default:

info

--file-level <file_level>

File log level.

Default:

info

-r, --reraise-exceptions

Re-raise any dsgrid exception. Default is to log the exception and exit.

Default:

False

-s, --scratch-dir <scratch_dir>

Base directory for dsgrid temporary directories. Must be accessible on all compute nodes. Defaults to the current directory.

download

Download a dataset.

dsgrid download [OPTIONS] DATASET

Arguments

DATASET

Required argument

install-notebooks

Install dsgrid notebooks to a local path.

dsgrid install-notebooks [OPTIONS]

Options

-p, --path <path>

Path to install dsgrid notebooks.

Default:

The current user's home directory (rendered as /home/runner in this generated reference)

-f, --force

If true, overwrite existing files.

Default:

False

query

Query group commands

dsgrid query [OPTIONS] COMMAND [ARGS]...

composite-dataset

Composite dataset group commands

dsgrid query composite-dataset [OPTIONS] COMMAND [ARGS]...

create_dataset

Run a query to create a composite dataset.

dsgrid query composite-dataset create_dataset [OPTIONS] QUERY_DEFINITION_FILE

Options

-o, --output <output>

Output directory for query results

Default:

query_output

--load-cached-table, --no-load-cached-table

Try to load a cached table if one exists.

Default:

True

--force

Overwrite results directory if it exists.

Default:

False

Arguments

QUERY_DEFINITION_FILE

Required argument

run

Run a query on a composite dataset.

dsgrid query composite-dataset run [OPTIONS] QUERY_DEFINITION_FILE

Options

-o, --output <output>

Output directory for query results

Default:

query_output

--load-cached-table, --no-load-cached-table

Try to load a cached table if one exists.

Default:

True

--force

Overwrite results directory if it exists.

Default:

False

Arguments

QUERY_DEFINITION_FILE

Required argument

project

Project group commands

dsgrid query project [OPTIONS] COMMAND [ARGS]...

create

Create a default query file for a dsgrid project.

dsgrid query project create [OPTIONS] QUERY_NAME PROJECT_ID DATASET_ID

Options

-F, --filters <filters>

Add a dimension filter. Requires user customization.

Options:

expression | expression_raw | column_operator | between_column_operator | subset | supplemental_column_operator

-a, --aggregation-function <aggregation_function>

Aggregation function for any included default aggregations.

Default:

sum

-f, --query-file <query_file>

Query file to create.

Default:

query.json5

-r, --default-result-aggregation

Add default result aggregation.

Default:

False

--force

Overwrite query file if it exists.

Default:

False

--remote-path <remote_path>

Path to dsgrid remote registry

Default:

s3://nrel-dsgrid-registry

Arguments

QUERY_NAME

Required argument

PROJECT_ID

Required argument

DATASET_ID

Required argument

Examples:

$ dsgrid query project create my_query_result_name my_project_id my_dataset_id

$ dsgrid query project create --default-result-aggregation my_query_result_name my_project_id my_dataset_id

create-derived-dataset-config

Create a derived dataset configuration and dimensions from a query result.

dsgrid query project create-derived-dataset-config [OPTIONS] SRC DST

Options

--remote-path <remote_path>

Path to dsgrid remote registry

Default:

s3://nrel-dsgrid-registry

--force

Overwrite results directory if it exists.

Default:

False

Arguments

SRC

Required argument

DST

Required argument

Examples:

$ dsgrid query project create-derived-dataset-config query_output/my_query_result_name my_dataset_config

run

Run a query on a dsgrid project.

dsgrid query project run [OPTIONS] QUERY_DEFINITION_FILE

Options

--persist-intermediate-table, --no-persist-intermediate-table

Persist the intermediate table to the filesystem to allow for reuse.

Default:

True

-z, --zip-file

Create a zip file containing all output files.

Default:

False

--remote-path <remote_path>

Path to dsgrid remote registry

Default:

s3://nrel-dsgrid-registry

-o, --output <output>

Output directory for query results

Default:

query_output

--load-cached-table, --no-load-cached-table

Try to load a cached table if one exists.

Default:

True

--force

Overwrite results directory if it exists.

Default:

False

Arguments

QUERY_DEFINITION_FILE

Required argument

Examples:

$ dsgrid query project run query.json5
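
The create, run, and create-derived-dataset-config commands above can be chained. The sketch below only assembles and prints the three command lines; it does not run dsgrid. It assumes that query results are written to a subdirectory of the output directory named after the query, as the create-derived-dataset-config example suggests. The identifiers are the placeholder names used in the examples above.

```python
import shlex
from pathlib import PurePosixPath

query_name = "my_query_result_name"
query_file = "query.json5"                 # default of -f/--query-file
output_dir = PurePosixPath("query_output")  # default of -o/--output

# Assumption: results land in <output_dir>/<query_name>.
result_dir = output_dir / query_name

steps = [
    # 1. Create a default query file for the project and dataset.
    ["dsgrid", "query", "project", "create", "-f", query_file,
     query_name, "my_project_id", "my_dataset_id"],
    # 2. Run the query, writing results under the output directory.
    ["dsgrid", "query", "project", "run", "-o", str(output_dir), query_file],
    # 3. Derive a dataset config from the query result.
    ["dsgrid", "query", "project", "create-derived-dataset-config",
     str(result_dir), "my_dataset_config"],
]

for step in steps:
    print(shlex.join(step))
```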

validate

dsgrid query project validate [OPTIONS] QUERY_FILE

Arguments

QUERY_FILE

Required argument

registry

Manage a registry.

dsgrid registry [OPTIONS] COMMAND [ARGS]...

Options

--remote-path <remote_path>

Path to dsgrid remote registry

Default:

s3://nrel-dsgrid-registry

bulk-register

Bulk register projects, datasets, and their dimensions. If any failure occurs, the code records successfully registered project and dataset IDs to a journal file and prints its filename to the console. Users can pass that filename with the --journal-file option to avoid registering those projects and datasets on subsequent attempts.

The JSON/JSON5 filename must match the data model defined by this documentation:

https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.registration_models.RegistrationModel

dsgrid registry bulk-register [OPTIONS] REGISTRATION_FILE

Options

-d, --base-data-dir <base_data_dir>

Base directory for input data. If set, and if the dataset paths are relative, prepend them with this path.

-r, --base-repo-dir <base_repo_dir>

Base directory for dsgrid project/dataset repository. If set, and if the config file paths are relative, prepend them with this path.

-j, --journal-file <journal_file>

Journal file created by a previous bulk register operation. If passed, the code will read it and skip all projects and datasets that were successfully registered. The file will be updated with IDs that are successfully registered.

Arguments

REGISTRATION_FILE

Required argument

Examples:

$ dsgrid registry bulk-register registration.json5

$ dsgrid registry bulk-register registration.json5 -j journal__11f733f6-ac9b-4f70-ad4b-df75b291f150.json5
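
The retry pattern described above (first attempt without a journal, later attempts with the journal file that the failed run reported) might be wrapped as follows. This is a sketch that only builds the argument lists. The helper name bulk_register_command is hypothetical, and the journal filename is the one from the example.

```python
import shlex

def bulk_register_command(registration_file, journal_file=None):
    """Build the argv list for `dsgrid registry bulk-register`."""
    command = ["dsgrid", "registry", "bulk-register", registration_file]
    if journal_file is not None:
        # Skip projects and datasets that a previous attempt registered.
        command += ["-j", journal_file]
    return command

first_attempt = bulk_register_command("registration.json5")
retry = bulk_register_command(
    "registration.json5",
    journal_file="journal__11f733f6-ac9b-4f70-ad4b-df75b291f150.json5",
)

print(shlex.join(first_attempt))
print(shlex.join(retry))
```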

data-sync

Sync the official dsgrid registry data to the local system.

dsgrid registry data-sync [OPTIONS]

Options

-P, --project-id <project_id>

Sync latest dataset(s) version based on Project ID

-D, --dataset-id <dataset_id>

Sync latest dataset version based on Dataset ID

datasets

Dataset subcommands

dsgrid registry datasets [OPTIONS] COMMAND [ARGS]...

dump

Dump a dataset config file from the registry.

dsgrid registry datasets dump [OPTIONS] DATASET_ID

Options

-v, --version <version>

Version to dump; defaults to latest

-d, --directory <directory>

Directory in which to create the config file

--force

Overwrite files if they exist.

Default:

False

Arguments

DATASET_ID

Required argument

Examples:

$ dsgrid registry datasets dump my-dataset-id

list

List the registered datasets.

dsgrid registry datasets list [OPTIONS]

Options

-f, --filter <filter>

Filter table with a case-insensitive expression in the format ‘column operation value’, accepts multiple flags

valid operations: [‘==’, ‘!=’, ‘contains’, ‘not contains’]

Examples:

$ dsgrid registry datasets list

$ dsgrid registry datasets list -f "ID contains com" -f "Submitter == username"
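
Each filter is a single argument of the form 'column operation value', and the flag can be repeated. The sketch below shows one way to assemble such arguments from a script while rejecting operations outside the documented list. The helper name filter_args is hypothetical; the column names and values are taken from the example above.

```python
import shlex

# The operations listed in the option description above.
VALID_OPERATIONS = ("==", "!=", "contains", "not contains")

def filter_args(*filters):
    """Turn (column, operation, value) triples into repeated -f arguments."""
    args = []
    for column, operation, value in filters:
        if operation not in VALID_OPERATIONS:
            raise ValueError(f"unsupported operation: {operation!r}")
        # Each expression must reach dsgrid as one argument.
        args += ["-f", f"{column} {operation} {value}"]
    return args

command = ["dsgrid", "registry", "datasets", "list"] + filter_args(
    ("ID", "contains", "com"),
    ("Submitter", "==", "username"),
)

# shlex.join quotes each expression so it survives shell word splitting.
print(shlex.join(command))
```

The same helper applies to the other list commands in this reference that accept -f with the same expression format.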

register

Register a new dataset with the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dataset.html#dsgrid.config.dataset_config.DatasetConfigModel

dsgrid registry datasets register [OPTIONS] DATASET_CONFIG_FILE DATASET_PATH

Options

-l, --log-message <log_message>

Required reason for submission

Arguments

DATASET_CONFIG_FILE

Required argument

DATASET_PATH

Required argument

Examples:

$ dsgrid registry datasets register dataset.json5 path/to/my/dataset -l "Register dataset my-dataset-id."

update

Update an existing dataset in the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dataset.html#dsgrid.config.dataset_config.DatasetConfigModel

dsgrid registry datasets update [OPTIONS] DATASET_CONFIG_FILE

Options

-d, --dataset-id <dataset_id>

Required dataset ID

-l, --log-message <log_message>

Required reason for submission

-t, --update-type <update_type>

Required

Options:

major | minor | patch

-v, --version <version>

Required Version to update; must be the current version.

Arguments

DATASET_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry datasets update \
    -d my-dataset-id \
    -l "Update the description for dataset my-dataset-id." \
    -t patch \
    -v 1.0.0 \
    dataset.json5
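
Because the update command takes several required options, a script may want to assemble and check them in one place. The sketch below uses the option letters from the Options list above (-d, -l, -t, -v) and only builds the argument list. The helper name dataset_update_command is hypothetical.

```python
import shlex

# The choices listed for -t/--update-type.
UPDATE_TYPES = ("major", "minor", "patch")

def dataset_update_command(config_file, dataset_id, log_message,
                           update_type, version):
    """Build the argv list for `dsgrid registry datasets update`."""
    if update_type not in UPDATE_TYPES:
        raise ValueError(f"update type must be one of {UPDATE_TYPES}")
    return [
        "dsgrid", "registry", "datasets", "update",
        "-d", dataset_id,
        "-l", log_message,
        "-t", update_type,
        "-v", version,  # must be the current registered version
        config_file,
    ]

command = dataset_update_command(
    "dataset.json5",
    dataset_id="my-dataset-id",
    log_message="Update the description for dataset my-dataset-id.",
    update_type="patch",
    version="1.0.0",
)
print(shlex.join(command))
```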

dimension-mappings

Dimension mapping subcommands

dsgrid registry dimension-mappings [OPTIONS] COMMAND [ARGS]...

dump

Dump a dimension mapping config file (and any related data) from the registry.

dsgrid registry dimension-mappings dump [OPTIONS] DIMENSION_MAPPING_ID

Options

-v, --version <version>

Version to dump; defaults to latest

-d, --directory <directory>

Directory in which to create config and data files

--force

Overwrite files if they exist.

Default:

False

Arguments

DIMENSION_MAPPING_ID

Required argument

Examples:

$ dsgrid registry dimension-mappings dump 17565575

list

List the registered dimension mappings.

dsgrid registry dimension-mappings list [OPTIONS]

Options

-f, --filter <filter>

Filter table with a case-insensitive expression in the format ‘column operation value’, accepts multiple flags

valid operations: [‘==’, ‘!=’, ‘contains’, ‘not contains’]

Examples:

$ dsgrid registry dimension-mappings list

$ dsgrid registry dimension-mappings list -f "Type [From, To] contains geography" -f "Submitter == username"

register

Register new dimension mappings with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel

dsgrid registry dimension-mappings register [OPTIONS]
                                            DIMENSION_MAPPING_CONFIG_FILE

Options

-l, --log-message <log_message>

Required reason for submission

Arguments

DIMENSION_MAPPING_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry dimension-mappings register -l "Register dimension mappings for my-project" dimension_mappings.json5

update

Update an existing dimension mapping in the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.mapping_tables.MappingTableModel

dsgrid registry dimension-mappings update [OPTIONS]
                                          DIMENSION_MAPPING_CONFIG_FILE

Options

-d, --dimension-mapping-id <dimension_mapping_id>

Required dimension mapping ID

-l, --log-message <log_message>

Required reason for submission

-t, --update-type <update_type>

Required

Options:

major | minor | patch

-v, --version <version>

Required Version to update; must be the current version.

Arguments

DIMENSION_MAPPING_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry dimension-mappings update \
    -d 17565575 \
    -l "Swap out the state to county mapping for my-dataset to that-project" \
    -t major \
    -v 1.0.0 \
    dimension_mappings.json5

dimensions

Dimension subcommands

dsgrid registry dimensions [OPTIONS] COMMAND [ARGS]...

dump

Dump a dimension config file (and any related data) from the registry.

dsgrid registry dimensions dump [OPTIONS] DIMENSION_ID

Options

-v, --version <version>

Version to dump; defaults to latest

-d, --directory <directory>

Directory in which to create config and data files

--force

Overwrite files if they exist.

Default:

False

Arguments

DIMENSION_ID

Required argument

Examples:

$ dsgrid registry dimensions dump 17565829

list

List the registered dimensions.

dsgrid registry dimensions list [OPTIONS]

Options

-f, --filter <filter>

Filter table with a case-insensitive expression in the format ‘column operation value’, accepts multiple flags

valid operations: [‘==’, ‘!=’, ‘contains’, ‘not contains’]

Examples:

$ dsgrid registry dimensions list

$ dsgrid registry dimensions list -f "Type == sector"

$ dsgrid registry dimensions list -f "Submitter == username"

register

Register new dimensions with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension.html#dsgrid.config.dimensions.DimensionsConfigModel

dsgrid registry dimensions register [OPTIONS] DIMENSION_CONFIG_FILE

Options

-l, --log-message <log_message>

Required reason for submission

Arguments

DIMENSION_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry dimensions register -l "Register dimensions for my-project" dimensions.json5

update

Update an existing dimension in the registry.

dsgrid registry dimensions update [OPTIONS] DIMENSION_CONFIG_FILE

Options

-d, --dimension-id <dimension_id>

Required dimension ID

-l, --log-message <log_message>

Required reason for submission

-t, --update-type <update_type>

Required

Options:

major | minor | patch

-v, --version <version>

Required Version to update; must be the current version.

Arguments

DIMENSION_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry dimensions update -d 17565829 -l "Update county dimension" -t major -v 1.0.0 dimension.json5
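
The update types major, minor, and patch suggest semantic versioning. This reference does not state the resulting version numbers, so the sketch below shows what each update type would produce under the usual semantic versioning convention, which is an assumption here. The function bump is illustrative and not part of dsgrid.

```python
def bump(version, update_type):
    """Return the next version under conventional semantic versioning.

    Assumption: dsgrid follows the usual rule that a bump resets the
    lower-order components to zero.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if update_type == "major":
        return f"{major + 1}.0.0"
    if update_type == "minor":
        return f"{major}.{minor + 1}.0"
    if update_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown update type: {update_type!r}")

for update_type in ("major", "minor", "patch"):
    print(update_type, bump("1.0.0", update_type))
```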

list

List the contents of a registry.

dsgrid registry list [OPTIONS]

projects

Project subcommands

dsgrid registry projects [OPTIONS] COMMAND [ARGS]...

add-dataset-requirements

Add requirements for one or more datasets to a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.input_dataset_requirements.InputDatasetListModel

dsgrid registry projects add-dataset-requirements [OPTIONS] PROJECT_ID
                                                  FILENAME

Options

-l, --log-message <log_message>

Required Please specify the reason for the new datasets.

Arguments

PROJECT_ID

Required argument

FILENAME

Required argument

Examples:

$ dsgrid registry projects add-dataset-requirements \
    -l "Add requirements for dataset my-dataset-id to my-project-id." \
    my-project-id \
    dataset_requirements.json5

dump

Dump a project config file from the registry.

dsgrid registry projects dump [OPTIONS] PROJECT_ID

Options

-v, --version <version>

Version to dump; defaults to latest

-d, --directory <directory>

Directory in which to create the config file

--force

Overwrite files if they exist.

Default:

False

Arguments

PROJECT_ID

Required argument

Examples:

$ dsgrid registry projects dump my-project-id

list

List the registered projects.

dsgrid registry projects list [OPTIONS]

Options

-f, --filter <filter>

Filter table with a case-insensitive expression in the format ‘column operation value’, accepts multiple flags

valid operations: [‘==’, ‘!=’, ‘contains’, ‘not contains’]

Examples:

$ dsgrid registry projects list

$ dsgrid registry projects list -f "ID contains efs"

list-dimension-query-names

List the project’s dimension query names.

dsgrid registry projects list-dimension-query-names [OPTIONS] PROJECT_ID

Options

-b, --exclude-base

Exclude base dimension query names.

Default:

False

-S, --exclude-subset

Exclude subset dimension query names.

Default:

False

-s, --exclude-supplemental

Exclude supplemental dimension query names.

Default:

False

Arguments

PROJECT_ID

Required argument

Examples:

$ dsgrid registry projects list-dimension-query-names my_project_id

$ dsgrid registry projects list-dimension-query-names --exclude-subset my_project_id

$ dsgrid registry projects list-dimension-query-names --exclude-supplemental my_project_id

register

Register a new project with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.project_config.ProjectConfigModel

dsgrid registry projects register [OPTIONS] PROJECT_CONFIG_FILE

Options

-l, --log-message <log_message>

Required reason for submission

Arguments

PROJECT_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry projects register -l "Register project my-project" project.json5

register-and-submit-dataset

Register a dataset and then submit it to a dsgrid project.

dsgrid registry projects register-and-submit-dataset [OPTIONS]

Options

-c, --dataset-config-file <dataset_config_file>

Required Dataset config file

-d, --dataset-path <dataset_path>

Required Path to directory containing load data (Parquet) files.

-m, --dimension-mapping-file <dimension_mapping_file>

Dimension mapping file. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel

-r, --dimension-mapping-references-file <dimension_mapping_references_file>

Dimension mapping references file. Mutually exclusive with dimension_mapping_file. Use it when the mappings are already registered. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mapping_base.DimensionMappingReferenceListModel

-a, --autogen-reverse-supplemental-mappings <autogen_reverse_supplemental_mappings>

For any dimension listed here, if the dataset’s dimension is a project’s supplemental dimension and no mapping is provided, create a reverse mapping from that supplemental dimension.

Options:

metric | geography | sector | subsector | time | weather_year | model_year | scenario

-p, --project-id <project_id>

Required project identifier

-l, --log-message <log_message>

Required reason for submission

Examples:

$ dsgrid registry projects register-and-submit-dataset \
    -c dataset.json5 \
    -d path/to/my/dataset \
    -p my-project-id \
    -m dimension_mappings.json5 \
    -l "Register and submit dataset my-dataset to project my-project."
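
The -m and -r options are documented above as mutually exclusive. A script that assembles this command could enforce that before calling dsgrid, as in the sketch below. The helper name mapping_args is hypothetical, and the snippet only builds the argument list.

```python
import shlex

def mapping_args(mapping_file=None, references_file=None):
    """Return the -m or -r arguments, allowing at most one of the two."""
    if mapping_file is not None and references_file is not None:
        raise ValueError(
            "pass a dimension mapping file or a references file, not both"
        )
    if mapping_file is not None:
        return ["-m", mapping_file]
    if references_file is not None:
        # Use this form when the mappings are already registered.
        return ["-r", references_file]
    return []

command = [
    "dsgrid", "registry", "projects", "register-and-submit-dataset",
    "-c", "dataset.json5",
    "-d", "path/to/my/dataset",
    "-p", "my-project-id",
    *mapping_args(mapping_file="dimension_mappings.json5"),
    "-l", "Register and submit dataset my-dataset to project my-project.",
]
print(shlex.join(command))
```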

register-subset-dimensions

Register new subset dimensions with a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation:

https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.project_config.SubsetDimensionGroupListModel

dsgrid registry projects register-subset-dimensions [OPTIONS] PROJECT_ID
                                                    FILENAME

Options

-l, --log-message <log_message>

Required Please specify the reason for this addition.

Arguments

PROJECT_ID

Required argument

FILENAME

Required argument

Examples:

$ dsgrid registry projects register-subset-dimensions \
    -l "Register subset dimensions for end uses by fuel type for my-project-id." \
    my-project-id \
    subset_dimensions.json5

register-supplemental-dimensions

Register new supplemental dimensions with a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.supplemental_dimension.SupplementalDimensionsListModel

dsgrid registry projects register-supplemental-dimensions [OPTIONS] PROJECT_ID
                                                          FILENAME

Options

-l, --log-message <log_message>

Required Please specify the reason for this addition.

Arguments

PROJECT_ID

Required argument

FILENAME

Required argument

Examples:

$ dsgrid registry projects register-supplemental-dimensions \
    -l "Register states supplemental dimension for my-project-id" \
    my-project-id \
    supplemental_dimensions.json5

replace-dataset-dimension-requirements

Replace dimension requirements for one or more datasets in a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation:

https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.input_dataset_requirements.InputDatasetDimensionRequirementsListModel

dsgrid registry projects replace-dataset-dimension-requirements 
    [OPTIONS] PROJECT_ID FILENAME

Options

-l, --log-message <log_message>

Required Please specify the reason for the new requirements.

Arguments

PROJECT_ID

Required argument

FILENAME

Required argument

Examples:

$ dsgrid registry projects replace-dataset-dimension-requirements \
    -l "Replace dimension requirements for dataset my-dataset-id in my-project-id." \
    my-project-id \
    dataset_dimension_requirements.json5

submit-dataset

Submit a dataset to a dsgrid project.

dsgrid registry projects submit-dataset [OPTIONS]

Options

-d, --dataset-id <dataset_id>

Required dataset identifier

-p, --project-id <project_id>

Required project identifier

-m, --dimension-mapping-file <dimension_mapping_file>

Dimension mapping file. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel

-r, --dimension-mapping-references-file <dimension_mapping_references_file>

Dimension mapping references file. Mutually exclusive with dimension_mapping_file. Use it when the mappings are already registered. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mapping_base.DimensionMappingReferenceListModel

-a, --autogen-reverse-supplemental-mappings <autogen_reverse_supplemental_mappings>

For any dimension listed here, if the dataset’s dimension is a project’s supplemental dimension and no mapping is provided, create a reverse mapping from that supplemental dimension.

Options:

metric | geography | sector | subsector | time | weather_year | model_year | scenario

-l, --log-message <log_message>

Required reason for submission

Examples:

$ dsgrid registry projects submit-dataset \
    -p my-project-id \
    -d my-dataset-id \
    -m dimension_mappings.json5 \
    -l "Submit dataset my-dataset to project my-project."

update

Update an existing project in the registry.

dsgrid registry projects update [OPTIONS] PROJECT_CONFIG_FILE

Options

-p, --project-id <project_id>

Required project ID

-l, --log-message <log_message>

Required reason for submission

-t, --update-type <update_type>

Required

Options:

major | minor | patch

-v, --version <version>

Required Version to update; must be the current version.

Arguments

PROJECT_CONFIG_FILE

Required argument

Examples:

$ dsgrid registry projects update \
    -p my-project-id \
    -t patch \
    -v 1.5.0 \
    -l "Update description for project my-project-id." \
    project.json5