CLI Reference¶
dsgrid¶
dsgrid commands
dsgrid [OPTIONS] COMMAND [ARGS]...
Options
- -c, --console-level <console_level>¶
Console log level.
- Default:
info
- -f, --file-level <file_level>¶
File log level.
- Default:
info
- -l, --log-file <log_file>¶
Log to this file.
- -n, --no-prompts¶
Do not prompt.
- Default:
False
- --offline, --online¶
Run registry commands in offline mode. WARNING: commands run in offline mode risk being out of sync with the latest dsgrid registry, and write commands will not be officially synced with the remote registry.
- Default:
True
- --timings, --no-timings¶
Enable tracking of function timings.
- Default:
False
- -N, --database-name <database_name>¶
Database name
- -u, --url <url>¶
Database URL. Ex: http://localhost:8529
- -U, --username <username>¶
Database username
- -P, --password <password>¶
dsgrid registry password. Will prompt unless it is passed or the username matches the runtime config file.
- -r, --reraise-exceptions¶
Re-raise any dsgrid exception. Default is to log the exception and exit.
- Default:
False
- -s, --scratch-dir <scratch_dir>¶
Base directory for dsgrid temporary directories. Must be accessible on all compute nodes. Defaults to the current directory.
Environment variables
- DSGRID_REGISTRY_DATABASE_NAME
Provide a default for
-N
- DSGRID_REGISTRY_DATABASE_URL
Provide a default for
-u
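As a rough sketch, the two variables can be exported once per shell session so that -N and -u do not have to be repeated on every command. The database name below is a made-up placeholder, and the URL is simply the example value from the --url help text.

```shell
# Placeholder values: "my-dsgrid-registry" is hypothetical and the URL is
# the example from the --url help text. Substitute your own settings.
export DSGRID_REGISTRY_DATABASE_NAME="my-dsgrid-registry"
export DSGRID_REGISTRY_DATABASE_URL="http://localhost:8529"

# With both variables set, these two invocations are expected to be
# equivalent (shown as comments so this snippet does not contact a registry):
#   dsgrid registry projects list
#   dsgrid -N my-dsgrid-registry -u http://localhost:8529 registry projects list
echo "registry database: $DSGRID_REGISTRY_DATABASE_NAME at $DSGRID_REGISTRY_DATABASE_URL"
```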
config¶
Config commands
dsgrid config [OPTIONS] COMMAND [ARGS]...
create¶
Create a local dsgrid runtime configuration file.
dsgrid config create [OPTIONS]
Options
- --timings, --no-timings¶
Enable tracking of function timings.
- Default:
False
- -N, --database-name <database_name>¶
Database name
- -u, --url <url>¶
Database URL. Ex: http://localhost:8529
- -U, --username <username>¶
Database username
- -P, --password <password>¶
Database password
- -o, --offline¶
Run registry commands in offline mode. WARNING: commands run in offline mode risk being out of sync with the latest dsgrid registry, and write commands will not be officially synced with the remote registry.
- Default:
False
- --console-level <console_level>¶
Console log level.
- Default:
info
- --file-level <file_level>¶
File log level.
- Default:
info
- -r, --reraise-exceptions¶
Re-raise any dsgrid exception. Default is to log the exception and exit.
- Default:
False
- -s, --scratch-dir <scratch_dir>¶
Base directory for dsgrid temporary directories. Must be accessible on all compute nodes. Defaults to the current directory.
download¶
Download a dataset.
dsgrid download [OPTIONS] DATASET
Arguments
- DATASET¶
Required argument
install-notebooks¶
Install dsgrid notebooks to a local path.
dsgrid install-notebooks [OPTIONS]
Options
- -p, --path <path>¶
Path to install dsgrid notebooks.
- Default:
/home/runner
- -f, --force¶
If true, overwrite existing files.
- Default:
False
query¶
Query group commands
dsgrid query [OPTIONS] COMMAND [ARGS]...
composite-dataset¶
Composite dataset group commands
dsgrid query composite-dataset [OPTIONS] COMMAND [ARGS]...
create_dataset¶
Run a query to create a composite dataset.
dsgrid query composite-dataset create_dataset [OPTIONS] QUERY_DEFINITION_FILE
Options
- -o, --output <output>¶
Output directory for query results
- Default:
query_output
- --load-cached-table, --no-load-cached-table¶
Try to load a cached table if one exists.
- Default:
True
- --force¶
Overwrite results directory if it exists.
- Default:
False
Arguments
- QUERY_DEFINITION_FILE¶
Required argument
run¶
Run a query on a composite dataset.
dsgrid query composite-dataset run [OPTIONS] QUERY_DEFINITION_FILE
Options
- -o, --output <output>¶
Output directory for query results
- Default:
query_output
- --load-cached-table, --no-load-cached-table¶
Try to load a cached table if one exists.
- Default:
True
- --force¶
Overwrite results directory if it exists.
- Default:
False
Arguments
- QUERY_DEFINITION_FILE¶
Required argument
project¶
Project group commands
dsgrid query project [OPTIONS] COMMAND [ARGS]...
create¶
Create a default query file for a dsgrid project.
dsgrid query project create [OPTIONS] QUERY_NAME PROJECT_ID DATASET_ID
Options
- -F, --filters <filters>¶
Add a dimension filter. Requires user customization.
- Options:
expression | expression_raw | column_operator | between_column_operator | subset | supplemental_column_operator
- -a, --aggregation-function <aggregation_function>¶
Aggregation function for any included default aggregations.
- Default:
sum
- -f, --query-file <query_file>¶
Query file to create.
- Default:
query.json5
- -r, --default-result-aggregation¶
Add default result aggregation.
- Default:
False
- --force¶
Overwrite query file if it exists.
- Default:
False
- --remote-path <remote_path>¶
Path to dsgrid remote registry
- Default:
s3://nrel-dsgrid-registry
Arguments
- QUERY_NAME¶
Required argument
- PROJECT_ID¶
Required argument
- DATASET_ID¶
Required argument
Examples:
$ dsgrid query project create my_query_result_name my_project_id my_dataset_id
$ dsgrid query project create --default-result-aggregation my_query_result_name my_project_id my_dataset_id
create-derived-dataset-config¶
Create a derived dataset configuration and dimensions from a query result.
dsgrid query project create-derived-dataset-config [OPTIONS] SRC DST
Options
- --remote-path <remote_path>¶
Path to dsgrid remote registry
- Default:
s3://nrel-dsgrid-registry
- --force¶
Overwrite results directory if it exists.
- Default:
False
Arguments
- SRC¶
Required argument
- DST¶
Required argument
Examples:
$ dsgrid query project create-derived-dataset-config query_output/my_query_result_name my_dataset_config
run¶
Run a query on a dsgrid project.
dsgrid query project run [OPTIONS] QUERY_DEFINITION_FILE
Options
- --persist-intermediate-table, --no-persist-intermediate-table¶
Persist the intermediate table to the filesystem to allow for reuse.
- Default:
True
- -z, --zip-file¶
Create a zip file containing all output files.
- Default:
False
- --remote-path <remote_path>¶
Path to dsgrid remote registry
- Default:
s3://nrel-dsgrid-registry
- -o, --output <output>¶
Output directory for query results
- Default:
query_output
- --load-cached-table, --no-load-cached-table¶
Try to load a cached table if one exists.
- Default:
True
- --force¶
Overwrite results directory if it exists.
- Default:
False
Arguments
- QUERY_DEFINITION_FILE¶
Required argument
Examples:
$ dsgrid query project run query.json5
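The commands in this group are typically chained: create a query file, validate it, run it, and optionally derive a dataset config from the result. The sketch below only prints that sequence rather than executing it. The helper name query_workflow and the target name my_dataset_config are illustrative, and the assumption that results land in a directory named after the query under the output directory is inferred from the create-derived-dataset-config example, not confirmed here.

```shell
# Print (not run) a create -> validate -> run -> derive command sequence.
# query_workflow is a hypothetical helper, not part of dsgrid.
query_workflow() {
    query_name="$1"
    project_id="$2"
    dataset_id="$3"
    query_file="query.json5"   # default value of --query-file
    output_dir="query_output"  # default value of --output
    echo "dsgrid query project create $query_name $project_id $dataset_id"
    echo "dsgrid query project validate $query_file"
    echo "dsgrid query project run $query_file"
    # Assumption: results are written to <output_dir>/<query_name>.
    echo "dsgrid query project create-derived-dataset-config $output_dir/$query_name my_dataset_config"
}

query_workflow my_query_result_name my_project_id my_dataset_id
```

Printing the commands first makes it easy to review the sequence before running any step against a real registry.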
validate¶
dsgrid query project validate [OPTIONS] QUERY_FILE
Arguments
- QUERY_FILE¶
Required argument
registry¶
Manage a registry.
dsgrid registry [OPTIONS] COMMAND [ARGS]...
Options
- --remote-path <remote_path>¶
Path to dsgrid remote registry
- Default:
s3://nrel-dsgrid-registry
bulk-register¶
Bulk register projects, datasets, and their dimensions. If any failure occurs, the code records successfully registered project and dataset IDs to a journal file and prints its filename to the console. Users can pass that filename with the --journal-file option to avoid registering those projects and datasets on subsequent attempts.
The contents of the JSON/JSON5 file must match the data model defined by this documentation:
dsgrid registry bulk-register [OPTIONS] REGISTRATION_FILE
Options
- -d, --base-data-dir <base_data_dir>¶
Base directory for input data. If set, and if the dataset paths are relative, prepend them with this path.
- -r, --base-repo-dir <base_repo_dir>¶
Base directory for dsgrid project/dataset repository. If set, and if the config file paths are relative, prepend them with this path.
- -j, --journal-file <journal_file>¶
Journal file created by a previous bulk register operation. If passed, the code will read it and skip all projects and datasets that were successfully registered. The file will be updated with IDs that are successfully registered.
Arguments
- REGISTRATION_FILE¶
Required argument
Examples:
$ dsgrid registry bulk-register registration.json5
$ dsgrid registry bulk-register registration.json5 -j journal__11f733f6-ac9b-4f70-ad4b-df75b291f150.json5
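The retry behavior described above can be sketched as a small wrapper that adds -j only when a journal file from an earlier failed attempt is available. The function name bulk_register_cmd is hypothetical, and it prints the command line rather than running it; the journal filename is whatever the failed run printed to the console.

```shell
# Build the bulk-register command line, adding -j only when a journal
# file from a previous failed attempt is supplied.
# bulk_register_cmd is an illustrative helper, not part of dsgrid.
bulk_register_cmd() {
    registration_file="$1"
    journal_file="${2:-}"
    if [ -n "$journal_file" ]; then
        echo "dsgrid registry bulk-register $registration_file -j $journal_file"
    else
        echo "dsgrid registry bulk-register $registration_file"
    fi
}

# First attempt: no journal exists yet.
bulk_register_cmd registration.json5
# Retry: pass the journal filename reported by the failed run.
bulk_register_cmd registration.json5 journal__11f733f6-ac9b-4f70-ad4b-df75b291f150.json5
```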
data-sync¶
Sync the official dsgrid registry data to the local system.
dsgrid registry data-sync [OPTIONS]
Options
- -P, --project-id <project_id>¶
Sync latest dataset(s) version based on Project ID
- -D, --dataset-id <dataset_id>¶
Sync latest dataset version based on Dataset ID
datasets¶
Dataset subcommands
dsgrid registry datasets [OPTIONS] COMMAND [ARGS]...
dump¶
Dump a dataset config file from the registry.
dsgrid registry datasets dump [OPTIONS] DATASET_ID
Options
- -v, --version <version>¶
Version to dump; defaults to latest
- -d, --directory <directory>¶
Directory in which to create the config file
- --force¶
Overwrite files if they exist.
- Default:
False
Arguments
- DATASET_ID¶
Required argument
Examples:
$ dsgrid registry datasets dump my-dataset-id
list¶
List the registered datasets.
dsgrid registry datasets list [OPTIONS]
Options
- -f, --filter <filter>¶
Filter the table with a case-insensitive expression in the format 'column operation value'. May be passed multiple times.
Valid operations: ['==', '!=', 'contains', 'not contains']
Examples:
$ dsgrid registry datasets list
$ dsgrid registry datasets list -f "ID contains com" -f "Submitter == username"
register¶
Register a new dataset with the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dataset.html#dsgrid.config.dataset_config.DatasetConfigModel
dsgrid registry datasets register [OPTIONS] DATASET_CONFIG_FILE DATASET_PATH
Options
- -l, --log-message <log_message>¶
Required reason for submission
Arguments
- DATASET_CONFIG_FILE¶
Required argument
- DATASET_PATH¶
Required argument
Examples:
$ dsgrid registry datasets register dataset.json5 path/to/my/dataset -l "Register dataset my-dataset-id."
update¶
Update an existing dataset in the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dataset.html#dsgrid.config.dataset_config.DatasetConfigModel
dsgrid registry datasets update [OPTIONS] DATASET_CONFIG_FILE
Options
- -d, --dataset-id <dataset_id>¶
Required dataset ID
- -l, --log-message <log_message>¶
Required reason for submission
- -t, --update-type <update_type>¶
Required
- Options:
major | minor | patch
- -v, --version <version>¶
Required Version to update; must be the current version.
Arguments
- DATASET_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry datasets update \
    -d my-dataset-id \
    -l "Update the description for dataset my-dataset-id." \
    -t patch \
    -v 1.0.0 \
    dataset.json5
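This reference does not spell out how each update type changes the version. Assuming the registry follows the usual semantic-versioning convention, the effect of major, minor, and patch on the current version could be sketched as follows. The next_version helper is purely illustrative and is not part of dsgrid.

```shell
# next_version CURRENT UPDATE_TYPE
# Prints the version that would follow CURRENT under conventional
# semantic-versioning rules (an assumption, not confirmed dsgrid behavior).
next_version() {
    # Split "X.Y.Z" on dots into three fields.
    IFS=. read -r major minor patch <<EOF
$1
EOF
    case "$2" in
        major) echo "$((major + 1)).0.0" ;;
        minor) echo "${major}.$((minor + 1)).0" ;;
        patch) echo "${major}.${minor}.$((patch + 1))" ;;
        *) echo "unknown update type: $2" >&2; return 1 ;;
    esac
}

next_version 1.0.0 patch
```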
dimension-mappings¶
Dimension mapping subcommands
dsgrid registry dimension-mappings [OPTIONS] COMMAND [ARGS]...
dump¶
Dump a dimension mapping config file (and any related data) from the registry.
dsgrid registry dimension-mappings dump [OPTIONS] DIMENSION_MAPPING_ID
Options
- -v, --version <version>¶
Version to dump; defaults to latest
- -d, --directory <directory>¶
Directory in which to create config and data files
- --force¶
Overwrite files if they exist.
- Default:
False
Arguments
- DIMENSION_MAPPING_ID¶
Required argument
Examples:
$ dsgrid registry dimension-mappings dump 17565575
list¶
List the registered dimension mappings.
dsgrid registry dimension-mappings list [OPTIONS]
Options
- -f, --filter <filter>¶
Filter the table with a case-insensitive expression in the format 'column operation value'. May be passed multiple times.
Valid operations: ['==', '!=', 'contains', 'not contains']
Examples:
$ dsgrid registry dimension-mappings list
$ dsgrid registry dimension-mappings list -f "Type [From, To] contains geography" -f "Submitter == username"
register¶
Register new dimension mappings with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel
dsgrid registry dimension-mappings register [OPTIONS]
DIMENSION_MAPPING_CONFIG_FILE
Options
- -l, --log-message <log_message>¶
Required reason for submission
Arguments
- DIMENSION_MAPPING_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry dimension-mappings register -l "Register dimension mappings for my-project" dimension_mappings.json5
update¶
Update an existing dimension mapping in the registry. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.mapping_tables.MappingTableModel
dsgrid registry dimension-mappings update [OPTIONS]
DIMENSION_MAPPING_CONFIG_FILE
Options
- -d, --dimension-mapping-id <dimension_mapping_id>¶
Required dimension mapping ID
- -l, --log-message <log_message>¶
Required reason for submission
- -t, --update-type <update_type>¶
Required
- Options:
major | minor | patch
- -v, --version <version>¶
Required Version to update; must be the current version.
Arguments
- DIMENSION_MAPPING_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry dimension-mappings update \
    -d 17565575 \
    -l "Swap out the state to county mapping for my-dataset to that-project" \
    -t major \
    -v 1.0.0 \
    dimension_mappings.json5
dimensions¶
Dimension subcommands
dsgrid registry dimensions [OPTIONS] COMMAND [ARGS]...
dump¶
Dump a dimension config file (and any related data) from the registry.
dsgrid registry dimensions dump [OPTIONS] DIMENSION_ID
Options
- -v, --version <version>¶
Version to dump; defaults to latest
- -d, --directory <directory>¶
Directory in which to create config and data files
- --force¶
Overwrite files if they exist.
- Default:
False
Arguments
- DIMENSION_ID¶
Required argument
Examples:
$ dsgrid registry dimensions dump 17565829
list¶
List the registered dimensions.
dsgrid registry dimensions list [OPTIONS]
Options
- -f, --filter <filter>¶
Filter the table with a case-insensitive expression in the format 'column operation value'. May be passed multiple times.
Valid operations: ['==', '!=', 'contains', 'not contains']
Examples:
$ dsgrid registry dimensions list
$ dsgrid registry dimensions list -f "Type == sector"
$ dsgrid registry dimensions list -f "Submitter == username"
register¶
Register new dimensions with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/dimension.html#dsgrid.config.dimensions.DimensionsConfigModel
dsgrid registry dimensions register [OPTIONS] DIMENSION_CONFIG_FILE
Options
- -l, --log-message <log_message>¶
Required reason for submission
Arguments
- DIMENSION_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry dimensions register -l "Register dimensions for my-project" dimensions.json5
update¶
Update an existing dimension in the registry.
dsgrid registry dimensions update [OPTIONS] DIMENSION_CONFIG_FILE
Options
- -d, --dimension-id <dimension_id>¶
Required dimension ID
- -l, --log-message <log_message>¶
Required reason for submission
- -t, --update-type <update_type>¶
Required
- Options:
major | minor | patch
- -v, --version <version>¶
Required Version to update; must be the current version.
Arguments
- DIMENSION_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry dimensions update -d 17565829 -l "Update county dimension" -t major -v 1.0.0 dimension.json5
list¶
List the contents of a registry.
dsgrid registry list [OPTIONS]
projects¶
Project subcommands
dsgrid registry projects [OPTIONS] COMMAND [ARGS]...
add-dataset-requirements¶
Add requirements for one or more datasets to a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.input_dataset_requirements.InputDatasetListModel
dsgrid registry projects add-dataset-requirements [OPTIONS] PROJECT_ID
FILENAME
Options
- -l, --log-message <log_message>¶
Required Please specify the reason for the new datasets.
Arguments
- PROJECT_ID¶
Required argument
- FILENAME¶
Required argument
Examples:
$ dsgrid registry projects add-dataset-requirements \
    -l "Add requirements for dataset my-dataset-id to my-project-id." \
    my-project-id \
    dataset_requirements.json5
dump¶
Dump a project config file from the registry.
dsgrid registry projects dump [OPTIONS] PROJECT_ID
Options
- -v, --version <version>¶
Version to dump; defaults to latest
- -d, --directory <directory>¶
Directory in which to create the config file
- --force¶
Overwrite files if they exist.
- Default:
False
Arguments
- PROJECT_ID¶
Required argument
Examples:
$ dsgrid registry projects dump my-project-id
list¶
List the registered projects.
dsgrid registry projects list [OPTIONS]
Options
- -f, --filter <filter>¶
Filter the table with a case-insensitive expression in the format 'column operation value'. May be passed multiple times.
Valid operations: ['==', '!=', 'contains', 'not contains']
Examples:
$ dsgrid registry projects list
$ dsgrid registry projects list -f "ID contains efs"
list-dimension-query-names¶
List the project’s dimension query names.
dsgrid registry projects list-dimension-query-names [OPTIONS] PROJECT_ID
Options
- -b, --exclude-base¶
Exclude base dimension query names.
- Default:
False
- -S, --exclude-subset¶
Exclude subset dimension query names.
- Default:
False
- -s, --exclude-supplemental¶
Exclude supplemental dimension query names.
- Default:
False
Arguments
- PROJECT_ID¶
Required argument
Examples:
$ dsgrid registry projects list-dimension-query-names my_project_id
$ dsgrid registry projects list-dimension-query-names --exclude-subset my_project_id
$ dsgrid registry projects list-dimension-query-names --exclude-supplemental my_project_id
register¶
Register a new project with the dsgrid repository. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.project_config.ProjectConfigModel
dsgrid registry projects register [OPTIONS] PROJECT_CONFIG_FILE
Options
- -l, --log-message <log_message>¶
Required reason for submission
Arguments
- PROJECT_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry projects register -l "Register project my-project" project.json5
register-and-submit-dataset¶
Register a dataset and then submit it to a dsgrid project.
dsgrid registry projects register-and-submit-dataset [OPTIONS]
Options
- -c, --dataset-config-file <dataset_config_file>¶
Required Dataset config file
- -d, --dataset-path <dataset_path>¶
Required Path to directory containing load data (Parquet) files.
- -m, --dimension-mapping-file <dimension_mapping_file>¶
Dimension mapping file. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel
- -r, --dimension-mapping-references-file <dimension_mapping_references_file>¶
Dimension mapping references file. Mutually exclusive with dimension_mapping_file. Use it when the mappings are already registered. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mapping_base.DimensionMappingReferenceListModel
- -a, --autogen-reverse-supplemental-mappings <autogen_reverse_supplemental_mappings>¶
For any dimension listed here, if the dataset’s dimension is a project’s supplemental dimension and no mapping is provided, create a reverse mapping from that supplemental dimension.
- Options:
metric | geography | sector | subsector | time | weather_year | model_year | scenario
- -p, --project-id <project_id>¶
Required project identifier
- -l, --log-message <log_message>¶
Required reason for submission
Examples:
$ dsgrid registry projects register-and-submit-dataset \
    -c dataset.json5 \
    -d path/to/my/dataset \
    -p my-project-id \
    -m dimension_mappings.json5 \
    -l "Register and submit dataset my-dataset to project my-project."
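Because the dimension mapping file (-m) and the dimension mapping references file (-r) are mutually exclusive, a wrapper script may want to reject that combination before calling dsgrid. The check below is a minimal sketch; check_mapping_args is a hypothetical helper, and the error text is illustrative rather than dsgrid's own message.

```shell
# check_mapping_args MAPPING_FILE REFERENCES_FILE
# Returns 0 when at most one of the two arguments is non-empty and
# 1 when both are given. Hypothetical helper, not part of dsgrid.
check_mapping_args() {
    if [ -n "$1" ] && [ -n "$2" ]; then
        echo "error: -m and -r are mutually exclusive" >&2
        return 1
    fi
    return 0
}

# Only a mapping file is given, so the check passes.
check_mapping_args dimension_mappings.json5 "" && echo "ok: arguments are consistent"
```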
register-subset-dimensions¶
Register new subset dimensions with a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation:
dsgrid registry projects register-subset-dimensions [OPTIONS] PROJECT_ID
FILENAME
Options
- -l, --log-message <log_message>¶
Required Please specify the reason for this addition.
Arguments
- PROJECT_ID¶
Required argument
- FILENAME¶
Required argument
Examples:
$ dsgrid registry projects register-subset-dimensions \
    -l "Register subset dimensions for end uses by fuel type for my-project-id." \
    my-project-id \
    subset_dimensions.json5
register-supplemental-dimensions¶
Register new supplemental dimensions with a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation: https://dsgrid.github.io/dsgrid/reference/data_models/project.html#dsgrid.config.supplemental_dimension.SupplementalDimensionsListModel
dsgrid registry projects register-supplemental-dimensions [OPTIONS] PROJECT_ID
FILENAME
Options
- -l, --log-message <log_message>¶
Required Please specify the reason for this addition.
Arguments
- PROJECT_ID¶
Required argument
- FILENAME¶
Required argument
Examples:
$ dsgrid registry projects register-supplemental-dimensions \
    -l "Register states supplemental dimension for my-project-id" \
    my-project-id \
    supplemental_dimensions.json5
replace-dataset-dimension-requirements¶
Replace dimension requirements for one or more datasets in a project. The contents of the JSON/JSON5 file must match the data model defined by this documentation:
dsgrid registry projects replace-dataset-dimension-requirements
[OPTIONS] PROJECT_ID FILENAME
Options
- -l, --log-message <log_message>¶
Required Please specify the reason for the new requirements.
Arguments
- PROJECT_ID¶
Required argument
- FILENAME¶
Required argument
Examples:
$ dsgrid registry projects replace-dataset-dimension-requirements \
    -l "Replace dimension requirements for dataset my-dataset-id in my-project-id." \
    my-project-id \
    dataset_dimension_requirements.json5
submit-dataset¶
Submit a dataset to a dsgrid project.
dsgrid registry projects submit-dataset [OPTIONS]
Options
- -d, --dataset-id <dataset_id>¶
Required dataset identifier
- -p, --project-id <project_id>¶
Required project identifier
- -m, --dimension-mapping-file <dimension_mapping_file>¶
Dimension mapping file. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mappings_config.DimensionMappingsConfigModel
- -r, --dimension-mapping-references-file <dimension_mapping_references_file>¶
Dimension mapping references file. Mutually exclusive with dimension_mapping_file. Use it when the mappings are already registered. Must match the data model defined by https://dsgrid.github.io/dsgrid/reference/data_models/dimension_mapping.html#dsgrid.config.dimension_mapping_base.DimensionMappingReferenceListModel
- -a, --autogen-reverse-supplemental-mappings <autogen_reverse_supplemental_mappings>¶
For any dimension listed here, if the dataset’s dimension is a project’s supplemental dimension and no mapping is provided, create a reverse mapping from that supplemental dimension.
- Options:
metric | geography | sector | subsector | time | weather_year | model_year | scenario
- -l, --log-message <log_message>¶
Required reason for submission
Examples:
$ dsgrid registry projects submit-dataset \
    -p my-project-id \
    -d my-dataset-id \
    -m dimension_mappings.json5 \
    -l "Submit dataset my-dataset to project my-project."
update¶
Update an existing project in the registry.
dsgrid registry projects update [OPTIONS] PROJECT_CONFIG_FILE
Options
- -p, --project-id <project_id>¶
Required project ID
- -l, --log-message <log_message>¶
Required reason for submission
- -t, --update-type <update_type>¶
Required
- Options:
major | minor | patch
- -v, --version <version>¶
Required Version to update; must be the current version.
Arguments
- PROJECT_CONFIG_FILE¶
Required argument
Examples:
$ dsgrid registry projects update \
    -p my-project-id \
    -t patch \
    -v 1.5.0 \
    -l "Update description for project my-project-id." \
    project.json5