Project Coordinators¶
Project coordinators define the structure of a dsgrid project: its base dimensions, supplemental dimensions, dataset requirements, and queries. They are responsible for assembling datasets from multiple contributors into a coherent, queryable whole.
Prerequisites¶
Install dsgrid on your system, including Spark extras:
pip install "dsgrid-toolkit[spark]"
Access to NREL HPC (most project coordination tasks involve large datasets)
See How to Start a Spark Cluster on Kestrel for cluster setup
Workflow Overview¶
Design the project — Define the dimensional structure that all datasets will map to. Read Project Concepts for an overview, design considerations, and links to key Dimension and Dataset concepts.
Create base dimensions — Define the finest-grained dimensions the project will support. Follow How to Create Base Dimensions.
Create supplemental dimensions — Define alternative dimensions, typically aggregations, that can be used for querying (e.g., counties → states). Follow How to Create Supplemental Dimensions.
Register the project — Create the project config and register it. See the Create a Project tutorial.
Coordinate dataset submissions — Work with dataset submitters to register and validate their contributions.
Define and run queries — Use queries to assemble and transform project data. See Query Concepts and How to Filter Query Results.
Create derived datasets — Build derived datasets from query results for publication. See Derived Dataset Concepts.
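To make the supplemental-dimensions step above concrete, here is a plain-Python sketch of what a counties → states mapping accomplishes: base-dimension (county-level) values are rolled up through a supplemental-dimension mapping for querying at the coarser level. This illustrates the concept only; the dimension names and data are made up, and dsgrid itself performs such mappings with Spark against registered dimension records, not with this code.

```python
# Conceptual sketch only: names and data are illustrative, not dsgrid's API.
from collections import defaultdict

# Base-dimension records (county level) with associated values.
county_values = {"Larimer": 10.0, "Boulder": 5.0, "Ada": 7.5}

# Supplemental-dimension mapping: county -> state.
county_to_state = {"Larimer": "CO", "Boulder": "CO", "Ada": "ID"}

def aggregate_to_states(values, mapping):
    """Sum county-level values up to the state level."""
    totals = defaultdict(float)
    for county, value in values.items():
        totals[mapping[county]] += value
    return dict(totals)

print(aggregate_to_states(county_values, county_to_state))
# → {'CO': 15.0, 'ID': 7.5}
```

A real supplemental dimension also carries mapping records registered with the project, so every dataset mapped to the base dimensions can be queried at the aggregated level without the submitter doing anything extra.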